Word length and the principle of least effort: language as an evolving, efficient code for information transfer
Kanwal, Jasmeen Kaur
In 1935 the linguist George Kingsley Zipf made a now-classic observation about the relationship between a word’s length and its frequency: the more frequent a word is, the shorter it tends to be. He claimed that this “Law of Abbreviation” is a universal structural property of language. The Law of Abbreviation has since been documented in a wide range of human languages, and extended to animal communication systems and even computer programming languages. Zipf hypothesised that this universal design feature arises as a result of individuals optimising form-meaning mappings under competing pressures to communicate accurately but also efficiently: his famous Principle of Least Effort. In this thesis, I present a novel set of studies which provide direct experimental evidence for this explanatory hypothesis. Using a miniature artificial language learning paradigm, I show in Chapter 2 that language users optimise form-meaning mappings in line with the Law of Abbreviation only when pressures for accuracy and efficiency both operate during a communicative task. These results are robust across different methods of data collection: one version of the experiment was run in the lab, and another was run online, using a novel method I developed which allows participants to take part in dyadic interaction through a web-based interface. In Chapter 3, I address the growing body of work suggesting that a word’s predictability in context may be an even stronger determinant of its length than its frequency alone. For instance, Piantadosi et al. (2011) show that shorter words have a lower average surprisal (i.e., tend to appear in more predictive contexts) than longer words, in synchronic corpora across many languages. I hypothesise that the same communicative pressures posited by the Principle of Least Effort, when acting on speakers in situations where context manipulates the information content of words, can give rise to these lexical distributions.
Adapting the methodology developed in Chapter 2, I show that participants use shorter words in more predictive contexts only when subject to the competing pressures for accurate and efficient communication. In a second experiment, I show that participants are more likely to use shorter words for meanings with a lower average surprisal. These results suggest that communicative pressures acting on individuals during language use can lead to the re-mapping of a lexicon to align with “Uniform Information Density”, the principle that information content ought to be evenly spread across an utterance, such that shorter linguistic units carry less information than longer ones. Over generations, linguistic behaviour such as that observed in the experiments reported here may bring entire lexicons into alignment with the Law of Abbreviation and Uniform Information Density. For this to happen, a diachronic process which leads to permanent lexical change is necessary. However, crucial evidence for this process (decreasing word length as a result of increasing frequency over time) has never before been systematically documented in natural language. In Chapter 4, I conduct the first large-scale diachronic corpus study investigating the relationship between word length and frequency over time, using the Google Books Ngrams corpus and three different word lists covering both English and French. Focusing on words which have both long and short variants (e.g., info/information), I show that the frequency of a word lemma may influence the rate at which the shorter variant gains in popularity. This suggests that the lexicon as a whole may indeed be gradually evolving towards greater efficiency. Taken together, the behavioural and corpus-based evidence presented in this thesis supports the hypothesis that communicative pressures acting on language users are at least partially responsible for the frequency-length and surprisal-length relationships found universally across lexicons.
More generally, the approach taken in this thesis promotes a view of language as, among other things, an evolving, efficient code for information transfer.
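As an illustration of the "average surprisal" measure mentioned above, the following is a minimal sketch, not the estimation procedure of this thesis or of any study it cites. It assumes an unsmoothed maximum-likelihood bigram model, in which the surprisal of a word given the preceding word is -log2 P(word | previous word), and it averages that quantity over every occurrence of each word. The function name `average_surprisal` and the toy token sequence are hypothetical and chosen only for this example.

```python
import math
from collections import Counter, defaultdict


def average_surprisal(tokens):
    """Return the mean of -log2 P(word | previous word) for each word type.

    Probabilities are unsmoothed maximum-likelihood bigram estimates taken
    from the token sequence itself (an assumption made for this sketch).
    The first token has no preceding context and is not scored.
    """
    # How often each token serves as the context (the "previous word").
    context_counts = Counter(tokens[:-1])
    # How often each (previous word, word) pair occurs.
    bigram_counts = Counter(zip(tokens, tokens[1:]))

    surprisal_sums = defaultdict(float)
    occurrences = Counter()
    for (prev, word), count in bigram_counts.items():
        probability = count / context_counts[prev]
        surprisal_sums[word] += count * -math.log2(probability)
        occurrences[word] += count

    return {word: surprisal_sums[word] / occurrences[word]
            for word in surprisal_sums}


# Toy sequence: "a" is fully predictable after "b", "c" is rare after "a".
toy_tokens = "a b a b a c".split()
toy_result = average_surprisal(toy_tokens)
for word in sorted(toy_result):
    print(word, round(toy_result[word], 3))
```

In this toy sequence the word that always follows the same context receives an average surprisal of zero, while the word that appears only once in a less predictive context receives the highest value. A real analysis would need a large corpus and a smoothed language model.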