We demonstrate a substantial improvement on one of the most celebrated empirical laws in the study of language, Zipf's 75-y-old theory that word length is primarily determined by frequency of use. In accord with rational theories of communication, we show across 10 languages that average information content is a much better predictor of word length than …
We present a general information-theoretic argument that all efficient communication systems will be ambiguous, assuming that context is informative about meaning. We also argue that ambiguity allows for greater ease of processing by permitting efficient linguistic units to be re-used. We test predictions of this theory in English, German, and Dutch. Our …
We present the results of a large-scale web experiment investigating comprehenders' ability to guess upcoming referents in an unfolding discourse. Participants were given a text that had been cut off just before a noun phrase, and attempted to guess which previously mentioned referent, if any, would be mentioned next. Our results show that writers are more …
Functionalist typologists have long argued that pressures associated with language usage influence the distribution of grammatical properties across the world's languages. Specifically, grammatical properties may be observed more often across languages because they improve a language's utility or decrease its complexity. While this approach to the study of …
We present a compendium of recent and current projects that utilize crowdsourcing technologies for language studies, finding that the quality is comparable to controlled laboratory experiments, and in some cases superior. While crowdsourcing has primarily been used for annotation in recent language studies, the results here demonstrate that far richer data …
Human infants and adults are able to segment coherent sequences from unsegmented strings of auditory stimuli after only a short exposure, an ability thought to be linked to early language acquisition. Although some research has hypothesized that learners succeed in these tasks by computing transitional probabilities between syllables, current experimental …
A small number of the logically possible word order configurations account for a large proportion of actual human languages. To explain this distribution, typologists often invoke principles of human cognition which might make certain orders easier or harder to learn or use. We present a novel method for carrying out very large scale artificial language …
Online sentence comprehension involves multiple types of cognitive processes: lexical processes such as lexical access, which call on the user's knowledge of the meaning of words in the language, and structural processes such as the integration of incoming words into an emerging representation. In this article, we investigate the temporal dynamics of …
First, we disagree with Reilly and Kean (1) that our results on word length (2) contradicted Zipf's principle of least effort. Our findings were in the same spirit, except that we measured effort in a more principled way than Zipf could have (2). Assigning word length by information content is least effort under an assumption of a superlinear relationship …
Recently, Ferrer i Cancho and Moscoso del Prado Martín (2011) argued that an observed linear relationship between word length and average surprisal (Piantadosi, Tily, & Gibson, 2011) is not evidence for communicative efficiency in human language. We discuss several shortcomings of their approach and critique: their model critically rests on inaccurate …