We demonstrate a substantial improvement on one of the most celebrated empirical laws in the study of language, Zipf's 75-y-old theory that word length is primarily determined by frequency of use. In accord with rational theories of communication, we show across 10 languages that average information content is a much better predictor of word length than…
We present a general information-theoretic argument that all efficient communication systems will be ambiguous, assuming that context is informative about meaning. We also argue that ambiguity allows for greater ease of processing by permitting efficient linguistic units to be re-used. We test predictions of this theory in English, German, and Dutch. Our…
Functionalist typologists have long argued that pressures associated with language usage influence the distribution of grammatical properties across the world's languages. Specifically, grammatical properties may be observed more often across languages because they improve a language's utility or decrease its complexity. While this approach to the study of…
We present a compendium of recent and current projects that utilize crowdsourcing technologies for language studies, finding that the quality is comparable to controlled laboratory experiments, and in some cases superior. While crowdsourcing has primarily been used for annotation in recent language studies, the results here demonstrate that far richer data…
A small number of the logically possible word order configurations account for a large proportion of actual human languages. To explain this distribution, typologists often invoke principles of human cognition which might make certain orders easier or harder to learn or use. We present a novel method for carrying out very large scale artificial language…
Recently, Ferrer i Cancho and Moscoso del Prado Martín (2011) argued that an observed linear relationship between word length and average surprisal (Piantadosi, Tily, & Gibson, 2011) is not evidence for communicative efficiency in human language. We discuss several shortcomings of their approach and critique: their model critically rests on inaccurate…
Online sentence comprehension involves multiple types of cognitive processes: lexical processes such as lexical access, which call on the user's knowledge of the meaning of words in the language, and structural processes such as the integration of incoming words into an emerging representation. In this article, we investigate the temporal dynamics of…