A Comparison Between Morphological Complexity Measures: Typological Data vs. Language Corpora
This paper compares human expert judgements from the World Atlas of Language Structures (WALS) with four quantitative measures automatically calculated from language corpora, and finds strong correlations between all the measures.
A Quantitative Empirical Analysis of the Abstract/Concrete Distinction
This study presents original evidence that abstract and concrete concepts are organized and represented differently in the mind, based on analyses of thousands of concepts in publicly available data…
Adaptive Communication: Languages with More Non-Native Speakers Tend to Have Fewer Word Forms
- C. Bentz, Annemarie Verkerk, Douwe Kiela, Felix Hill, P. Buttery, M. Aronoff
- Linguistics, PloS one
- 17 June 2015
It is argued that languages are information encoding systems shaped by the varying needs of their speakers, and that languages with greater levels of contact typically employ fewer word forms to encode the same information content.
Zipf's law and the grammar of languages: A quantitative study of Old and Modern English parallel texts
A quantitative analysis of the relationship between word frequency distributions and morphological features in languages suggests that the syntheticity of the language in these texts can be captured mathematically, a property the authors tentatively call their grammatical fingerprint.
Languages with More Second Language Learners Tend to Lose Nominal Case
The negative association between the number of second language speakers and nominal case complexity generalizes to different language areas and families, and the idea that morphosyntactic complexity is reduced by a high degree of language contact involving adult learners is supported.
The Entropy of Words - Learnability and Expressivity across More than 1000 Languages
- C. Bentz, Dimitrios Alikaniotis, Michael Cysouw, R. Ferrer-i-Cancho
- Computer Science, Entropy
- 14 June 2017
The choice associated with words is a fundamental property of natural languages. It lies at the heart of quantitative linguistics, computational linguistics and language sciences more generally.…
Zipf's law of abbreviation as a language universal
It is argued that the universal tendency for more frequently used words to be shorter is likely to derive from fundamental principles of information processing and transfer.
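As a rough illustration of the law of abbreviation summarized above, the following minimal sketch computes a rank correlation between word frequency and word length over a list of tokens. It is not the authors' method: the function names (`average_ranks`, `pearson`, `frequency_length_correlation`), the use of a Spearman-style rank correlation, the choice of character count as the length measure, and the toy sentence are all assumptions made for illustration.

```python
from collections import Counter


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def frequency_length_correlation(tokens):
    """Spearman-style correlation between type frequency and type length.

    Length is measured in characters here (an assumption of this sketch).
    A negative value is consistent with the law of abbreviation.
    """
    counts = Counter(tokens)
    types = list(counts)
    freqs = [counts[t] for t in types]
    lengths = [len(t) for t in types]
    return pearson(average_ranks(freqs), average_ranks(lengths))


# Toy input, invented for illustration only.
toy_tokens = (
    "the cat sat on the mat and the dog sat on the "
    "extraordinarily comfortable mat"
).split()
toy_correlation = frequency_length_correlation(toy_tokens)
print(toy_correlation)  # negative for this toy input
```

A realistic test of the law would need a large corpus and a more careful choice of length measure; the sketch only shows the shape of the computation.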
Compression and the origins of Zipf's law of abbreviation
This work generalizes the information theoretic concept of mean code length as a mean energetic cost function over the probability and the magnitude of the types of the repertoire and shows that the minimization of that cost function and a negative correlation between probability and the magnitude of types are intimately related.
Variation in Word Frequency Distributions: Definitions, Measures and Implications for a Corpus-Based Language Typology
- C. Bentz, Dimitrios Alikaniotis, T. Samardzic, P. Buttery
- Linguistics, J. Quant. Linguistics
- 17 January 2017
It is argued that quantitative measures like the NFD can advance language typology beyond abstract, theory-driven expert judgments, towards more corpus-based, empirical and reproducible analyses.