The frequency spectrum of finite samples from the intermittent silence process
@article{FerreriCancho2009TheFS, title={The frequency spectrum of finite samples from the intermittent silence process}, author={Ramon Ferrer-i-Cancho and Ricard Gavald{\`a}}, journal={J. Assoc. Inf. Sci. Technol.}, year={2009}, volume={60}, pages={837--843} }
It has been argued that the actual distribution of word frequencies could be reproduced or explained by generating a random sequence of letters and spaces according to the so-called intermittent silence process. The same kind of process could reproduce or explain the counts of other kinds of units from a wide range of disciplines. Taking the linguistic metaphor, we focus on the frequency spectrum, i.e., the number of words with a certain frequency, and the vocabulary size, i.e., the number of…
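The quantities named in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' method: it assumes a two-letter alphabet, a fixed space probability `p_space`, equiprobable letters, and that runs of consecutive spaces produce no empty words. The function names and all parameter values are hypothetical choices for illustration.

```python
import random
from collections import Counter


def intermittent_silence_text(n_chars, alphabet="ab", p_space=0.2, seed=0):
    """Generate n_chars characters: a space with probability p_space,
    otherwise a letter drawn uniformly from alphabet (assumed setup)."""
    rng = random.Random(seed)
    chars = []
    for _ in range(n_chars):
        if rng.random() < p_space:
            chars.append(" ")
        else:
            chars.append(rng.choice(alphabet))
    return "".join(chars)


def frequency_spectrum(text):
    """Return (word_counts, spectrum), where spectrum[f] is the number
    of distinct words that occur exactly f times in the sample."""
    # str.split() with no argument discards empty words (an assumption).
    word_counts = Counter(text.split())
    spectrum = Counter(word_counts.values())
    return word_counts, spectrum


text = intermittent_silence_text(10_000)
word_counts, spectrum = frequency_spectrum(text)
vocabulary_size = len(word_counts)  # number of distinct words in the sample

print("vocabulary size:", vocabulary_size)
for f in sorted(spectrum)[:5]:
    print(f"words with frequency {f}: {spectrum[f]}")
```

Summing the spectrum over all frequencies recovers the vocabulary size, and summing frequency times count recovers the number of word tokens, which is a quick consistency check on any finite sample.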
20 Citations
Compression and the origins of Zipf's law for word frequencies
- Physics, Complex.
- 2016
A new derivation of Zipf's law for word frequencies based on optimal coding that sheds light on the origins of other statistical laws of language and thus can lead to a compact theory of linguistic laws.
Zipf’s Law for Word Frequencies: Word Forms versus Lemmas in Long Texts
- Physics, PloS one
- 2015
It is concluded that the exponents of Zipf’s law are very similar, despite the remarkable transformation that going from words to lemmas represents, considerably affecting all ranges of frequencies.
The origins of Zipf's meaning‐frequency law
- Psychology, J. Assoc. Inf. Sci. Technol.
- 2018
It is shown that a single assumption on the joint probability of a word and a meaning suffices to infer Zipf's meaning‐frequency law or relaxed versions, and can be justified as the outcome of a biased random walk in the process of mental exploration.
Zipf's law revisited: Spoken dialog, linguistic units, parameters, and the principle of least effort.
- Linguistics, Psychonomic bulletin & review
- 2022
The ubiquitous inverse relationship between word frequency and word rank is commonly known as Zipf's law. The theoretical underpinning of this law states that the inverse relationship yields…
Optimization Models of Natural Communication
- Computer Science, J. Quant. Linguistics
- 2018
Two important components of the family, namely the information theoretic principles and the energy function that combines them linearly, are reviewed from the perspective of psycholinguistics, language learning, information theory and synergetic linguistics.
Compression and the origins of Zipf's law of abbreviation
- Biology, ArXiv
- 2015
This work generalizes the information theoretic concept of mean code length as a mean energetic cost function over the probability and the magnitude of the types of the repertoire and shows that the minimization of that cost function and a negative correlation between probability and the magnitude of types are intimately related.
Random Texts Do Not Exhibit the Real Zipf's Law-Like Rank Distribution
- Mathematics, PloS one
- 2010
It is suggested that Zipf's law might in fact be a fundamental law in natural languages because it is demonstrated that ranks derived from random texts and ranks derived from real texts are statistically inconsistent with the parameters employed to argue for such a good fit, even when the parameters are inferred from the target real text.
Information content versus word length in random typing
- Computer Science, ArXiv
- 2012
The relationship between the measure and word length is studied for the popular random typing process where a text is constructed by pressing keys at random from a keyboard containing letters and a space behaving as a word delimiter.
A paradoxical property of the monkey book
- Physics, ArXiv
- 2011
The somewhat counter-intuitive conclusion is that a 'monkey book' obeys Heaps' power law precisely because its word-frequency distribution is not a smooth power law, contrary to the expectation based on simple mathematical arguments that if one is a power law, so is the other.
Compression as a Universal Principle of Animal Behavior
- Biology, Cogn. Sci.
- 2013
It is shown that minimizing the expected code length implies that the length of a word cannot increase as its frequency increases, which means that the mean code length or duration is significantly small in human language, and also in the behavior of other species in all cases where agreement with the law of brevity has been found.
References
Numerical Analysis of Word Frequencies in Artificial and Natural Language Texts
- Linguistics
- 1997
We perform a numerical study of the statistical properties of natural texts written in English and of two types of artificial texts. As statistical tools we use the conventional Zipf analysis of the…
Zipf's Law and Random Texts
- Linguistics, Adv. Complex Syst.
- 2002
It is shown that real texts fill the lexical spectrum much more efficiently and regardless of the word length, suggesting that the meaningfulness of Zipf's law is high.
Zipf's law from a communicative phase transition
- Computer Science
- 2005
It is supported that Zipf's law in a communication system may maximize the information transfer under constraints and be specially suitable for the speech of schizophrenics.
Hierarchical structures induce long-range dynamical correlations in written texts.
- Computer Science, Proceedings of the National Academy of Sciences of the United States of America
- 2006
It is concluded that hierarchical structures in text serve to create long-range correlations, and use the reader's memory in reenacting some of the multidimensionality of the thoughts being expressed.
The appropriate use of Zipf's law in animal communication studies
- Computer Science, Animal Behaviour
- 2005
On the law of Zipf-Mandelbrot for multi-word phrases
- Mathematics
- 1999
This article studies the probabilities of the occurrence of multi-word (m-word) phrases (m = 2,3,... ) in relation to the probabilities of occurrence of the single words. It is well known that, in…
Random texts exhibit Zipf's-law-like word frequency distribution
- Mathematics, IEEE Trans. Inf. Theory
- 1992
It is shown that the distribution of word frequencies for randomly generated texts is very similar to Zipf's law observed in natural languages such as English. The facts that the frequency of…
Finitary models of language users
- Computer Science
- 1963
It is proposed to describe talkers and listeners, i.e., the users of language, rather than the language itself, just as the authors' knowledge of arithmetic is not merely the collection of their arithmetic responses, habits, or dispositions.
The Frequency Spectrum of Text and Vocabulary
- Linguistics, J. Quant. Linguistics
- 1996
Some problems of the analysis of the word‐frequency distribution and the possibility of its analytical description are dealt with.