# Random texts exhibit Zipf's-law-like word frequency distribution

@article{Li1992RandomTE, title={Random texts exhibit Zipf's-law-like word frequency distribution}, author={Wentian Li}, journal={IEEE Trans. Inf. Theory}, year={1992}, volume={38}, pages={1842-1845} }

It is shown that the distribution of word frequencies for randomly generated texts is very similar to Zipf's law observed in natural languages such as English. The facts that the frequency of occurrence of a word is almost an inverse power law function of its rank and the exponent of this inverse power law is very close to 1 are largely due to the transformation from the word's length to its rank, which stretches an exponential function to a power law function.
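The length-to-rank argument in the abstract can be checked numerically. The sketch below is illustrative only and is not the paper's code. It assumes a text built from `num_letters` equiprobable letters plus an equiprobable space; the function names, the alphabet size of 4, the text length, and the `min_count` cutoff are all hypothetical choices. Under that assumption, each word of length L has probability proportional to (M+1)^(-L), while its rank grows like M^L, which suggests an exponent of about ln(M+1)/ln(M), slightly above 1.

```python
import math
import random
from collections import Counter


def random_text_rank_frequency(num_letters=4, num_chars=200_000, seed=0):
    """Generate a random text and return its word counts in decreasing order.

    Assumption: every letter and the space are drawn with equal probability.
    """
    rng = random.Random(seed)
    letters = "abcdefghijklmnopqrstuvwxyz"[:num_letters]
    symbols = letters + " "
    text = "".join(rng.choices(symbols, k=num_chars))
    # str.split() with no argument drops the empty "words" between repeated spaces.
    counts = Counter(text.split())
    return sorted(counts.values(), reverse=True)


def fit_exponent(frequencies, min_count=10):
    """Least-squares slope of log(frequency) against log(rank), sign-flipped.

    Words seen fewer than min_count times are ignored, because the sparse
    tail of a finite sample distorts the fit.
    """
    points = [
        (math.log(rank), math.log(freq))
        for rank, freq in enumerate(frequencies, start=1)
        if freq >= min_count
    ]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return -sxy / sxx


def predicted_exponent(num_letters):
    """Exponent suggested by the length-to-rank argument: ln(M+1)/ln(M)."""
    return math.log(num_letters + 1) / math.log(num_letters)


if __name__ == "__main__":
    freqs = random_text_rank_frequency()
    print("fitted exponent:   ", round(fit_exponent(freqs), 3))
    print("predicted exponent:", round(predicted_exponent(4), 3))
```

The rank-frequency curve of such a text is a staircase (all words of one length share roughly the same frequency), so the fitted slope only approximates the predicted value. A small alphabet is used here so that several word lengths are well sampled in a short text.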

## 505 Citations

### Zipf's Law and Random Texts

- Linguistics
- Adv. Complex Syst.
- 2002

It is shown that real texts fill the lexical spectrum much more efficiently and regardless of the word length, suggesting that the meaningfulness of Zipf's law is high.

### Minimal models for text production and Zipf's law

- Physics
- International Conference on Integration of Knowledge Intensive Multi-Agent Systems, 2005
- 2005

It is shown that when interaction is taken into account by allowing the words to compete amongst themselves for space in the memory of the users, the resulting word frequency distribution is best described by an exponential, rather than by a power-law.

### Zipf's law of abbreviation as a language universal

- Linguistics
- 2016

It is argued that the universal tendency for more frequently used words to be shorter is likely to derive from fundamental principles of information processing and transfer.

### Zipf's law against the text size: a half-rational model

- Mathematics
- Glottometrics
- 2002

A simple model of the dependence of the Zipf-Mandelbrot law on text size is presented, featuring a variable power-law tail and a constant ratio of the most frequent words.

### Random Texts Do Not Exhibit the Real Zipf's Law-Like Rank Distribution

- Mathematics
- PloS one
- 2010

It is suggested that Zipf's law might in fact be a fundamental law in natural languages because it is demonstrated that ranks derived from random texts and ranks derived from real texts are statistically inconsistent with the parameters employed to argue for such a good fit, even when the parameters are inferred from the target real text.

### Zipf’s Law for Word Frequencies: Word Forms versus Lemmas in Long Texts

- Physics
- PloS one
- 2015

It is concluded that the exponents of Zipf’s law are very similar, despite the remarkable transformation that going from words to lemmas represents, considerably affecting all ranges of frequencies.

### Compression and the origins of Zipf's law for word frequencies

- Physics
- Complex.
- 2016

A new derivation of Zipf's law for word frequencies based on optimal coding that sheds light on the origins of other statistical laws of language and thus can lead to a compact theory of linguistic laws.

### Algorithmic information, complexity and Zipf's law

- Computer Science
- Glottometrics
- 2002

It is found that natural languages have maximum complexity and it is argued that random text models are unsuitable for natural languages.

### A Simple LNRE Model for Random Character Sequences

- Computer Science
- 2004

The model, which has convenient analytical and numerical properties, is shown to be adequate for the description of language data extracted by automatic means from large text corpora and can be used to study the problems faced by the statistical analysis of such data in the field of natural-language processing.

### Zipf's Law and Avoidance of Excessive Synonymy

- Psychology
- Cogn. Sci.
- 2008

It is suggested that Zipf's law may result from a hierarchical organization of word meanings over the semantic space, which in turn is generated by the evolution of word semantics dominated by expansion of meanings and competition of synonyms.

## References

### Mutual Information Functions of Natural Language Texts

- Computer Science
- 1989

Although the analysis presented in this paper depends on the concepts in information theory, the emphasis is on the correlation between two letters separated by, which is the inverse Fourier transformation of the power spectrum.

### Fractal Geometry of Nature

- Art
- 1977

This book is a blend of erudition, popularization, and exposition, and the illustrations include many superb examples of computer graphics that are works of art in their own right.

### Intermittency, self-similarity and 1/f spectrum in dissipative dynamical systems

- Chemistry
- 1980

We study a discrete dissipative dynamical system that exhibits a transition to turbulence through intermittency. At the instability threshold, this model has an internal self-similar structure…

### Selective Studies and the Principle of Relative Frequency in Language (Cambridge, Mass., 1932); Human Behavior and the Principle of Least Effort (Cambridge, Mass., 1949; Addison-Wesley, 1965); The Psycho-Biology of Language: An Introduction to Dynamic Philology

- 1965

### The Fractal Geometry of Nature (Freeman, 1982); Fractals: Form, Chance and Dimension (Freeman, 1977); Les objets fractals: forme, hasard et dimension (Flammarion, 1975)

- 1975

### Raimi, "The peculiar distribution of first digits"; Manneville, "Intermittency, self-similarity and 1/f spectrum in dissipative dynamical systems"

- Le Journal De Physique
- 1953
