# Approaches to the classification of complex systems: Words, texts, and more

@article{Rovenchak2022ApproachesTT,
title={Approaches to the classification of complex systems: Words, texts, and more},
author={Andrij A. Rovenchak},
journal={ArXiv},
year={2022},
volume={abs/2205.04060}
}
The chapter starts with an introduction to notions of quantitative linguistics, such as rank–frequency dependence, Zipf’s law, and frequency spectra. Similarities between the distribution of words in texts and level occupation in quantum ensembles hint at a superficial analogy with statistical physics. This analogy makes it possible to define various parameters for texts, including “temperature”, “chemical potential”, entropy, and some others. Such parameters provide a set of…
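As a minimal sketch of two of the notions named in the abstract, the Python snippet below computes a rank–frequency list and a frequency spectrum (the number of distinct words that occur exactly j times) for a toy sentence. The function names, the tokenization rule, and the sample text are assumptions made for illustration; they are not taken from the chapter, and the chapter's physical parameters such as “temperature” are not reproduced here.

```python
from collections import Counter
import re


def rank_frequency(text):
    """Return (word, frequency) pairs sorted so that rank 1 comes first."""
    # Assumption: a "word" is any run of word characters, lowercased.
    words = re.findall(r"\w+", text.lower())
    counts = Counter(words)
    # Descending frequency; ties broken alphabetically for determinism.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def frequency_spectrum(text):
    """Map each frequency j to the number of distinct words occurring exactly j times."""
    freqs = [f for _, f in rank_frequency(text)]
    return dict(sorted(Counter(freqs).items()))


# Hypothetical toy input, far too short for any real statistical claim.
sample = "the cat and the dog and the bird"
ranked = rank_frequency(sample)
spectrum = frequency_spectrum(sample)

for rank, (word, freq) in enumerate(ranked, start=1):
    print(rank, word, freq)
print(spectrum)
```

On a real text, plotting frequency against rank on log-log axes is the usual way to check how closely the data follow Zipf's law.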

## References

Zipf’s word frequency law in natural language: A critical review and future directions
It is shown that human language has a highly complex, reliable structure in the frequency distribution over and above Zipf’s law, although prior data visualization methods have obscured this fact.
On the similarity of symbol frequency distributions with heavy tails
• Computer Science
ArXiv
• 2015
It is found that frequent words change more slowly than less frequent words and that, within a complete $\alpha$-spectrum of measures, $\alpha=2$ provides the most robust measure to quantify language change.
Part-of-Speech Sequences in Literary Text: Evidence From Ukrainian
• Mathematics
J. Quant. Linguistics
• 2018
It is shown that Zipf’s law holds for parts-of-speech sequences in Ukrainian texts by Ivan Franko, and it is expected that further studies of the proposed PoSW units both in Ukrainian and other languages can reveal new features of texts on the sentence and supra-sentence levels.
Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics.
• Biology
Physical review. E, Statistical physics, plasmas, fluids, and related interdisciplinary topics
• 1995
It is found that for vertebrates such as primates and rodents and for viral DNA, the difference between the statistical properties of coding and noncoding regions is not pronounced and therefore the results of the analyses of the investigated sequences are less conclusive.
Random Texts Do Not Exhibit the Real Zipf's Law-Like Rank Distribution
• Mathematics
PloS one
• 2010
It is suggested that Zipf's law might in fact be a fundamental law in natural languages because it is demonstrated that ranks derived from random texts and ranks derived from real texts are statistically inconsistent with the parameters employed to argue for such a good fit, even when the parameters are inferred from the target real text.
Two Regimes in the Frequency of Words and the Origins of Complex Lexicons: Zipf’s Law Revisited
• Physics
J. Quant. Linguistics
• 2001
It is made evident that word frequency as a function of the rank follows two different exponents, ≈ −1 for the first regime and ≈ −2 for the second.
Defining Thermodynamic Parameters for Texts from Word Rank-Frequency Distributions
• Linguistics
• 2011
We report the results regarding the calculation of a new parameter set obtained from the rank–frequency distribution of texts. The parameters are defined using the analogy between the rank–frequency
Zipf's Law and Random Texts
• Linguistics