An Example of Statistical Investigation of the Text Eugene Onegin Concerning the Connection of Samples in Chains

@article{Markov2006AnEO,
  title={An Example of Statistical Investigation of the Text Eugene Onegin Concerning the Connection of Samples in Chains},
  author={Alexander A Markov},
  journal={Science in Context},
  year={2006},
  volume={19},
  pages={591--600}
}
  • A. Markov
  • Published 1 December 2006
  • Art
  • Science in Context
This study investigates a text excerpt of 20,000 Russian letters, excluding ь and ъ (the soft and hard signs), from Pushkin's novel Eugene Onegin: the entire first chapter and sixteen stanzas of the second.
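The kind of count such an investigation rests on can be sketched in a few lines. This is a minimal illustration, not the paper's own procedure: it assumes each letter is classified as a vowel or a consonant (the abstract above does not spell this out), drops the excluded signs ь and ъ, and estimates how often one class follows the other. The names `classify` and `transition_probabilities` and the vowel set are hypothetical choices made for this example.

```python
from collections import Counter

# Russian vowel letters; treating every other Cyrillic letter as a consonant
# is an assumption made for this sketch.
VOWELS = set("аеёиоуыэюя")
SKIPPED = set("ьъ")  # the soft and hard signs, excluded as in the abstract


def classify(text):
    """Map text to a string of 'V'/'C' labels, dropping non-letters and ь/ъ."""
    labels = []
    for ch in text.lower():
        is_cyrillic = "а" <= ch <= "я" or ch == "ё"
        if not is_cyrillic or ch in SKIPPED:
            continue
        labels.append("V" if ch in VOWELS else "C")
    return "".join(labels)


def transition_probabilities(labels):
    """Estimate P(next label | current label) from adjacent label pairs."""
    pair_counts = Counter(zip(labels, labels[1:]))
    first_counts = Counter(labels[:-1])
    return {
        (a, b): n / first_counts[a]
        for (a, b), n in pair_counts.items()
    }


# Opening line of the novel, used here only as a tiny sample.
sample = "Мой дядя самых честных правил"
labels = classify(sample)
probs = transition_probabilities(labels)
```

On a sample this short the estimates are only illustrative; the study's excerpt of 20,000 letters is what makes such frequencies meaningful.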
Speech and Language Processing. Hidden Markov Models
TLDR
This chapter introduces a descendant of Markov's model that is a key model for language processing, the hidden Markov model or HMM, a sequence model whose job is to assign a label or class to each unit in a sequence, and presents the mathematics of the HMM.
Letter counting: a stem cell for Cryptology, Quantitative Linguistics, and Statistics
TLDR
The eclecticism of scholars of past centuries, their background in the humanities, and their familiarity with cryptograms are identified as contributing factors to the mutual enrichment process described here.
Computer-Generated Books: Metonymic, Metaphoric and Operationalist
In Gulliver’s Travels, the professor quoted above has devised a mechanical means for creating speculative knowledge and proposes that merely 500 or so of these stochastic frames would be necessary to
University of Birmingham Slavic computational and corpus linguistics
In this paper, we focus on corpus-linguistic studies that address theoretical questions and on computational linguistic work on corpus annotation that makes corpora useful for linguistic work.
"ON ALTHUSSER’s PHILOSOPHY OF THE ENCOUNTER"
Abstract: The article reviews Althusser's Philosophy of the Encounter, examining in turn the problem of the Epistemological Break and the idea of materialisme aleatoire. It looks at Althusser's
Slavic Corpus and Computational Linguistics
TLDR
Discusses why the corpus-linguistic approach was discredited by generative linguists in the second half of the 20th century, how it made a comeback through advances in computing, and how it was finally adopted by usage-based linguistics at the beginning of the 21st century.
Rethinking language: How probabilities shape the words we use
  • T. Griffiths
  • Computer Science
    Proceedings of the National Academy of Sciences
  • 2011
TLDR
The paper by Piantadosi et al. (9) adds to this literature, using probabilistic models estimated from large databases to update a classic result about the length of words.
Entropy of Sounds: Sonnets to Battle Rap
TLDR
The conditional entropy of sequences of phonological patterns in lyrics is compared and it is found that, in general, Battle Rap and Sonnets maintain noticeably lower entropy than other genres across sequence sizes, while lyrics from Electronic music and Hip-Hop display relatively high entropy.
Learning Language
This is an edited version of the two final lectures from my course on Large Spaces at the Courant Institute in Fall 2019.
...