Unsupervised Segmentation of Words Using Prior Distributions of Morph Length and Frequency

@inproceedings{Creutz2003UnsupervisedSO,
  title={Unsupervised Segmentation of Words Using Prior Distributions of Morph Length and Frequency},
  author={Mathias Creutz},
  booktitle={ACL},
  year={2003}
}
We present a language-independent, unsupervised algorithm for segmenting words into morphs. The algorithm is based on a new generative probabilistic model that uses prior information on the length and frequency distributions of morphs in a language. It is shown to outperform two competing algorithms when evaluated on data from a language with agglutinative morphology (Finnish), and it also performs well on English data.
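As a rough illustration of how a prior on morph length can steer segmentation, the sketch below scores candidate splits of a word. It is not the paper's model: the abstract does not name the distributions involved, so the gamma-shaped length prior, its `shape` and `scale` values, the add-one smoothed relative-frequency term (a stand-in for a proper frequency prior), and the function names and toy counts are all assumptions made for this example.

```python
import math


def gamma_log_pdf(x, shape, scale):
    """Log density of a gamma distribution at x > 0."""
    return ((shape - 1) * math.log(x) - x / scale
            - math.lgamma(shape) - shape * math.log(scale))


def segmentation_log_score(morphs, morph_counts, shape=5.0, scale=1.0):
    """Illustrative score: length prior plus a frequency term for each morph.

    The gamma length prior and the smoothing scheme are assumptions,
    not taken from the abstract.
    """
    total = sum(morph_counts.values())
    vocab = len(morph_counts)
    score = 0.0
    for morph in morphs:
        # Add-one smoothing so unseen morphs still get a finite score.
        freq = (morph_counts.get(morph, 0) + 1) / (total + vocab + 1)
        score += gamma_log_pdf(len(morph), shape, scale) + math.log(freq)
    return score


def best_binary_split(word, morph_counts):
    """Return the best of: no split, or any split into exactly two morphs."""
    candidates = [[word]]
    candidates += [[word[:i], word[i:]] for i in range(1, len(word))]
    return max(candidates,
               key=lambda seg: segmentation_log_score(seg, morph_counts))


# Hypothetical toy counts for a few Finnish morphs.
toy_counts = {"talo": 20, "ssa": 15, "auto": 6}
print(best_binary_split("talossa", toy_counts))
```

With these toy counts the split into `talo` + `ssa` scores higher than leaving `talossa` whole, because both pieces are frequent and have lengths the assumed prior favors. A full system would search over segmentations with more than two morphs and re-estimate the counts iteratively.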
95 Citations
Induction of a Simple Morphology for Highly-Inflecting Languages
  • 80
The Study of Effect of Length in Morphological Segmentation of Agglutinative Languages
  • 3
High-Performance, Language-Independent Morphological Segmentation
  • 73
Unsupervised morphological parsing of Bengali
  • 66
Unsupervised morpheme segmentation in a non-parametric Bayesian framework
Unsupervised Learning of Morphology
  • 127
INDUCING THE MORPHOLOGICAL LEXICON OF A NATURAL LANGUAGE FROM UNANNOTATED TEXT
  • 218
Unsupervised Learning of Morphology by using Syntactic Categories
  • 18
Unsupervised Acquiring of Morphological Paradigms from Tokenized Text
  • 23

References

Unsupervised Discovery of Morphemes
  • 357
Unsupervised Learning of the Morphology of a Natural Language
  • 805
  • Highly Influential
Unsupervised Learning of Morphology Without Morphemes
  • 76
Unsupervised Learning of Morphology Using a Novel Directed Search Algorithm: Taking the First Step
  • 51
A Bayesian Model for Morpheme and Paradigm Identification
  • 52
An Efficient, Probabilistically Sound Algorithm for Segmentation and Word Discovery
  • M. Brent
  • Computer Science
  • Machine Learning
  • 2004
  • 305
Unsupervised discovery of morphologically related words based on orthographic and semantic similarity
  • 119
A General Computational Model For Word-Form Recognition And Production
  • 628
UNSUPERVISED WORD INDUCTION USING MDL CRITERION
  • 10