• Corpus ID: 9820866

Knowledge-poor Approach to Constructing Word Frequency Lists, with Example from Romance Languages

@article{Alexandrov2004KnowledgepoorAT,
  title={Knowledge-poor Approach to Constructing Word Frequency Lists, with Example from Romance Languages},
  author={Mikhail Alexandrov and Xavier Blanco and Alexander Gelbukh and Pavel P. Makagonov},
  journal={Proces. del Leng. Natural},
  year={2004},
  volume={33}
}
Word lists with their frequencies are widely used in many text clustering and categorization procedures. Such lists are usually compiled with morphology-based approaches (such as the Porter stemmer) that merge words with the same meaning. Unfortunately, these approaches require many language-dependent linguistic resources when working with multilingual data and multi-topic document collections… 
Detecting Inflection Patterns in Natural Language by Minimization of Morphological Model
TLDR
An unsupervised method for automatically recognizing inflection patterns, with no a priori information about the given language and based exclusively on a list of words extracted from a large text, shows promising results on different European languages.
Semantic Hyper-graph Based Representation of Nouns in the Kazakh Language
Abstract. We explain how semantic hyper-graphs are used to describe ontological models of morphological rules of agglutinative languages, with the Kazakh language as a case study. The vertices of

References

Testing Word Similarity: Language Independent Approach with Examples from Romance
TLDR
This paper considers a set of models (formulae) of a given class and selects the best ones using training and test samples and demonstrates how to construct such formulae for a given language using an inductive method of model self-organization.
Empirical Formula for Testing Word Similarity and Its Application for Constructing a Word Frequency List
TLDR
This work proposes a heuristic approximate method for identifying strings resulting from morphological variation of the same base meaning based on an empirical formula for testing the similarity of two words using large morphological dictionaries.
An algorithm for suffix stripping
TLDR
An algorithm for suffix stripping is described, which has been implemented as a short, fast program in BCPL and performs slightly better than a much more elaborate system with which it has been compared.
Approach to Construction of Automatic Morphological Analysis Systems for Inflective Languages with Little Effort
Development of morphological analysis systems for inflective languages is a tedious and laborious task. We suggest an approach to developing such systems that permits spending less time and
Exact and Approximate Prefix Search under Access Locality Requirements for Morphological Analysis and Spelling Correction
TLDR
A data structure useful for prefix search in a very large dictionary with a limited query string is discussed, with applications to morphological analysis and spelling correction.
Modern Information Retrieval
  • 1999
Manual on typical algorithms of modeling
  • Tehnika Publ., Kiev (in Russian)
  • 1980