Knowledge-poor Approach to Constructing Word Frequency Lists, with Example from Romance Languages
@article{Alexandrov2004KnowledgepoorAT, title={Knowledge-poor Approach to Constructing Word Frequency Lists, with Example from Romance Languages}, author={Mikhail Alexandrov and Xavier Blanco and Alexander Gelbukh and Pavel P. Makagonov}, journal={Proces. del Leng. Natural}, year={2004}, volume={33} }
Word lists with their frequencies are widely used in many text clustering and categorization procedures. Such lists are usually compiled with morphology-based approaches (such as Porter's stemmer) that merge words with the same meaning. Unfortunately, such approaches require many language-dependent linguistic resources when working with multilingual data and multi-topic document collections…
2 Citations
Detecting Inflection Patterns in Natural Language by Minimization of Morphological Model
- Computer Science, CIARP
- 2004
An unsupervised method for automatically recognizing inflection patterns, with no a priori information about the given language and based exclusively on a list of words extracted from a large text, shows promising results on different European languages.
Semantic Hyper-graph Based Representation of Nouns in the Kazakh Language
- Computer Science, Computación y Sistemas
- 2014
Abstract. We explain how semantic hyper-graphs are used to describe ontological models of morphological rules of agglutinative languages, with the Kazakh language as a case study. The vertices of…
References
Testing Word Similarity: Language Independent Approach with Examples from Romance
- Computer Science, NLDB
- 2004
This paper considers a set of models (formulae) of a given class and selects the best ones using training and test samples and demonstrates how to construct such formulae for a given language using an inductive method of model self-organization.
Empirical Formula for Testing Word Similarity and Its Application for Constructing a Word Frequency List
- Linguistics, CICLing
- 2002
This work proposes a heuristic approximate method for identifying strings resulting from morphological variation of the same base meaning based on an empirical formula for testing the similarity of two words using large morphological dictionaries.
An algorithm for suffix stripping
- Linguistics, Program
- 1980
An algorithm for suffix stripping is described, which has been implemented as a short, fast program in BCPL and performs slightly better than a much more elaborate system with which it has been compared.
Approach to Construction of Automatic Morphological Analysis Systems for Inflective Languages with Little Effort
- Computer Science, CICLing
- 2003
Development of morphological analysis systems for inflective languages is a tedious and laborious task. We suggest an approach to developing such systems that makes it possible to spend less time and…
Exact and Approximate Prefix Search under Access Locality Requirements for Morphological Analysis and Spelling Correction
- Computer Science, Computación y Sistemas
- 2003
A data structure useful for prefix search in a very large dictionary with a limited query string is discussed, with applications to morphological analysis and spelling correction.
Modern Information Retrieval
- 1999
Manual on typical algorithms of modeling
- Tehnika Publ., Kiev (in Russian)
- 1980