This article describes two systems that participated in the TweetLID-2014 competition on language detection in tweets. The systems are based on two different strategies: ranked dictionaries and Naive Bayes classifiers. The results show that ranked dictionaries perform better with small training corpora whose language distribution is similar to that of …
In this paper, we describe the compilation and structure of two linguistic resources developed for Galician: a corpus and a dictionary of terms from the field of economics. We also describe the use of these resources for the automatic extraction of multi-word terms by means of a combination of linguistic and statistical techniques. While doing …
Abstract: This article proposes a method for extracting lexical translation equivalents from parallel corpora that are aligned at levels above the sentence and morphosyntactically tagged. The strategy we follow takes into account the discriminating and disambiguating effect that the local context exerts on an element …
This paper describes the language components of a system for Document Routing in Spanish. The system identifies the terms relevant for classification within the documents involved by means of natural language processing techniques. These techniques are based on the isolation and normalization of syntactic units considered relevant for the classification, …