José Ramom Pichel Campos

This article describes two systems that participated in the TweetLID-2014 competition, which focused on language detection in tweets. The systems are based on two different strategies: ranked dictionaries and Naive Bayes classifiers. The results show that the ranked-dictionary strategy performs better with small training corpora whose language distribution is similar to that of…
So far, research on extraction of translation equivalents from comparable, non-parallel corpora has not been very popular. The main reason was the poor results when compared to those obtained from aligned parallel corpora. The method proposed in this paper, relying on seed patterns generated from external bilingual dictionaries, allows us to achieve similar…
At imaxin|software we carried out a project, funded by the Dirección Xeral de I+D+i of the Xunta de Galicia, called "RecursOpentrad: Recursos lingüístico-…
This work is licensed under a Creative Commons Attribution 3.0 License.
So far, research on extraction of word translations from comparable, non-parallel corpora has not been very popular. The main reason was the poor results when compared to those obtained from aligned parallel corpora. The method proposed in this paper, relying on seed contexts generated from external bilingual dictionaries, allows us to achieve results…
Abstract: This article describes a strategy for the lexical normalization of out-of-vocabulary (OOV) words in tweets written in Spanish. To correct erroneous OOV words, the normalization system generates in-vocabulary (IV) candidates that appear in different lexical resources and selects the most suitable one. Our…