The steady growth of textual document collections is a key driver of progress in modern information retrieval techniques, whose effectiveness and efficiency are constantly challenged. Given a user query, the number of retrieved documents can be overwhelmingly large, making it hard for the user to exploit them efficiently. In addition, retaining only …
In this paper, we use a minimal generic base of association rules between terms to automatically enrich an existing ontology. Such associations between terms enable the domain expert to enhance the existing ontology when those terms are not already defined in it. Three distance measures are defined to bring closer these candidate …
Tweets are short messages that do not exceed 140 characters. Because they must be written within this limit, they use a particular vocabulary. To make them understandable to a reader, it is therefore necessary to know their context. In this paper, we describe the approach we submitted to the tweet contextualization track at CLEF 2014 (Conference and …
This article describes a new method for building comparable corpora from Twitter. Our strategy relies on the fact that Twitter is one of the most popular online microblogging services, allowing large audiences to express their thoughts and reactions about specific events or breaking news in various languages. Given two languages and a particular topic, we propose the …