Josiane Mothe

Query difficulty can be linked to a number of causes. Some of these causes can be related to the query expression itself, and can therefore be detected through a linguistic analysis of the query text. Using 16 different linguistic features, automatically computed on TREC queries, we looked for significant correlations between these features and the average…
This paper presents a novel user interface that provides global visualisations of large document sets in order to help users to formulate the query that corresponds to their information needs and to access the corresponding documents. An important element of the approach we introduce is the use of concept hierarchies (CHs) in order to structure the document…
The use case of the Tweet Contextualization task is the following: given a new tweet, participating systems must provide some context about the subject of a tweet, in order to help the reader to understand it. In this task, contextualizing tweets consists in answering questions of the form “what is this tweet about?” which can be answered by several…
To evaluate Information Retrieval Systems on their effectiveness, evaluation programs such as TREC offer a rigorous methodology as well as benchmark collections. Whatever the evaluation collection used, effectiveness is generally considered globally, averaging the results over a set of information needs. As a result, the variability of system performance is…
Twitter is increasingly used for on-line client and audience fishing, which motivated the tweet contextualization task at INEX. The objective is to help a user understand a tweet by providing a short summary (500 words). This summary should be built automatically using local resources like Wikipedia and generated by extracting relevant…
The CLEF Cultural micro-blog Contextualization Workshop aims to provide the research community with data sets to gather, organize and deliver relevant social data related to events generating a large number of micro-blog posts and web documents. It is also devoted to discussing tasks that could be run on these data sets and that could serve applications.
Feature selection in learning to rank has recently emerged as a crucial issue. Whereas several preprocessing approaches have been proposed, only a few have focused on integrating feature selection into the learning process. In this paper, we propose a general framework for feature selection in learning to rank using support vector machines with a sparse…
This paper introduces a new approach to provide users with solutions to explore a domain via an information space. A key point in our approach is that information searching and exploring takes place in a domain-dependent semantic context. A given context is described through its vocabulary organised along hierarchies that structure the information space…