Learn More
This paper describes the use of a probabilistic translation model to cross-language IR (CLIR). The performance of this approach is compared with that using machine translation (MT). It is shown that using a probabilistic model, we are able to obtain performances close to those using an MT system. In addition, we also investigated the possibility of(More)
Pseudo-relevance feedback assumes that most frequent terms in the pseudo-feedback documents are useful for the retrieval. In this study, we re-examine this assumption and show that it does not hold in reality - many expansion terms identified in traditional approaches are indeed unrelated to the query and harmful to the retrieval. We also show that good(More)
Query clustering is a process used to discover frequently asked questions or most popular topics on a search engine. This process is crucial for search engines based on question-answering. Because of the short lengths of queries, approaches based on keywords are not suitable for query clustering. This paper describes a new query clustering method that makes(More)
We present a novel response generation system that can be trained end to end on large quantities of unstructured Twitter conversations. A neural network architecture is used to address sparsity issues that arise when integrating contextual information into classic statistical models, allowing the system to take into account previous dialog utterances. Our(More)
Traditional information retrieval (IR) models use bag-of-words as the basic representation and assume that some form of independence holds between terms. Representing term dependencies and defining a scoring function capable of integrating such additional evidence is theoretically and practically challenging. Recently, Quantum Theory (QT) has been proposed(More)
Query expansion has long been suggested as an effective way to resolve the short query and word mismatching problems. A number of query expansion methods have been proposed in traditional information retrieval. However, these previous methods do not take into account the specific characteristics of web searching; in particular, of the availability of large(More)
This paper presents a new dependence language modeling approach to information retrieval. The approach extends the basic language modeling approach based on unigram by relaxing the independence assumption. We integrate the linkage of a query as a hidden variable, which expresses the term dependencies within the query as an acyclic, planar, undirected graph.(More)