A Discriminative Semantic Ranker for Question Retrieval

  • Authors: Yinqiong Cai, Yixing Fan, Jiafeng Guo, Ruqing Zhang, Yanyan Lan, Xueqi Cheng
  • Venue: Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval
  • Published 2021
  • Computer Science
Similar question retrieval is a core task in community-based question answering (CQA) services. To balance effectiveness and efficiency, a question retrieval system is typically implemented as a pipeline of multi-stage rankers: the first-stage ranker aims to recall potentially relevant questions from a large repository, and the later stages re-rank the retrieved results. Most existing work on question retrieval has focused on the re-ranking stages, leaving the first-stage ranker to some…
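The multi-stage design described in the abstract can be illustrated with a minimal sketch. This is not the paper's model: the term-overlap recall function, the Jaccard re-ranker, the function names `first_stage` and `rerank`, and the toy repository are all assumptions chosen only to show how a cheap recall stage narrows the repository before a costlier stage reorders the candidates.

```python
def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


def first_stage(query, repository, k=3):
    """Recall stage: score every question by raw term overlap, keep the top k."""
    q_terms = set(tokenize(query))
    scored = []
    for idx, question in enumerate(repository):
        overlap = len(q_terms & set(tokenize(question)))
        if overlap > 0:
            scored.append((overlap, idx))
    # Highest overlap first; ties broken by repository position.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]


def rerank(query, repository, candidates):
    """Re-ranking stage: Jaccard similarity, computed on the candidates only."""
    q_terms = set(tokenize(query))

    def jaccard(idx):
        d_terms = set(tokenize(repository[idx]))
        return len(q_terms & d_terms) / len(q_terms | d_terms)

    return sorted(candidates, key=lambda idx: (-jaccard(idx), idx))


# Hypothetical toy repository of archived questions.
repository = [
    "how do i reset my password",
    "how to bake bread at home",
    "reset my password",
    "best way to learn python",
]
query = "how to reset my password"

candidates = first_stage(query, repository, k=3)
ranking = rerank(query, repository, candidates)
print(candidates, ranking)
```

In this sketch the second stage only ever sees the candidates returned by the first, so a relevant question missed at recall time cannot be recovered later, which is why the quality of the first-stage ranker matters.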


Semantic Models for the First-stage Retrieval: A Comprehensive Review
The current landscape of first-stage retrieval models is described under a unified framework to clarify the connection between classical term-based retrieval methods, early semantic retrieval methods, and neural semantic retrieval methods.


A syntactic tree matching approach to finding similar questions in community-based qa services
This paper proposes a new retrieval framework based on syntactic tree structure to tackle the similar question matching problem and builds a ground-truth set from Yahoo! Answers; experimental results show that the method outperforms traditional bag-of-words and tree-kernel-based methods.
FAQ Retrieval Using Attentive Matching
This paper proposes multiple deep learning architectures designed for FAQ Retrieval that eliminate the need for feature engineering and are able to elegantly combine both query-question and query-answer similarities.
Efficient Cost-Aware Cascade Ranking in Multi-Stage Retrieval
This paper presents a novel approach to optimizing cascaded ranking models which can directly leverage a variety of different state-of-the-art LTR rankers such as LambdaMART and Gradient Boosted Decision Trees.
Match²: A Matching over Matching Model for Similar Question Identification
This work proposes a novel matching over matching model, namely Match², which compares the matching patterns between two question-answer pairs for similar question identification, and demonstrates that this model can significantly outperform previous state-of-the-art methods on the similar question identification task.
From Neural Re-Ranking to Neural Ranking: Learning a Sparse Representation for Inverted Indexing
The results demonstrate the importance of sparsity in neural IR models and show that dense representations can be pruned effectively, giving new insights about essential semantic features and their distributions.
Phrase-Based Translation Model for Question Retrieval in Community Question Answer Archives
Experiments demonstrate that the proposed phrase-based translation model significantly outperforms the state-of-the-art word-based translation model for question retrieval.
Learning deep structured semantic models for web search using clickthrough data
A series of new latent semantic models with a deep structure is developed; the models project queries and documents into a common low-dimensional space where the relevance of a document given a query is readily computed as the distance between them.
A Dual Embedding Space Model for Document Ranking
The proposed Dual Embedding Space Model (DESM) captures evidence on whether a document is about a query term in addition to what is modelled by traditional term-frequency based approaches, and shows that the DESM can re-rank top documents returned by a commercial Web search engine, like Bing, better than a term-matching based signal like TF-IDF.
End-to-End Retrieval in Continuous Space
This work considers the problem of end-to-end continuous retrieval, where standard approximate nearest neighbor (ANN) search replaces the usual discrete inverted index and relies entirely on distances between learned embeddings.
Learning to rank for information retrieval
Three major approaches to learning to rank are introduced, i.e., the pointwise, pairwise, and listwise approaches; the relationship between the loss functions used in these approaches and the widely used IR evaluation measures is analyzed, and the performance of these approaches on the LETOR benchmark datasets is evaluated.