Corpus ID: 221857196

Using the Hammer Only on Nails: A Hybrid Method for Evidence Retrieval for Question Answering

Zhengzhong Liang, Yiyun Zhao, M. Surdeanu
Evidence retrieval is a key component of explainable question answering (QA). We argue that, despite recent progress, transformer network-based approaches such as universal sentence encoder (USE-QA) do not always outperform traditional information retrieval (IR) methods such as BM25 for evidence retrieval for QA. We introduce a lexical probing task that validates this observation: we demonstrate that neural IR methods have the capacity to capture lexical differences between questions and…

Latent Retrieval for Weakly Supervised Open Domain Question Answering
It is shown for the first time that it is possible to jointly learn the retriever and reader from question-answer string pairs without any IR system, outperforming BM25 by up to 19 points in exact match.
ReQA: An Evaluation for End-to-End Answer Retrieval Models
Retrieval Question Answering (ReQA), a benchmark for evaluating large-scale sentence- and paragraph-level answer retrieval models, is introduced, and baselines are established using both neural encoding models and classical information retrieval techniques.
Answering questions by learning to rank - Learning to rank by answering questions
This article describes a method that can be used to semantically rank documents extracted from Wikipedia or similar natural language corpora, and proposes a model employing this semantic ranking that holds first place in two of the most popular leaderboards for answering multiple-choice questions.
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
A new kind of question answering dataset, OpenBookQA, is modeled after open book exams for assessing human understanding of a subject; oracle experiments designed to circumvent the knowledge retrieval bottleneck demonstrate the value of both the open book and additional facts.
Revealing the Importance of Semantic Retrieval for Machine Reading at Scale
This work proposes a simple yet effective pipeline system that gives special consideration to hierarchical semantic retrieval at both the paragraph and sentence level and to its potential effects on the downstream task, and illustrates that intermediate semantic retrieval modules are vital for shaping upstream data distribution and providing better data for downstream modeling.
Neural Ranking Models with Weak Supervision
This paper proposes to train a neural ranking model using weak supervision, where labels are obtained automatically without human annotators or any external resources, and suggests that supervised neural ranking models can greatly benefit from pre-training on large amounts of weakly labeled data that can be easily obtained from unsupervised IR models.
Natural Questions: A Benchmark for Question Answering Research
The Natural Questions corpus, a question answering data set, is presented, introducing robust metrics for the purposes of evaluating question answering systems; demonstrating high human upper bounds on these metrics; and establishing baseline results using competitive methods drawn from related literature.
A Deep Look into Neural Ranking Models for Information Retrieval
A deep look into the neural ranking models from different dimensions is taken to analyze their underlying assumptions, major design principles, and learning strategies to obtain a comprehensive empirical understanding of the existing techniques.
Multilingual Universal Sentence Encoder for Semantic Retrieval
On transfer learning tasks, the multilingual embeddings approach, and in some cases exceed, the performance of English-only sentence embeddings.
Bridging the lexical chasm: statistical approaches to answer-finding
It is shown that the task of “answer-finding” differs from both document retrieval and traditional question-answering, presenting challenges different from those found in these problems.