Rethinking search

Donald Metzler, Yi Tay, Dara Bahri, Marc Najork
ACM SIGIR Forum, pages 1-27
When experiencing an information need, users want to engage with a domain expert, but often turn to an information retrieval system, such as a search engine, instead. Classical information retrieval systems do not answer information needs directly, but instead provide references to (hopefully authoritative) answers. Successful question answering systems offer a limited corpus created on-demand by human experts, which is neither timely nor scalable. Pre-trained language models, by contrast, are… 
1 Citation

Situating Search
  • C. Shah, Emily M. Bender
  • Computer Science
    ACM SIGIR Conference on Human Information Interaction and Retrieval
  • 2022
It is argued that removing or reducing interactions in an effort to retrieve presumably more relevant information can be detrimental to many fundamental aspects of search, including information verification, information literacy, and serendipity.


The NarrativeQA Reading Comprehension Challenge
A new dataset and set of tasks in which the reader must answer questions about stories by reading entire books or movie scripts are presented, designed so that successfully answering their questions requires understanding the underlying narrative rather than relying on shallow pattern matching or salience.
Key-Value Memory Networks for Directly Reading Documents
This work introduces a new method, Key-Value Memory Networks, that makes reading documents more viable by utilizing different encodings in the addressing and output stages of the memory read operation.
Latent Retrieval for Weakly Supervised Open Domain Question Answering
It is shown for the first time that the retriever and reader can be learned jointly from question-answer string pairs, without any IR system, outperforming BM25 by up to 19 points in exact match.
Language Models are Unsupervised Multitask Learners
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
Quasar: Datasets for Question Answering by Search and Reading
We present two new large-scale datasets aimed at evaluating systems designed to comprehend a natural language query and extract its answer from a large corpus of text. The Quasar-S dataset consists
SearchQA: A New Q&A Dataset Augmented with Context from a Search Engine
It is shown that there is a meaningful gap between the human and machine performances, which suggests that the proposed dataset could well serve as a benchmark for question-answering.
Co-PACRR: A Context-Aware Neural IR Model for Ad-hoc Retrieval
This work highlights three potential shortcomings caused by not considering context information and proposes three neural ingredients to address them: a disambiguation component, cascade k-max pooling, and a shuffling combination layer. Together these yield Co-PACRR, a novel context-aware neural IR model.
Machine Comprehension Using Match-LSTM and Answer Pointer
This work proposes an end-to-end neural architecture for the Stanford Question Answering Dataset (SQuAD), based on match-LSTM, a model previously proposed for textual entailment, and Pointer Net, a sequence-to-sequence model proposed by Vinyals et al. (2015) to constrain the output tokens to be from the input sequences.
Get To The Point: Summarization with Pointer-Generator Networks
A novel architecture is proposed that augments the standard sequence-to-sequence attentional model in two orthogonal ways, using a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator.
An Introduction to Neural Information Retrieval
The monograph provides a complete picture of neural information retrieval techniques that culminate in supervised neural learning to rank models including deep neural network architectures that are trained end-to-end for ranking tasks.