• Corpus ID: 252355131

Entity-Centric Query Refinement

David Wadden, Nikita Gupta, Kenton Lee, Kristina Toutanova
We introduce the task of entity-centric query refinement. Given an input query whose answer is a (potentially large) collection of entities, the task output is a small set of query refinements meant to assist the user in efficient domain exploration and entity discovery. We propose a method to create a training dataset for this task. For a given input query, we use an existing knowledge base taxonomy as a source of candidate query refinements, and choose a final set of refinements from among… 

Learning to Attend, Copy, and Generate for Session-Based Query Suggestion

A customized sequence-to-sequence model for session-based query suggestion that employs a query-aware attention mechanism to capture the structure of the session context and outperforms the baselines both in generating queries and in scoring candidate queries.

A Hierarchical Recurrent Encoder-Decoder for Generative Context-Aware Query Suggestion

This work presents a novel hierarchical recurrent encoder-decoder architecture that makes it possible to account for sequences of previous queries of arbitrary length and is sensitive to the order of queries in the context while avoiding data sparsity.

Facetedpedia: dynamic generation of query-dependent faceted interfaces for wikipedia

Facetedpedia is a faceted retrieval system for information discovery and exploration in Wikipedia that builds upon the collaborative vocabulary in Wikipedia, more specifically the intensive internal structures (hyperlinks) and folksonomy (category system).

IntenT5: Search Result Diversification using Causal Language Models

This work explores the capacity of causal language models at text generation tasks and finds that to encourage diversity in the generated queries, it is beneficial to adapt the model by including a new Distributional Causal Language Modeling (DCLM) objective during fine-tuning and a representation replacement during inference.

Generating Clarifying Questions for Information Retrieval

A taxonomy of clarification for open-domain search queries is identified by analyzing large-scale query reformulation data sampled from Bing search logs, and supervised and reinforcement learning models for generating clarifying questions learned from weak supervision data are proposed.

Exploiting query reformulations for web search result diversification

A novel probabilistic framework for Web search result diversification is introduced, which explicitly accounts for the various aspects associated with an underspecified query. It diversifies a document ranking by estimating how well a given document satisfies each uncovered aspect and the extent to which different aspects are satisfied by the ranking as a whole.
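As a rough illustration of this kind of aspect-aware diversification, here is a minimal sketch and not the paper's implementation. It assumes a simple greedy re-ranker that mixes a document's relevance with its coverage of aspects not yet satisfied by the documents already selected. All names (`greedy_diversify`, `relevance`, `aspect_weight`, `coverage`, `lam`) and the toy scores are hypothetical.

```python
from typing import Dict, List


def greedy_diversify(
    relevance: Dict[str, float],            # assumed: P(doc is relevant to the query)
    aspect_weight: Dict[str, float],        # assumed: importance of each query aspect
    coverage: Dict[str, Dict[str, float]],  # assumed: P(doc satisfies aspect)
    k: int,
    lam: float = 0.5,                       # assumed trade-off: 0 = relevance only
) -> List[str]:
    """Greedily pick k documents, rewarding coverage of still-uncovered aspects."""
    selected: List[str] = []
    # not_covered[a]: probability that no selected document satisfies aspect a yet
    not_covered = {a: 1.0 for a in aspect_weight}
    candidates = set(relevance)

    while candidates and len(selected) < k:
        def score(doc: str) -> float:
            doc_cov = coverage.get(doc, {})
            diversity = sum(
                aspect_weight[a] * doc_cov.get(a, 0.0) * not_covered[a]
                for a in aspect_weight
            )
            return (1.0 - lam) * relevance[doc] + lam * diversity

        # sorted() makes tie-breaking deterministic
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
        # Aspects satisfied by the chosen document count for less from now on
        for a in aspect_weight:
            not_covered[a] *= 1.0 - coverage.get(best, {}).get(a, 0.0)

    return selected


# Toy example: d1 and d2 cover the same aspect, d3 covers a different one.
relevance = {"d1": 0.9, "d2": 0.85, "d3": 0.4}
aspect_weight = {"a": 0.5, "b": 0.5}
coverage = {"d1": {"a": 1.0}, "d2": {"a": 1.0}, "d3": {"b": 1.0}}

diversified = greedy_diversify(relevance, aspect_weight, coverage, k=2, lam=0.5)
relevance_only = greedy_diversify(relevance, aspect_weight, coverage, k=2, lam=0.0)
```

In this toy setting the diversified ranking prefers `d3` over the more relevant `d2` in the second position, because `d2` only repeats the aspect that `d1` already covers.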

Natural Questions: A Benchmark for Question Answering Research

The Natural Questions corpus, a question answering data set, is presented, introducing robust metrics for the purposes of evaluating question answering systems; demonstrating high human upper bounds on these metrics; and establishing baseline results using competitive methods drawn from related literature.

Entities as Experts: Sparse Memory Access with Entity Supervision

A new model, Entities as Experts (EaE), is introduced that can access distinct memories of the entities mentioned in a piece of text and is more modular and interpretable than the Transformer architecture on which it is based.

Learning Cross-Context Entity Representations from Text

It is shown that large scale training of neural models allows for high quality entity representations, and global entity representations encode fine-grained type categories, such as Scottish footballers, and can answer trivia questions such as: Who was the last inmate of Spandau jail in Berlin?

Asking Clarifying Questions in Open-Domain Information-Seeking Conversations

This paper formulates the task of asking clarifying questions in open-domain information-seeking conversational systems, proposes an offline evaluation methodology for the task, and collects a dataset, called Qulac, through crowdsourcing; the proposed model significantly outperforms competitive baselines.