Corpus ID: 244346065

The Power of Selecting Key Blocks with Local Pre-ranking for Long Document Information Retrieval

Minghan Li, Diana Nicoleta Popa, Johan Chagnon, Yagmur Gizem Cinar, Éric Gaussier
On a wide range of natural language processing and information retrieval tasks, transformer-based models, particularly pre-trained language models like BERT, have demonstrated tremendous effectiveness. Due to the quadratic complexity of the self-attention mechanism, however, such models have difficulty processing long documents. Recent works dealing with this issue include truncating long documents, segmenting them into passages that can be treated by a standard BERT model, or modifying the…
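The segmentation strategy mentioned above can be illustrated with a minimal sketch (not code from the paper): a long token sequence is cut into overlapping windows that each fit a standard BERT input of 512 tokens. The function name, the window length, and the stride of 256 are illustrative assumptions.

```python
def split_into_passages(tokens, max_len=512, stride=256):
    """Split a long token sequence into overlapping passages of at most
    `max_len` tokens, advancing by `stride` tokens each time."""
    passages = []
    start = 0
    while start < len(tokens):
        passages.append(tokens[start:start + max_len])
        if start + max_len >= len(tokens):
            break  # final window already covers the tail of the document
        start += stride
    return passages
```

Each passage can then be scored independently by a standard BERT ranker, with the per-passage scores aggregated into a document score.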


KeyBLD: Selecting Key Blocks with Local Pre-ranking for Long Document Information Retrieval
This work follows a slightly different approach in which one first selects key blocks of a long document by local query-block pre-ranking, and then aggregates few blocks to form a short document that can be processed by a model such as BERT.
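The block-selection idea described above can be sketched as follows. This is a hedged illustration, not the paper's implementation: each fixed-size block is scored against the query with a simple lexical overlap (standing in for the local query-block pre-ranking signal, e.g. BM25), the top-k blocks are kept, and they are concatenated in their original document order to form a short pseudo-document that a model such as BERT can process.

```python
def select_key_blocks(query_terms, blocks, k=3):
    """Keep the k blocks that best match the query, preserving document order.

    `blocks` is a list of blocks, each a list of terms. The overlap score
    here is a stand-in for a proper local pre-ranking function like BM25.
    """
    scored = [(sum(term in block for term in set(query_terms)), i)
              for i, block in enumerate(blocks)]
    top = sorted(scored, reverse=True)[:k]
    kept = sorted(i for _, i in top)  # restore original block order
    return [blocks[i] for i in kept]
```

The selected blocks together form a short document, so the downstream BERT model only ever sees query-relevant content within its input-length budget.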
Local Self-Attention over Long Text for Efficient Document Retrieval
A local self-attention is proposed that considers a moving window over the document terms, with each term attending only to other terms in the same window, enabling retrieval over longer documents at a moderate increase in compute and memory costs.
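The windowed attention pattern behind this approach can be sketched as a banded boolean mask (an illustrative simplification, not the paper's code): position i attends only to positions within w steps of it, so attention cost grows as O(n·w) rather than O(n²).

```python
def local_attention_mask(n, w):
    """Boolean mask for windowed self-attention over a sequence of length n:
    entry [i][j] is True iff token i may attend to token j, i.e. |i - j| <= w."""
    return [[abs(i - j) <= w for j in range(n)] for i in range(n)]
```

In practice such a mask is applied by setting disallowed attention logits to negative infinity before the softmax.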
Intra-Document Cascading: Learning to Select Passages for Neural Document Ranking
The proposed Intra-Document Cascaded Ranking Model (IDCM) achieves a more than fourfold reduction in query latency while providing essentially the same effectiveness as state-of-the-art BERT-based document ranking models.
Long Document Ranking with Query-Directed Sparse Transformer
This paper designs Query-Directed Sparse attention, which induces IR-axiomatic structures in transformer self-attention, enforcing the principal properties desired in ranking: local contextualization, hierarchical representation, and query-oriented proximity matching, while also gaining efficiency from sparsity.
Globalizing BERT-based Transformer Architectures for Long Document Summarization
This work introduces a novel hierarchical propagation layer that spreads information between multiple transformer windows, adopting a hierarchical approach in which the input is divided into multiple blocks that are independently processed by scaled dot-product attention and combined across successive layers.
Modeling Diverse Relevance Patterns in Ad-hoc Retrieval
This work proposes a data-driven method to allow relevance signals at different granularities to compete with each other for final relevance assessment and demonstrates that the HiNT model outperforms existing state-of-the-art retrieval models significantly on benchmark ad-hoc retrieval datasets.
Interpretable & Time-Budget-Constrained Contextualization for Re-Ranking
TK (Transformer-Kernel) is proposed: a neural re-ranking model for ad-hoc search using an efficient contextualization mechanism that achieves the highest effectiveness in comparison to BERT and other re-ranking models.
Learning deep structured semantic models for web search using clickthrough data
A series of new latent semantic models with a deep structure is developed that projects queries and documents into a common low-dimensional space, where the relevance of a document to a query is readily computed as the distance between them.
NPRF: A Neural Pseudo Relevance Feedback Framework for Ad-hoc Information Retrieval
This work proposes an end-to-end neural PRF framework that can be used with existing neural IR models by embedding different neural models as building blocks, and confirms the effectiveness of the proposed NPRF framework in improving the performance of two state-of-the-art neural IR models.
CEDR: Contextualized Embeddings for Document Ranking
This work investigates how two pretrained contextualized language models (ELMo and BERT) can be utilized for ad-hoc document ranking, and proposes a joint approach that incorporates BERT's classification vector into existing neural models, showing that it outperforms state-of-the-art ad-hoc ranking baselines.