• Corpus ID: 8535316

Bidirectional Attention Flow for Machine Comprehension

Minjoon Seo, Aniruddha Kembhavi, Ali Farhadi, Hannaneh Hajishirzi
Machine comprehension (MC), answering a query about a given context paragraph, requires modeling complex interactions between the context and the query. In this paper we introduce the Bi-Directional Attention Flow (BIDAF) network, a multi-stage hierarchical process that represents the context at different levels of granularity and uses a bi-directional attention flow mechanism to obtain a query-aware context representation without early summarization. Our experimental evaluations show that our…
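The attention step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the similarity function, so a plain dot product is assumed here, and the function name `bidirectional_attention` and the final concatenation are choices made for this sketch. It uses plain Python lists to stay dependency-free. The point it shows is that the output keeps one vector per context position instead of collapsing the context into a single summary vector.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    """Sum of vectors, each scaled by its weight."""
    dim = len(vectors[0])
    return [sum(w * vec[k] for w, vec in zip(weights, vectors)) for k in range(dim)]


def bidirectional_attention(context, query):
    """context: T vectors, query: J vectors, all of the same dimension d.

    Returns T vectors of dimension 4d (one per context position).
    """
    # Similarity matrix S[t][j]. Assumption: dot-product similarity.
    sim = [[dot(h, u) for u in query] for h in context]

    # Context-to-query: each context position attends over the query words.
    c2q = [weighted_sum(softmax(row), query) for row in sim]

    # Query-to-context: weight each context position by its best match
    # to any query word, then build one attended context vector.
    position_weights = softmax([max(row) for row in sim])
    q2c = weighted_sum(position_weights, context)

    # Merge per position; no early summarization of the context.
    merged = []
    for h, u_att in zip(context, c2q):
        h_times_u = [x * y for x, y in zip(h, u_att)]
        h_times_q2c = [x * y for x, y in zip(h, q2c)]
        merged.append(h + u_att + h_times_u + h_times_q2c)
    return merged


context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy context, T = 3
query = [[1.0, 0.0], [0.0, 1.0]]                # toy query, J = 2
G = bidirectional_attention(context, query)
print(len(G), len(G[0]))
```

In a full model the inputs would be contextual embeddings and the similarity function would have trainable parameters; both are left out here to keep the attention flow itself visible.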


Gated Self-Attention for SQuAD Question Answering

  • Computer Science
  • 2021
This paper builds on BiDAF, the multi-stage hierarchical process described in Seo et al. (2017)'s Bi-Directional Attention Flow for Machine Comprehension, by introducing a modified form of BiDAF that uses both an LSTM and a CNN in its encoding layer, together with BiDAF's context-query attention layer followed by R-Net's self-attention layer.

Pay More Attention - Neural Architectures for Question-Answering

This paper studies two state-of-the-art attention mechanisms, Bi-Directional Attention Flow (BiDAF) and Dynamic Co-Attention Network (DCN). It proposes a hybrid scheme combining the two architectures that gives better overall performance, and suggests a new, simpler attention mechanism that provides better results.

GSA-Net: gated scaled dot-product attention based neural network for reading comprehension

A Gated Scaled Dot-Product Attention based model for the RC task, in which character-level embedding is incorporated into the word embedding to help deal with out-of-vocabulary tokens, and a self-matching attention mechanism is adopted to address long-distance dependencies.

Unilateral Multi-Perspective Matching for Machine Comprehension

It is discovered that Unilateral Multi-Perspective Matching is a powerful tool and boosts performance significantly, as demonstrated by F1 and EM scores of 66.2% and 54.6%, respectively, on the SQuAD test set.
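For readers unfamiliar with the two metrics, the sketch below shows how exact match (EM) and token-level F1 are commonly computed for SQuAD-style answers. The summary above does not say how its scores were produced, so this is an assumption-laden illustration: the function names are hypothetical, the normalization (lowercasing, stripping punctuation and English articles) follows the usual SQuAD convention, and taking the maximum over several reference answers is omitted.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower", "eiffel tower."))
print(f1_score("in Paris France", "Paris"))
```

EM rewards only answers that match after normalization, while F1 gives partial credit for overlapping tokens, which is why reported F1 is typically higher than EM.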

A new model for Machine Comprehension via multi-perspective context matching and bidirectional attention flow

This work combines the ideas of two state-of-the-art models (BiDAF and MPCM) with new ideas to obtain a new model for the question answering task; promising experimental results on the SQuAD test set encourage further work on the proposed model.

Question Answering with Gated Attention and Multitask Learning-Option 3 (Graded)

This paper starts with BiDAF and adds character-level embedding and a self-attention mechanism with gated residual blocks to enhance performance. It also discusses QANet's non-recurrent architecture and applies character-level embedding and a residual attention module to QANet.



[Title not recovered; extracted figure labels: Query, Attention, GloVe, CNN, Attention Flow Layer, Modeling Layer, Output Layer]

This paper discusses how to apply convolutional neural networks (CNNs) to the machine comprehension task, incorporates CNNs into existing bidirectional attention-flow mechanisms, and compares their performance to RNN-based models.

Hierarchical Attention: What Really Counts in Various NLP Tasks

This paper argues that the lack of hierarchical mechanisms is a bottleneck in improving the performance of attention mechanisms, and proposes a novel Hierarchical Attention Mechanism (Ham) based on the weighted sum of different layers of a multi-level attention.

Fine-Grained Gating Based on Question-Summary for Machine Comprehension

This paper proposes to use a fine-grained gating mechanism that controls the flow of information from the Context Encoder towards the Interaction Encoder, based on the Question-summary and the input vector at some time step.



Gated-Attention Readers for Text Comprehension

The Gated-Attention (GA) Reader, a model that integrates a multi-hop architecture with a novel attention mechanism, which is based on multiplicative interactions between the query embedding and the intermediate states of a recurrent neural network document reader, enables the reader to build query-specific representations of tokens in the document for accurate answer selection.

Attention-over-Attention Neural Networks for Reading Comprehension

A simple but novel model, called the attention-over-attention reader, for better solving cloze-style reading comprehension tasks; it outperforms various state-of-the-art systems by a large margin on public datasets such as CNN and Children's Book Test.

Learning Recurrent Span Representations for Extractive Question Answering

This paper presents a novel model architecture that efficiently builds fixed length representations of all spans in the evidence document with a recurrent network, and shows that scoring explicit span representations significantly improves performance over other approaches that factor the prediction into separate predictions about words or start and end markers.

Text Understanding with the Attention Sum Reader Network

A new, simple model is presented that uses attention to directly pick the answer from the context as opposed to computing the answer using a blended representation of words in the document as is usual in similar models, making the model particularly suitable for question-answering problems where the answer is a single word from the document.

Iterative Alternating Neural Attention for Machine Reading

This work proposes a novel neural attention architecture to tackle machine comprehension tasks, such as answering Cloze-style queries with respect to a document; it outperforms state-of-the-art baselines on standard machine comprehension benchmarks such as the CNN news articles and the Children's Book Test datasets.

Machine Comprehension Using Match-LSTM and Answer Pointer

This work proposes an end-to-end neural architecture for the Stanford Question Answering Dataset (SQuAD), based on match-LSTM, a model previously proposed for textual entailment, and Pointer Net, a sequence-to-sequence model proposed by Vinyals et al. (2015) to constrain the output tokens to be from the input sequences.

End-to-End Answer Chunk Extraction and Ranking for Reading Comprehension

This paper proposes dynamic chunk reader (DCR), an end-to-end neural reading comprehension (RC) model that is able to extract and rank a set of answer candidates from a given document to answer

Dynamic Memory Networks for Visual and Textual Question Answering

The new DMN+ model improves the state of the art on both the Visual Question Answering dataset and the bAbI-10k text question-answering dataset without supporting fact supervision.

Stacked Attention Networks for Image Question Answering

A multiple-layer SAN is developed in which an image is queried multiple times to infer the answer progressively, and visualization of the attention layers illustrates how the SAN locates, layer by layer, the relevant visual clues that lead to the answer to the question.

Dynamic Coattention Networks For Question Answering

The Dynamic Coattention Network (DCN) for question answering first fuses co-dependent representations of the question and the document in order to focus on relevant parts of both, then a dynamic pointing decoder iterates over potential answer spans to recover from initial local maxima corresponding to incorrect answers.