Corpus ID: 8535316

Bidirectional Attention Flow for Machine Comprehension

@article{Seo2017BidirectionalAF,
  title={Bidirectional Attention Flow for Machine Comprehension},
  author={Minjoon Seo and Aniruddha Kembhavi and Ali Farhadi and Hannaneh Hajishirzi},
  journal={ArXiv},
  year={2017},
  volume={abs/1611.01603}
}
Machine comprehension (MC), answering a query about a given context paragraph, requires modeling complex interactions between the context and the query. [...] In this paper we introduce the Bi-Directional Attention Flow (BIDAF) network, a multi-stage hierarchical process that represents the context at different levels of granularity and uses a bi-directional attention flow mechanism to obtain a query-aware context representation without early summarization. Our experimental evaluations show that our…
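As a rough illustration of the attention-flow layer described in the abstract, here is a minimal numpy sketch of the context-to-query and query-to-context attention built from a shared similarity matrix. Variable names follow the paper's H (context) and U (query) notation, but the trilinear similarity weights are random placeholders rather than trained parameters.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def bidaf_attention(H, U, w):
    """Bidirectional attention flow (sketch).

    H: context encodings, shape (T, d)
    U: query encodings, shape (J, d)
    w: trilinear similarity weights, shape (3*d,)
    Returns G, shape (T, 4*d): a query-aware context representation.
    """
    T, d = H.shape
    J, _ = U.shape
    # Similarity matrix S[t, j] = w . [h; u; h*u]
    S = np.zeros((T, J))
    for t in range(T):
        for j in range(J):
            S[t, j] = w @ np.concatenate([H[t], U[j], H[t] * U[j]])
    # Context-to-query attention: an attended query vector for every context word.
    a = softmax(S, axis=1)            # (T, J)
    U_tilde = a @ U                   # (T, d)
    # Query-to-context attention: attend over the context words most relevant to any query word.
    b = softmax(S.max(axis=1))        # (T,)
    h_tilde = b @ H                   # (d,)
    H_tilde = np.tile(h_tilde, (T, 1))
    # Combine per position, without summarizing the context into one fixed vector.
    G = np.concatenate([H, U_tilde, H * U_tilde, H * H_tilde], axis=1)  # (T, 4*d)
    return G

# Toy usage with random encodings.
rng = np.random.default_rng(0)
T, J, d = 6, 4, 8
G = bidaf_attention(rng.normal(size=(T, d)), rng.normal(size=(J, d)), rng.normal(size=3 * d))
print(G.shape)  # (6, 32)
```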
Gated Self-Attention for SQuAD Question Answering
  • 2021
Machine comprehension and question answering are central questions in natural language processing, as they require modeling interactions between the passage and the question. In this paper, we build…
Pay More Attention - Neural Architectures for Question-Answering
This paper studies two state-of-the-art attention mechanisms, Bi-Directional Attention Flow (BiDAF) and the Dynamic Co-Attention Network (DCN), proposes a hybrid scheme combining the two architectures that gives better overall performance, and suggests a new, simpler attention mechanism that provides better results.
GSA-Net: gated scaled dot-product attention based neural network for reading comprehension
A Gated Scaled Dot-Product Attention based model for the RC task, in which character-level embeddings are incorporated into the word embeddings to help deal with out-of-vocabulary tokens, and a self-matching attention mechanism is adopted to resolve the problem of long-distance dependency.
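A small numpy sketch of what a gated scaled dot-product attention layer could look like; the placement of the gate (mixing the attended vector with the original input) is an assumption for illustration, not necessarily the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_scaled_dot_attention(Q, K, V, Wg, bg):
    """Scaled dot-product attention followed by a learned gate (sketch).

    Q: (n, d) queries; K, V: (m, d) keys and values.
    Wg, bg: gate parameters; the gate decides, per dimension, how much
    attended information to let through at each query position.
    """
    d = Q.shape[1]
    scores = Q @ K.T / np.sqrt(d)           # (n, m) scaled dot products
    attended = softmax(scores, axis=1) @ V  # (n, d) attended values
    gate = sigmoid(np.concatenate([Q, attended], axis=1) @ Wg + bg)  # (n, d)
    return gate * attended + (1.0 - gate) * Q  # gated mixture of input and attended info

rng = np.random.default_rng(1)
n, m, d = 5, 7, 8
out = gated_scaled_dot_attention(
    rng.normal(size=(n, d)), rng.normal(size=(m, d)), rng.normal(size=(m, d)),
    rng.normal(size=(2 * d, d)), np.zeros(d))
print(out.shape)  # (5, 8)
```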
Unilateral Multi-Perspective Matching for Machine Comprehension
Machine Comprehension (MC) involves answering a question with a given context paragraph. Recently, methods involving attention between the question and the paragraph have proved successful in this…
A new model for Machine Comprehension via multi-perspective context matching and bidirectional attention flow
To answer a question about a context paragraph, a complex model of the interactions between the two is needed. Previous Machine Comprehension (MC) datasets were either not large enough to train…
Question Answering with Gated Attention and Multitask Learning-Option 3 (Graded)
Recently, there has been growing interest in research on machine text comprehension and automatic question answering. Many state-of-the-art QA models consist of two components: recurrent neural…
Hierarchical Attention: What Really Counts
  • 2018
Attention mechanisms in sequence to sequence models have shown great ability and wonderful performance in various natural language processing (NLP) tasks, such as sentence embedding, text generation, …
Query Attention, GloVe, CNN, Attention Flow Layer, Modeling Layer, Output Layer
Machine Comprehension is a daunting task, since it requires cross-encoding and exchanging information between a context paragraph and a given query in order to produce an answer span. In designing…
Hierarchical Attention: What Really Counts in Various NLP Tasks
This paper argues that the lack of hierarchical mechanisms is a bottleneck in improving the performance of attention mechanisms, and proposes a novel Hierarchical Attention Mechanism (Ham) based on a weighted sum of the different layers of a multi-level attention.
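A minimal numpy sketch of the weighted-sum idea summarized above: several attention levels are stacked and the output is a learned mixture over all levels. The stacking scheme and mixing weights here are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Plain scaled dot-product attention used as one level."""
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    return softmax(scores, axis=1) @ V

def hierarchical_attention(Q, K, V, depth, level_logits):
    """Stack `depth` attention levels and return a learned weighted sum
    of every level's output (sketch of a multi-level weighted sum)."""
    outputs = []
    q = Q
    for _ in range(depth):
        q = attention(q, K, V)      # each level re-attends using the previous output as query
        outputs.append(q)
    weights = softmax(level_logits)  # (depth,) mixing weights over levels
    return sum(w * o for w, o in zip(weights, outputs))

rng = np.random.default_rng(2)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
print(hierarchical_attention(Q, K, V, depth=3, level_logits=np.zeros(3)).shape)  # (4, 8)
```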
Fine-Grained Gating Based on Question-Summary for Machine Comprehension
This paper proposes a fine-grained gating mechanism that controls the flow of information from the Context Encoder towards the Interaction Encoder, based on the Question-summary and the input vector at a given time step.
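A short numpy sketch of a fine-grained (per-dimension) gate conditioned on a question summary, in the spirit of the mechanism summarized above; the parameter shapes and the concatenation used to compute the gate are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fine_grained_gate(H_ctx, q_summary, Wg, bg):
    """Element-wise gate on each context encoding, conditioned on a fixed-size
    question summary (sketch).

    H_ctx: (T, d) context-encoder outputs
    q_summary: (d,) question summary vector
    Returns the gated representations passed on to the interaction layer.
    """
    T, d = H_ctx.shape
    Q = np.tile(q_summary, (T, 1))                                # broadcast the summary to every step
    gate = sigmoid(np.concatenate([H_ctx, Q], axis=1) @ Wg + bg)  # (T, d): one gate per dimension
    return gate * H_ctx

rng = np.random.default_rng(3)
T, d = 6, 8
out = fine_grained_gate(rng.normal(size=(T, d)), rng.normal(size=d),
                        rng.normal(size=(2 * d, d)), np.zeros(d))
print(out.shape)  # (6, 8)
```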

References

Showing 1-10 of 39 references
Gated-Attention Readers for Text Comprehension
The model, the Gated-Attention (GA) Reader, integrates a multi-hop architecture with a novel attention mechanism based on multiplicative interactions between the query embedding and the intermediate states of a recurrent neural network document reader, enabling the reader to build query-specific representations of tokens in the document for accurate answer selection.
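A minimal numpy sketch of one gated-attention hop as summarized above: each document token attends over the query and is then multiplied element-wise by its attended query vector. The recurrent layers between hops are omitted.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def gated_attention(D, Q):
    """One gated-attention hop (sketch of the multiplicative gate).

    D: (T, d) document token states from the current reader layer
    Q: (J, d) query token encodings
    Each document state is multiplied element-wise by its query-attended vector,
    so the query modulates the document representation before the next hop.
    """
    alpha = softmax(D @ Q.T, axis=1)  # (T, J) attention of every document token over the query
    q_tilde = alpha @ Q               # (T, d) per-token attended query vectors
    return D * q_tilde                # multiplicative (gated) interaction

rng = np.random.default_rng(4)
D, Q = rng.normal(size=(10, 8)), rng.normal(size=(4, 8))
print(gated_attention(D, Q).shape)  # (10, 8)
```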
Attention-over-Attention Neural Networks for Reading Comprehension
Experimental results show that the proposed attention-over-attention model significantly outperforms various state-of-the-art systems by a large margin on public datasets such as the CNN and Children's Book Test datasets.
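A numpy sketch of the attention-over-attention computation: document-level and query-level attentions are derived from the same matching matrix, and the averaged query-level attention re-weights the document-level attention. The dot-product matching score is a simplification.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_over_attention(D, Q):
    """Attention-over-attention (sketch).

    D: (T, d) document encodings, Q: (J, d) query encodings.
    Returns one attention weight per document position, summing to 1.
    """
    M = D @ Q.T                      # (T, J) pairwise matching scores
    alpha = softmax(M, axis=0)       # document-level attention, one column per query word
    beta = softmax(M, axis=1)        # query-level attention, one row per document word
    beta_avg = beta.mean(axis=0)     # (J,) averaged query-level attention
    return alpha @ beta_avg          # (T,) "attention over attention"

rng = np.random.default_rng(5)
s = attention_over_attention(rng.normal(size=(10, 8)), rng.normal(size=(4, 8)))
print(s.shape, float(s.sum()))  # (10,) and ~1.0
```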
Learning Recurrent Span Representations for Extractive Question Answering
This paper presents a novel model architecture that efficiently builds fixed-length representations of all spans in the evidence document with a recurrent network, and shows that scoring explicit span representations significantly improves performance over approaches that factor the prediction into separate predictions about words or start and end markers.
Text Understanding with the Attention Sum Reader Network
A new, simple model is presented that uses attention to directly pick the answer from the context, as opposed to computing the answer from a blended representation of words in the document as is usual in similar models, making it particularly suitable for question-answering problems where the answer is a single word from the document.
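A small numpy sketch of the pointer-sum scoring summarized above: attention over document positions is aggregated per word type, and the highest-scoring word is picked directly. The dot-product attention and toy inputs are illustrative.

```python
import numpy as np
from collections import defaultdict

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def attention_sum(doc_tokens, D, q):
    """Attention-sum scoring (sketch).

    doc_tokens: list of T token strings
    D: (T, d) document token encodings, q: (d,) query encoding.
    The answer score of a word is the sum of attention over all of its
    occurrences in the document, so no blended representation is used.
    """
    attn = softmax(D @ q)                  # (T,) attention over document positions
    scores = defaultdict(float)
    for tok, a in zip(doc_tokens, attn):
        scores[tok] += a                   # aggregate attention per word type
    return max(scores, key=scores.get), dict(scores)

rng = np.random.default_rng(6)
tokens = ["the", "cat", "sat", "on", "the", "mat"]
answer, _ = attention_sum(tokens, rng.normal(size=(6, 8)), rng.normal(size=8))
print(answer)
```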
Iterative Alternating Neural Attention for Machine Reading
This work proposes a novel neural attention architecture for machine comprehension tasks, such as answering Cloze-style queries about a document, that outperforms state-of-the-art baselines on standard machine comprehension benchmarks such as CNN news articles and the Children's Book Test dataset.
Machine Comprehension Using Match-LSTM and Answer Pointer
This work proposes an end-to-end neural architecture for the Stanford Question Answering Dataset (SQuAD), based on match-LSTM, a model previously proposed for textual entailment, and Pointer Net, a sequence-to-sequence model proposed by Vinyals et al. (2015), to constrain the output tokens to be from the input sequences.
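A numpy sketch of boundary-style answer pointing over query-aware passage representations: two distributions pick a start and an end position, and the highest-probability valid span is returned. This simplification scores the two pointers independently, whereas the actual Pointer Net decoder conditions the end on the start.

```python
import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def pointer_boundary_decode(H, w_start, w_end, max_len=15):
    """Boundary answer pointer (sketch).

    H: (T, d) query-aware passage representations (e.g. match-LSTM outputs)
    w_start, w_end: (d,) scoring vectors for the start and end pointers.
    Picks the span (i, j), i <= j, maximizing p_start[i] * p_end[j].
    """
    p_start = softmax(H @ w_start)
    p_end = softmax(H @ w_end)
    best, best_span = -1.0, (0, 0)
    T = H.shape[0]
    for i in range(T):
        for j in range(i, min(T, i + max_len)):
            p = p_start[i] * p_end[j]
            if p > best:
                best, best_span = p, (i, j)
    return best_span, best

rng = np.random.default_rng(7)
span, prob = pointer_boundary_decode(rng.normal(size=(12, 8)), rng.normal(size=8), rng.normal(size=8))
print(span, prob)
```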
End-to-End Answer Chunk Extraction and Ranking for Reading Comprehension
This paper proposes dynamic chunk reader (DCR), an end-to-end neural reading comprehension (RC) model that is able to extract and rank a set of answer candidates from a given document to answer…
Dynamic Memory Networks for Visual and Textual Question Answering
The new DMN+ model improves the state of the art on both the Visual Question Answering dataset and the bAbI-10k text question-answering dataset without supporting-fact supervision.
Stacked Attention Networks for Image Question Answering
A multiple-layer SAN is developed in which an image is queried multiple times to infer the answer progressively, and visualizations show that the SAN locates the relevant visual clues that lead to the answer layer by layer.
Dynamic Coattention Networks For Question Answering
The Dynamic Coattention Network (DCN) for question answering first fuses co-dependent representations of the question and the document in order to focus on relevant parts of both; a dynamic pointing decoder then iterates over potential answer spans to recover from initial local maxima corresponding to incorrect answers.