ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs

@article{Yin2016ABCNNAC,
  title={ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs},
  author={Wenpeng Yin and Hinrich Sch{\"u}tze and Bing Xiang and Bowen Zhou},
  journal={Transactions of the Association for Computational Linguistics},
  year={2016},
  volume={4},
  pages={259--272}
}
How to model a pair of sentences is a critical issue in many NLP tasks such as answer selection (AS), paraphrase identification (PI) and textual entailment (TE). Most prior work (i) deals with one individual task by fine-tuning a specific system; (ii) models each sentence’s representation separately, rarely considering the impact of the other sentence; or (iii) relies fully on manually designed, task-specific linguistic features. This work presents a general Attention Based Convolutional Neural…
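The attention described in the abstract couples the two sentences before (or after) convolution via a pairwise attention matrix between their word representations. Below is a minimal NumPy sketch of such a matrix, not the authors' code; it assumes the 1/(1 + Euclidean distance) match-score, one of the choices used in ABCNN-style models.

```python
import numpy as np

def attention_matrix(s0, s1):
    """Pairwise attention between two sentence feature maps.

    s0 has shape (d, n0) and s1 has shape (d, n1): each column is a
    d-dimensional word representation. Entry A[i, j] scores how well
    word i of sentence 0 matches word j of sentence 1, using the
    match-score 1 / (1 + ||x - y||) (assumed here; other similarity
    functions, e.g. cosine, are drop-in replacements).
    """
    n0, n1 = s0.shape[1], s1.shape[1]
    A = np.zeros((n0, n1))
    for i in range(n0):
        for j in range(n1):
            A[i, j] = 1.0 / (1.0 + np.linalg.norm(s0[:, i] - s1[:, j]))
    return A
```

Row and column sums of such a matrix can then serve as attention feature maps fed into the convolution, or as pooling weights after it, which is the "before/after representation" distinction several of the follow-up papers listed below build on.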
Enhanced attentive convolutional neural networks for sentence pair modeling
This paper proposes Enhanced Attentive Convolutional Neural Networks (EACNNs) for modeling sentence pairs to make full use of the characteristics of convolution, exploiting two attention schemes, attention before representation and attention after representation, to capture the interaction information of sentence pairs.

Character-Based Neural Networks for Sentence Pair Modeling
The experiments show that subword models without any pretrained word embeddings can achieve new state-of-the-art results on two social media datasets and competitive results on news data for paraphrase identification.

The Importance of Subword Embeddings in Sentence Pair Modeling
The experiments show that subword models without any pretrained word embeddings can achieve new state-of-the-art results on two social media datasets and competitive results on news data for paraphrase identification.

Sentence Relations for Extractive Summarization with Deep Neural Networks
A deep neural network model, Sentence Relation-based Summarization (SRSum), consisting of five sub-models (PriorSum, CSRSum, TSRSum, QSRSum, and SFSum), that automatically learns useful latent features by jointly learning representations of query sentences, content sentences, and title sentences as well as their relations.

Multi-Grained-Attention Gated Convolutional Neural Networks for Sentence Classification
A new CNN model named MGAGCNN, which generates attention weights from context windows of different sizes based on the Multi-Grained-Attention (MGA) gating mechanism, together with an activation function named Natural Logarithm rescaled Rectified Linear Unit (NLReLU), which achieves performance on MGAGCNN comparable to other well-known activation functions.

An Experimental Analysis of Multi-Perspective Convolutional Neural Networks
This thesis finds that two key components of MP-CNN are non-essential to achieving competitive accuracy and make the model less robust to changes in hyperparameters, and it suggests simple changes to the architecture that improve the model's accuracy.

Task-Specific Attentive Pooling of Phrase Alignments Contributes to Sentence Matching
An architecture based on Gated Recurrent Units is proposed that supports representation learning of phrases of arbitrary granularity and task-specific attentive pooling of phrase alignments between two sentences.

Leveraging Contextual Sentence Relations for Extractive Summarization Using a Neural Attention Model
A neural network model, Contextual Relation-based Summarization (CRSum), is proposed to take advantage of contextual relations among sentences so as to improve the performance of sentence regression; it achieves performance comparable to state-of-the-art approaches.

An Attention-Gated Convolutional Neural Network for Sentence Classification
An Attention-Gated Convolutional Neural Network (AGCNN) for sentence classification is proposed, which generates attention weights from a feature's context windows of different sizes by using specialized convolution encoders.

Attention-based Convolutional Neural Network for Answer Selection using BERT
This work employs BERT, a state-of-the-art pre-trained contextual language model, as the embedding layer and enhances the model by adding more attentive features, showing that the model is superior to many other answer-selection models.

References

Showing 1-10 of 67 references
LSTM-based Deep Learning Models for Non-factoid Answer Selection
A general deep learning framework is applied to the answer selection task which does not depend on manually defined features or linguistic tools, and it is extended in two directions to define a more composite representation for questions and answers.

A Convolutional Neural Network for Modelling Sentences
A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described that is adopted for the semantic modelling of sentences and induces a feature graph over the sentence capable of explicitly capturing short- and long-range relations.

Deep Learning for Answer Sentence Selection
This work proposes a novel approach to solving the answer sentence selection task by means of distributed representations, learning to match questions with answers by considering their semantic encoding.

A Neural Attention Model for Abstractive Sentence Summarization
This work proposes a fully data-driven approach to abstractive sentence summarization, utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence.

A Deep Architecture for Semantic Matching with Multiple Positional Sentence Representations
This work presents a new deep architecture that matches two sentences with multiple positional sentence representations generated by a bidirectional long short-term memory (Bi-LSTM).

ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering
The proposed ABC-CNN architecture for the visual question answering (VQA) task achieves significant improvements over state-of-the-art methods on three benchmark VQA datasets, and its attention maps are shown to reflect the regions most relevant to the questions.

Reasoning about Entailment with Neural Attention
This paper proposes a neural model that reads two sentences to determine entailment using long short-term memory units, extends the model with a word-by-word neural attention mechanism that encourages reasoning over entailments of pairs of words and phrases, and presents a qualitative analysis of the attention weights produced by the model.

Convolutional Neural Network for Paraphrase Identification
A new deep learning architecture, Bi-CNN-MI, for paraphrase identification is presented, based on the insight that PI requires comparing two sentences on multiple levels of granularity, using a convolutional neural network and modeling interaction features at each level.

Effective Approaches to Attention-based Neural Machine Translation
A global approach which always attends to all source words and a local one that only looks at a subset of source words at a time are examined, demonstrating the effectiveness of both approaches on the WMT translation tasks between English and German in both directions.

Multi-Perspective Sentence Similarity Modeling with Convolutional Neural Networks
This work proposes a model for comparing sentences from multiple perspectives: it first models each sentence using a convolutional neural network that extracts features at multiple levels of granularity, and it uses multiple types of pooling.