Corpus ID: 14163772

Attentive Pooling Networks

@article{Santos2016AttentivePN,
  title={Attentive Pooling Networks},
  author={C{\'i}cero Nogueira dos Santos and Ming Tan and Bing Xiang and Bowen Zhou},
  journal={ArXiv},
  year={2016},
  volume={abs/1602.03609}
}
In this work, we propose Attentive Pooling (AP), a two-way attention mechanism for discriminative model training. In the context of pair-wise ranking or classification with neural networks, AP enables the pooling layer to be aware of the current input pair, in a way that information from the two input items can directly influence the computation of each other's representations. Along with such representations of the paired inputs, AP jointly learns a similarity measure over projected segments…
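The abstract fully determines the shape of the mechanism: a bilinear soft-alignment matrix between the two inputs drives max-pooling-based attention over each one, and a similarity score over the pooled representations is used for ranking. Below is a minimal PyTorch sketch of that two-way pooling, following the paper's description; the class name, the initialization of U, and the batch layout are my assumptions, not the authors' reference code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentivePooling(nn.Module):
    """Two-way attentive pooling over a (question, answer) pair."""

    def __init__(self, hidden_dim: int):
        super().__init__()
        # Learnable bilinear parameter U relating the two inputs' features
        # (assumed initialization; the paper does not fix one here).
        self.U = nn.Parameter(torch.randn(hidden_dim, hidden_dim) * 0.01)

    def forward(self, Q: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        # Q: (batch, hidden_dim, M) question segment features (e.g. CNN outputs)
        # A: (batch, hidden_dim, L) answer segment features
        # Soft alignment: G[b, i, j] scores question position i against answer position j.
        G = torch.tanh(Q.transpose(1, 2) @ self.U @ A)        # (batch, M, L)
        # Max over the counterpart's positions, then softmax, gives each input
        # an attention vector that depends on the other input.
        sigma_q = F.softmax(G.max(dim=2).values, dim=1)       # (batch, M)
        sigma_a = F.softmax(G.max(dim=1).values, dim=1)       # (batch, L)
        # Attention-weighted pooling yields pair-aware representations.
        r_q = (Q @ sigma_q.unsqueeze(2)).squeeze(2)           # (batch, hidden_dim)
        r_a = (A @ sigma_a.unsqueeze(2)).squeeze(2)           # (batch, hidden_dim)
        # Cosine similarity is the matching score used for pair-wise ranking.
        return F.cosine_similarity(r_q, r_a, dim=1)           # (batch,)
```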
Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms
TLDR
The proposed attentive convolution network, ATTCONV, extends the context scope of the convolution operation, deriving higher-level features for a word not only from local context, but also from information extracted from nonlocal context by the attention mechanism commonly used in RNNs.
Multi-Cast Attention Networks
TLDR
This paper proposes Multi-Cast Attention Networks (MCAN), a new attention mechanism and general model architecture for a potpourri of ranking tasks in the conversational modeling and question answering domains and shows that MCAN achieves state-of-the-art performance.
Enhanced Embedding Based Attentive Pooling Network for Answer Selection
TLDR
An enhanced CNN that extends the input embeddings with word-order information is designed, which achieves good performance on the DBQA task of NLPCC 2017 without any other extra features.
Hyperbolic Representation Learning for Fast and Efficient Neural Question Answering
TLDR
This model is a parameter-efficient neural network that outperforms parameter-intensive models such as Attentive Pooling BiLSTMs and Multi-Perspective CNNs on multiple QA benchmarks. It requires no feature engineering, no similarity-matrix matching, no complicated attention mechanisms, and no over-parameterized layers, yet it outperforms or remains competitive with many models that have these components on multiple benchmarks.
Enhanced attentive convolutional neural networks for sentence pair modeling
TLDR
This paper proposes Enhanced Attentive Convolutional Neural Networks (EACNNs) for modeling sentence pairs, making full use of the characteristics of convolution, and exploits two attention schemes, attention before representation and attention after representation, to capture the interaction information of sentence pairs.
Double Attention Mechanism for Sentence Embedding
TLDR
Experimental results show that the proposed model yields a significant performance gain over other sentence embedding methods on all three datasets, and the model can be trained end-to-end with a limited number of hyper-parameters.
Attention-based Convolutional Neural Network for Answer Selection using BERT
TLDR
This work employs BERT, a state-of-the-art pre-trained contextual language model, as the embedding layer and enhances the model by adding further attentive features, showing that the resulting model is superior to many other answer-selection models.
…the interaction of each segment in question with all segments in answer…
Answer selection plays a key role in community question answering (CQA). Previous research on answer selection usually ignores the problems of redundancy and noise prevalent in CQA. In this paper, we…
Gated Group Self-Attention for Answer Selection
TLDR
Experimental results on two popular QA datasets show that GGSA can outperform existing answer selection models to achieve state-of-the-art performance and can also achieve higher accuracy than global self-attention for the answer selection task, with a lower computation cost.
Multihop Attention Networks for Question Answer Matching
TLDR
This paper proposes Multihop Attention Networks (MAN), which use multiple vectors, each focusing on a different part of the question, for its overall semantic representation, and apply multiple steps of attention to learn representations for the candidate answers.
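The summary above is enough to sketch the general multi-step pattern: each hop uses a different question view as the query for attention over the answer. The following is a generic, hedged reconstruction of that idea, not the exact MAN architecture; the per-hop projections and the mean aggregation at the end are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultihopAttention(nn.Module):
    """Generic multi-step attention: one question view per hop queries the answer."""

    def __init__(self, dim: int, hops: int = 2):
        super().__init__()
        self.hops = hops
        # One projection per hop so each step can focus on a different aspect
        # (assumed parameterization).
        self.proj = nn.ModuleList([nn.Linear(dim, dim) for _ in range(hops)])

    def forward(self, q_views: torch.Tensor, a_states: torch.Tensor) -> torch.Tensor:
        # q_views: (batch, hops, dim) question vectors, one per attention step
        # a_states: (batch, L, dim) candidate-answer token states
        hop_reps = []
        for t in range(self.hops):
            query = self.proj[t](q_views[:, t])                    # (batch, dim)
            scores = torch.einsum('bd,bld->bl', query, a_states)   # (batch, L)
            alpha = F.softmax(scores, dim=1)                       # attention over answer tokens
            hop_reps.append(torch.einsum('bl,bld->bd', alpha, a_states))
        # Aggregate the hop-wise answer representations (mean is one simple choice).
        return torch.stack(hop_reps, dim=1).mean(dim=1)            # (batch, dim)
```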

References

Showing 1-10 of 32 references
ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs
TLDR
This work presents a general Attention-Based Convolutional Neural Network (ABCNN) for modeling a pair of sentences and proposes three attention schemes that integrate mutual influence between sentences into CNNs, so that the representation of each sentence takes its counterpart into consideration. (A minimal sketch of this mutual-influence idea appears after this reference list.)
Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks
TLDR
This paper presents a convolutional neural network architecture for reranking pairs of short texts, which learns the optimal representation of text pairs and a similarity function to relate them, in a supervised way, from the available training data.
Sequence to Sequence Learning with Neural Networks
TLDR
This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions on the sequence structure, and finds that reversing the order of the words in all source sentences improved the LSTM's performance markedly, because doing so introduced many short-term dependencies between the source and the target sentence that made the optimization problem easier.
Convolutional Neural Network Architectures for Matching Natural Language Sentences
TLDR
Convolutional neural network models for matching two sentences are proposed by adapting the convolutional strategy from vision and speech; they nicely represent the hierarchical structures of sentences through layer-by-layer composition and pooling.
LSTM-based Deep Learning Models for non-factoid answer selection
TLDR
A general deep learning framework, which does not depend on manually defined features or linguistic tools, is applied to the answer selection task and extended in two directions to define a more composite representation for questions and answers.
Speech recognition with deep recurrent neural networks
TLDR
This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long range context that empowers RNNs.
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
TLDR
An attention-based model that automatically learns to describe the content of images is introduced; it can be trained deterministically using standard backpropagation techniques and stochastically by maximizing a variational lower bound.
Reasoning about Entailment with Neural Attention
TLDR
This paper proposes a neural model that reads two sentences to determine entailment using long short-term memory units and extends this model with a word-by-word neural attention mechanism that encourages reasoning over entailments of pairs of words and phrases, and presents a qualitative analysis of attention weights produced by this model.
A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval
TLDR
A new latent semantic model that incorporates a convolutional-pooling structure over word sequences to learn low-dimensional, semantic vector representations for search queries and Web documents is proposed.
Deep Learning for Answer Sentence Selection
TLDR
This work proposes a novel approach to solving the answer sentence selection task by means of distributed representations, and learns to match questions with answers by considering their semantic encoding.
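As noted under the ABCNN entry above, here is a minimal sketch of the mutual-influence idea: an attention matrix compares every position of one sentence's feature map with every position of the other's, and each sentence is re-weighted by how strongly its counterpart attends to it. The 1/(1 + euclidean) match score follows the ABCNN paper; everything else is a simplified reconstruction, not the full three-scheme model.

```python
import torch

def mutual_attention_maps(f0: torch.Tensor, f1: torch.Tensor):
    """Re-weight two sentence feature maps by cross-sentence attention.

    f0: (d, m) feature map of sentence 0; f1: (d, n) feature map of sentence 1.
    """
    # Attention matrix: A[i, j] = 1 / (1 + ||f0[:, i] - f1[:, j]||).
    A = 1.0 / (1.0 + torch.cdist(f0.T.unsqueeze(0), f1.T.unsqueeze(0)).squeeze(0))  # (m, n)
    # Each position's pooling weight is the total attention its counterpart pays it.
    w0 = A.sum(dim=1)   # (m,) weights for sentence 0's positions
    w1 = A.sum(dim=0)   # (n,) weights for sentence 1's positions
    # Columns of each feature map are scaled by these counterpart-aware weights.
    return f0 * w0, f1 * w1
```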