Corpus ID: 14163772

Attentive Pooling Networks

@article{Santos2016AttentivePN,
  title={Attentive Pooling Networks},
  author={C{\'i}cero Nogueira dos Santos and Ming Tan and Bing Xiang and Bowen Zhou},
  journal={ArXiv},
  year={2016},
  volume={abs/1602.03609}
}
In this work, we propose Attentive Pooling (AP), a two-way attention mechanism for discriminative model training. In the context of pair-wise ranking or classification with neural networks, AP enables the pooling layer to be aware of the current input pair, in a way that information from the two input items can directly influence the computation of each other's representations. Along with such representations of the paired inputs, AP jointly learns a similarity measure over projected segments… 
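The mechanism lends itself to a compact sketch. Below is a minimal NumPy illustration of the two-way attention described above, assuming a segment matrix Q (d x M) for the question, A (d x L) for the answer, and a learned bilinear matrix U; the function names and the cosine scoring are assumptions for exposition, not the paper's released code.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attentive_pooling(Q, A, U):
    # Q: (d, M) question segments, A: (d, L) answer segments, U: (d, d) learned.
    G = np.tanh(Q.T @ U @ A)          # (M, L) soft alignment between segments
    sigma_q = softmax(G.max(axis=1))  # importance of each question segment
    sigma_a = softmax(G.max(axis=0))  # importance of each answer segment
    r_q = Q @ sigma_q                 # attention-weighted question vector (d,)
    r_a = A @ sigma_a                 # attention-weighted answer vector (d,)
    score = r_q @ r_a / (np.linalg.norm(r_q) * np.linalg.norm(r_a))
    return r_q, r_a, score

Row-wise and column-wise max pooling over G is what makes the pooling "attentive": each item's importance weights depend directly on the other item's segments.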

Citations

Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms

The proposed attentive convolution network, ATTCONV, extends the context scope of the convolution operation, deriving higher-level features for a word not only from local context, but also from information extracted from nonlocal context by the attention mechanism commonly used in RNNs.

Multi-Cast Attention Networks

This paper proposes Multi-Cast Attention Networks (MCAN), a new attention mechanism and general model architecture for a potpourri of ranking tasks in the conversational modeling and question answering domains and shows that MCAN achieves state-of-the-art performance.

Enhanced Embedding Based Attentive Pooling Network for Answer Selection

An enhanced CNN that incorporates extended order information into the input embedding is designed, and it achieves good performance on the DBQA task of NLPCC 2017 without any other extra features.

Hyperbolic Representation Learning for Fast and Efficient Neural Question Answering

This model is a parameter-efficient neural network that outperforms parameter-intensive models such as Attentive Pooling BiLSTMs and Multi-Perspective CNNs on multiple QA benchmarks; it requires no feature engineering, no similarity matrix matching, no complicated attention mechanisms, and no over-parameterized layers, yet outperforms or remains competitive with many models that have these components.

Double Attention Mechanism for Sentence Embedding

Experimental results show that the proposed model yields a significant performance gain over other sentence embedding methods on all three datasets, and the model can be trained end-to-end with few hyper-parameters.

Attention, please! A survey of neural attention models in deep learning

This survey systematically reviews hundreds of architectures in the area, identifying and discussing those in which attention has shown a significant impact, and describes the primary uses of attention in convolutional networks, recurrent networks, and generative models.

Attention-based Convolutional Neural Network for Answer Selection using BERT

This work employs BERT, a state-of-the-art pre-trained contextual language model, as the embedding layer and enhances the model by adding further attentive features, showing that the resulting model is superior to many other answer-selection models.

Combination of Multiple Embeddings for Speaker Retrieval

An extension of the ECAPA-TDNN architecture that combines multiple embeddings from different layers is proposed; it obtains a relative improvement over the previous state-of-the-art model and can better cluster utterances from the same speaker.

Attentive Interactive Neural Networks for Answer Selection

This paper proposes to treat different text segments differently and designs a novel attentive interactive neural network (AI-NN) that focuses on the text segments useful for answer selection by modeling the interaction of each segment in the question with all segments in the answer; experiments demonstrate that AI-NN outperforms state-of-the-art methods.
...

References

Showing 1-10 of 26 references

ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs

This work presents a general Attention Based Convolutional Neural Network (ABCNN) for modeling a pair of sentences and proposes three attention schemes that integrate mutual influence between sentences into CNNs; thus, the representation of each sentence takes into consideration its counterpart.
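As a rough illustration of the mutual-influence idea, here is a hedged NumPy sketch in the style of ABCNN-1's input attention, assuming the 1/(1 + Euclidean distance) match score and per-sentence projection matrices; shapes and names are illustrative rather than the paper's reference implementation.

import numpy as np

def abcnn1_attention(F0, F1, W0, W1):
    # F0: (d, s0), F1: (d, s1) feature maps whose columns are word vectors.
    # W0: (d, s1), W1: (d, s0) learned projections producing attention maps.
    s0, s1 = F0.shape[1], F1.shape[1]
    A = np.empty((s0, s1))
    for i in range(s0):
        for j in range(s1):
            # match score: 1 / (1 + Euclidean distance) between word vectors
            A[i, j] = 1.0 / (1.0 + np.linalg.norm(F0[:, i] - F1[:, j]))
    F0_att = W0 @ A.T  # (d, s0) attention feature map paired with F0
    F1_att = W1 @ A    # (d, s1) attention feature map paired with F1
    return A, F0_att, F1_att

The attention feature maps are fed to the convolution as extra channels, so each sentence's representation is conditioned on its counterpart.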

Learning to Rank Short Text Pairs with Convolutional Deep Neural Networks

This paper presents a convolutional neural network architecture for reranking pairs of short texts, in which the optimal representation of the text pairs and a similarity function relating them are learned in a supervised way from the available training data.

Sequence to Sequence Learning with Neural Networks

This paper presents a general end-to-end approach to sequence learning that makes minimal assumptions about sequence structure, and finds that reversing the order of the words in all source sentences markedly improved the LSTM's performance, because doing so introduced many short-term dependencies between the source and the target sentence that made the optimization problem easier.

Convolutional Neural Network Architectures for Matching Natural Language Sentences

Convolutional neural network models for matching two sentences are proposed by adapting the convolutional strategies used in vision and speech; the models nicely represent the hierarchical structure of sentences through their layer-by-layer composition and pooling.

LSTM-based Deep Learning Models for non-factoid answer selection

A general deep learning framework that does not depend on manually defined features or linguistic tools is applied to the answer selection task, and it is extended in two directions to define a more composite representation for questions and answers.

Speech recognition with deep recurrent neural networks

This paper investigates deep recurrent neural networks, which combine the multiple levels of representation that have proved so effective in deep networks with the flexible use of long-range context that empowers RNNs.

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

An attention based model that automatically learns to describe the content of images is introduced that can be trained in a deterministic manner using standard backpropagation techniques and stochastically by maximizing a variational lower bound.
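The deterministic variant reduces to a small computation per decoding step; the sketch below assumes an additive (MLP) scoring function over CNN annotation vectors, with parameter names chosen for illustration.

import numpy as np

def soft_attention_step(features, h_prev, Wa, Wh, v):
    # features: (L, D) annotation vectors from L image locations.
    # h_prev: (H,) previous decoder state; Wa: (K, D), Wh: (K, H), v: (K,).
    scores = np.tanh(features @ Wa.T + h_prev @ Wh.T) @ v  # (L,) relevance
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                                   # softmax over locations
    z = alpha @ features                                   # expected context (D,)
    return alpha, z

Because z is a differentiable weighted average, the whole model can be trained with standard backpropagation, which is what makes the "deterministic manner" above possible.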

Reasoning about Entailment with Neural Attention

This paper proposes a neural model that reads two sentences to determine entailment using long short-term memory units and extends this model with a word-by-word neural attention mechanism that encourages reasoning over entailments of pairs of words and phrases, and presents a qualitative analysis of attention weights produced by this model.
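One step of the word-by-word variant can be sketched as follows, assuming the additive formulation with a recurrently updated attention representation; the NumPy shapes and parameter names are assumptions for illustration.

import numpy as np

def word_by_word_attention_step(Y, h_t, r_prev, Wy, Wh, Wr, Wt, w):
    # Y: (k, L) premise LSTM outputs; h_t: (k,) hypothesis state at step t;
    # r_prev: (k,) previous attention representation; Wy..Wt: (k, k); w: (k,).
    M = np.tanh(Wy @ Y + (Wh @ h_t + Wr @ r_prev)[:, None])  # (k, L)
    alpha = np.exp(w @ M)
    alpha /= alpha.sum()                    # attention over premise words
    r_t = Y @ alpha + np.tanh(Wt @ r_prev)  # recurrent attention representation
    return alpha, r_t

Carrying r_prev forward is what lets the model accumulate evidence over entailments of word and phrase pairs rather than attending to each word independently.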

Deep Learning for Answer Sentence Selection

This work proposes a novel approach to solving the answer sentence selection task by means of distributed representations, learning to match questions with answers by considering their semantic encoding.

FAQ-based Question Answering via Word Alignment

A novel word-alignment-based method is proposed to solve the FAQ-based question answering task, in which the word alignment between two questions is used to extract features, together with a bootstrap-based feature extraction method that extracts a small set of effective lexical features.