Corpus ID: 15280949

A Structured Self-attentive Sentence Embedding

@article{Lin2017ASS,
  title={A Structured Self-attentive Sentence Embedding},
  author={Zhouhan Lin and Minwei Feng and C{\'i}cero Nogueira dos Santos and Mo Yu and Bing Xiang and Bowen Zhou and Yoshua Bengio},
  journal={ArXiv},
  year={2017},
  volume={abs/1703.03130}
}
This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. [...] Instead of using a vector, we use a 2-D matrix to represent the embedding, with each row of the matrix attending on a different part of the sentence. We also propose a self-attention mechanism and a special regularization term for the model. As a side effect, the embedding comes with an easy way of visualizing what specific parts of the sentence are encoded into the embedding. We…
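The mechanism itself is fully specified in the paper: an annotation matrix A = softmax(W_s2 tanh(W_s1 H^T)) gives r attention hops over the n bidirectional-LSTM hidden states H, the matrix embedding is M = AH, and the penalty ||AA^T - I||_F^2 discourages the hops from collapsing onto the same words. A minimal PyTorch sketch follows (shapes and names are illustrative, not the authors' released code):

import torch
import torch.nn as nn
import torch.nn.functional as F

class StructuredSelfAttention(nn.Module):
    """A = softmax(W_s2 tanh(W_s1 H^T)); M = A H, the 2-D matrix embedding."""
    def __init__(self, hidden_dim, d_a, r):
        super().__init__()
        self.W_s1 = nn.Linear(hidden_dim, d_a, bias=False)  # maps 2u-dim states to d_a
        self.W_s2 = nn.Linear(d_a, r, bias=False)           # r = number of attention hops

    def forward(self, H):
        # H: (batch, n, hidden_dim) hidden states of a bidirectional LSTM
        A = F.softmax(self.W_s2(torch.tanh(self.W_s1(H))), dim=1)  # (batch, n, r)
        A = A.transpose(1, 2)                                      # (batch, r, n)
        M = A @ H                                                  # (batch, r, hidden_dim)
        return M, A

def attention_penalty(A):
    # ||A A^T - I||_F^2, pushing the r hops to attend to different parts of the sentence
    I = torch.eye(A.size(1), device=A.device)
    return ((A @ A.transpose(1, 2) - I) ** 2).sum(dim=(1, 2)).mean()

In use, the penalty is simply added to the downstream task loss with a small coefficient, which is how the paper trades off diversity of the hops against task accuracy.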
An Improved Adaptive and Structured Sentence Embedding
  • Ke Fan, H. Li, Xinyue Jiang
  • Computer Science
  • 2019 International Conference on Smart Grid and Electrical Automation (ICSGEA)
  • 2019
TLDR
A new model for extracting an interpretable sentence embedding by introducing an "Adaptive self-attention", which uses a 2-D matrix to represent the embedding, with each valid row of the matrix representing a part of the sentence.
An Ordered Semantic Embedding for Sentence
In this paper, we propose a method to extract a sentence embedding which reflects different aspects of semantics in a weak order. Most existing methods typically attempt to capture the whole…
Enhancing Sentence Embedding with Generalized Pooling
TLDR
A vector-based multi-head attention that includes the widely used max pooling, mean pooling, and scalar self-attention as special cases is proposed, achieving significant improvement over strong sentence-encoding-based methods.
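A hedged sketch of what "vector-based" means here, under the assumption that each embedding dimension gets its own distribution over positions (names and shapes are illustrative, not the authors' code): uniform weights recover mean pooling and per-dimension one-hot weights recover max pooling.

import torch
import torch.nn.functional as F

def vector_based_pool(H, W1, W2):
    # H: (n, d) hidden states; W1: (d, d_h); W2: (d_h, d)
    # One attention weight per position AND per dimension, normalized over positions.
    A = F.softmax(torch.tanh(H @ W1) @ W2, dim=0)  # (n, d)
    return (A * H).sum(dim=0)                      # (d,) pooled sentence vector

# Special cases recovered by particular A:
#   mean pooling:          A[i, j] = 1 / n for all i, j
#   max pooling:           A[i, j] = 1 where H[i, j] is the column maximum, else 0
#   scalar self-attention: every column of A is the same distribution over positions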
Enhancing sentence embedding with dynamic interaction
TLDR
A new dynamic interaction method for improving the final sentence representation that aims to make the states of the last layer more conducive to the next classification layer by introducing constraints from the states of the previous layers.
A Novel Ensemble Representation Framework for Sentiment Classification
  • M. Sun, I. Hameed, Hao Wang
  • Computer Science
  • 2020 International Joint Conference on Neural Networks (IJCNN)
  • 2020
TLDR
A novel end-to-end framework named Ensemble Framework for Text Embedding (EFTE) is proposed, which combines diverse embeddings in a weighted fashion and simultaneously represents sentences' and tokens' features in a more reasonable way.
Importance of Self-Attention for Sentiment Analysis
TLDR
This paper proposes the Self-Attention Network (SANet), a flexible and interpretable architecture for text classification that highlights the importance of neighboring word interactions for extracting sentiment.
VCWE: Visual Character-Enhanced Word Embeddings
TLDR
A model to learn Chinese word embeddings via three-level composition: a convolutional neural network to extract the intra-character compositionality from the visual shape of a character; a recurrent neural network with self-attention to compose character representations into word embeddings; and the Skip-Gram framework to capture non-compositionality directly from the contextual information.
A Window-Based Self-Attention approach for sentence encoding
TLDR
This work proposes a window-based intra-weighing approach to weigh words in the sentence; it has fewer parameters and much lower computational complexity than state-of-the-art models while achieving comparable results.
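Purely as an illustration of "intra-weighing from a local window" (the paper's exact scoring function is not shown here, so every name below is an assumption): each word is scored from its w-neighbourhood and the sentence vector is the resulting weighted sum.

import torch
import torch.nn.functional as F

def window_weighted_encoding(H, v, w=2):
    # H: (n, d) word representations; v: (d,) hypothetical scoring vector; w: half-window
    n, _ = H.shape
    scores = torch.stack([
        torch.tanh(H[max(0, i - w): i + w + 1].mean(dim=0) @ v)  # score word i from its window
        for i in range(n)
    ])
    weights = F.softmax(scores, dim=0)  # normalize per-word weights over the sentence
    return weights @ H                  # (d,) sentence vector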
Contrasting distinct structured views to learn sentence embeddings
TLDR
A self-supervised method that builds sentence embeddings from the combination of diverse explicit syntactic structures of a sentence, along with an original contrastive multi-view framework that induces explicit interaction between models during the training phase.
Unsupervised Summarization by Jointly Extracting Sentences and Keywords
TLDR
It is shown that salient sentences and keywords can be extracted in a joint, mutually reinforcing process using the learned representations, and it is proved that this process always converges to a unique solution, leading to improved performance.

References

Showing 1-10 of 43 references
Skip-Thought Vectors
We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the…
Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval
  • H. Palangi, L. Deng, +5 authors R. Ward
  • Computer Science
  • IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • 2016
TLDR
A model that addresses sentence embedding, a hot topic in current natural language processing research, is developed using recurrent neural networks (RNNs) with Long Short-Term Memory (LSTM) cells and is shown to significantly outperform several existing state-of-the-art methods.
Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention
TLDR
A sentence-encoding-based model for recognizing textual entailment that uses the sentence's first-stage representation to attend over words appearing in the sentence itself, which is called "Inner-Attention" in this paper.
A Convolutional Neural Network for Modelling Sentences
TLDR
A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described that is adopted for the semantic modelling of sentences and induces a feature graph over the sentence that is capable of explicitly capturing short- and long-range relations.
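The DCNN's distinctive operation is dynamic k-max pooling: keep the k largest activations in each feature dimension, in their original order, with k chosen per layer from the sentence length. A minimal sketch (the function name is ours):

import torch

def kmax_pooling(x, k, dim=-1):
    # Keep the k largest values along `dim`, preserving their original order.
    idx = x.topk(k, dim=dim).indices.sort(dim=dim).values
    return x.gather(dim, idx)

# In the paper, k at layer l of L total layers for a length-s sentence is
# k_l = max(k_top, ceil((L - l) / L * s)), with k_top fixed at the top layer.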
Discriminative Neural Sentence Modeling by Tree-Based Convolution
TLDR
This paper proposes a tree-based convolutional neural network (TBCNN) for discriminative sentence modeling that outperforms previous state-of-the-art results, including existing neural networks and dedicated feature/rule engineering.
Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
TLDR
A novel machine learning framework based on recursive autoencoders for sentence-level prediction of sentiment label distributions that outperforms other state-of-the-art approaches on commonly used datasets, without using any predefined sentiment lexica or polarity shifting rules.
Not All Contexts Are Created Equal: Better Word Representations with Variable Attention
We introduce an extension to the bag-of-words model for learning word representations that take into account both syntactic and semantic properties within language. This is done by employing an…
Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
TLDR
A Sentiment Treebank that includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences is presented, posing new challenges for sentiment compositionality, and the Recursive Neural Tensor Network is introduced to address them.
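For reference, the RNTN composes child vectors b, c \in \mathbb{R}^d into a parent through a tensor layer (as defined in the paper, with f = \tanh):

p = f\left( \begin{bmatrix} b \\ c \end{bmatrix}^{\top} V^{[1:d]} \begin{bmatrix} b \\ c \end{bmatrix} + W \begin{bmatrix} b \\ c \end{bmatrix} \right), \qquad V^{[1:d]} \in \mathbb{R}^{2d \times 2d \times d}, \; W \in \mathbb{R}^{d \times 2d},

where each slice V^{[k]} contributes one coordinate of the parent, so the tensor lets the two children interact multiplicatively rather than only additively.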
GloVe: Global Vectors for Word Representation
TLDR
A new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
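The model's objective (as given in the paper) is a weighted least-squares fit to the log co-occurrence counts X_{ij}:

J = \sum_{i,j=1}^{V} f(X_{ij}) \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2,

where f down-weights rare and very frequent co-occurrences, e.g. f(x) = (x / x_{\max})^{\alpha} for x < x_{\max} and 1 otherwise.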
Order-Embeddings of Images and Language
TLDR
A general method for learning ordered representations is introduced, and it is shown that the resulting representations improve performance over current approaches for hypernym prediction and image-caption retrieval.
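To our reading of the paper, the key ingredient is an order-violation penalty: with the reversed product order on \mathbb{R}^N_{+} (x \preceq y iff x_i \ge y_i for all i), a pair that should satisfy x \preceq y is scored by

E(x, y) = \left\| \max(0, \, y - x) \right\|^2,

which is zero exactly when the order holds and grows with the size of the violation.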