Corpus ID: 15280949

A Structured Self-attentive Sentence Embedding

@article{Lin2017ASS,
  title={A Structured Self-attentive Sentence Embedding},
  author={Zhouhan Lin and Minwei Feng and C{\'i}cero Nogueira dos Santos and Mo Yu and Bing Xiang and Bowen Zhou and Yoshua Bengio},
  journal={ArXiv},
  year={2017},
  volume={abs/1703.03130}
}
This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention. Instead of using a vector, we use a 2-D matrix to represent the embedding, with each row of the matrix attending on a different part of the sentence. We also propose a self-attention mechanism and a special regularization term for the model. As a side effect, the embedding comes with an easy way of visualizing what specific parts of the sentence are encoded into the embedding. We…
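In the paper's notation, the BiLSTM hidden states form an n-by-2u matrix H, the annotation matrix is A = softmax(W_s2 tanh(W_s1 H^T)), the matrix embedding is M = AH, and the regularization term is P = ||AA^T − I||_F^2, which pushes the r attention hops toward different parts of the sentence. A minimal numpy sketch of that computation, with random values standing in for trained parameters:

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Dimensions in the paper's notation: n tokens, 2u-dim BiLSTM states,
# d_a hidden units in the attention MLP, r attention hops (rows of A).
n, two_u, d_a, r = 10, 8, 6, 3
rng = np.random.default_rng(0)

H = rng.standard_normal((n, two_u))        # BiLSTM hidden states (stand-in)
W_s1 = rng.standard_normal((d_a, two_u))   # first attention weight matrix
W_s2 = rng.standard_normal((r, d_a))       # second attention weight matrix

# Annotation matrix: each of the r rows attends on a different part of the sentence.
A = softmax(W_s2 @ np.tanh(W_s1 @ H.T), axis=1)   # shape (r, n), rows sum to 1

# The sentence embedding is a 2-D matrix rather than a single vector.
M = A @ H                                          # shape (r, two_u)

# Penalization term encouraging the r hops to focus on different tokens.
P = np.linalg.norm(A @ A.T - np.eye(r), ord='fro') ** 2

Visualizing the rows of A over the input tokens is exactly the "easy way of visualizing what specific parts of the sentence are encoded" that the abstract mentions.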

Citations

An Improved Adaptive and Structured Sentence Embedding

  • Ke Fan, Hong Li, Xinyue Jiang
  • Computer Science
    2019 International Conference on Smart Grid and Electrical Automation (ICSGEA)
  • 2019
A new model for extracting an interpretable sentence embedding by introducing an "Adaptive self-attention", which uses a 2-D matrix to represent the embedding, with each valid row of the matrix representing a part of the sentence.

A Novel Ensemble Representation Framework for Sentiment Classification

A novel end-to-end framework named Ensemble Framework for Text Embedding (EFTE) is proposed, which combines diverse embeddings in a weighted manner and simultaneously represents sentence- and token-level features in a more reasonable way.

Sentiment Analysis with Contextual Embeddings and Self-Attention

This work proposes a simple yet effective method for sentiment analysis using contextual embeddings and a self-attention mechanism, performing comparably to or even outperforming state-of-the-art models.

Importance of Self-Attention for Sentiment Analysis

This paper proposes the Self-Attention Network (SANet), a flexible and interpretable architecture for text classification that highlights the importance of neighboring word interactions in extracting sentiment.

VCWE: Visual Character-Enhanced Word Embeddings

A model to learn Chinese word embeddings via three-level composition: a convolutional neural network extracts intra-character compositionality from the visual shape of a character; a recurrent neural network with self-attention composes character representations into word embeddings; and the Skip-Gram framework captures non-compositionality directly from contextual information.

Contrasting distinct structured views to learn sentence embeddings

A self-supervised method that builds sentence embeddings from the combination of diverse explicit syntactic structures of a sentence, paired with an original contrastive multi-view framework that induces explicit interaction between the models during training.

Unsupervised Summarization by Jointly Extracting Sentences and Keywords

It is shown that salient sentences and keywords can be extracted jointly in a mutual-reinforcement process using the learned representations, and it is proved that this process always converges to a unique solution, leading to improved performance.
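The summary above gives no concrete update rules, so the following is only a generic sketch of a HITS-style mutual-reinforcement iteration of the kind it describes; the function name, the sentence-keyword weight matrix W, and the L2 normalization are assumptions, not the cited paper's method:

import numpy as np

def joint_rank(W, iters=100, tol=1e-9):
    """Hypothetical mutual-reinforcement ranking.

    W[i, j] > 0 when sentence i contains keyword j (e.g. a TF-IDF weight).
    Sentence and keyword salience scores reinforce each other until the
    iteration reaches a fixed point; with L2 normalization this converges
    to the principal singular vectors of W, hence a unique solution for a
    connected sentence-keyword graph, as in HITS-style algorithms.
    """
    s = np.ones(W.shape[0])                 # sentence scores
    k = np.ones(W.shape[1])                 # keyword scores
    for _ in range(iters):
        s_new = W @ k                       # sentences inherit salience from their keywords
        k_new = W.T @ s                     # keywords inherit salience from their sentences
        s_new /= np.linalg.norm(s_new)
        k_new /= np.linalg.norm(k_new)
        done = np.abs(s_new - s).max() < tol and np.abs(k_new - k).max() < tol
        s, k = s_new, k_new
        if done:
            break
    return s, k

# Toy example: 3 sentences, 4 candidate keywords.
W = np.array([[1., 1., 0., 0.],
              [0., 1., 1., 0.],
              [0., 0., 1., 1.]])
sent_scores, kw_scores = joint_rank(W)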

A Hierarchical Structured Self-Attentive Model for Extractive Document Summarization (HSSAS)

This work proposes a novel model that addresses several issues not adequately handled by previous models, such as the memory problem and incorporating knowledge of document structure, and uses a hierarchical structured self-attention mechanism to create the sentence and document embeddings.

Leveraging contextual embeddings and self-attention neural networks with bi-attention for sentiment analysis

This work proposes a simple yet comprehensive method which uses contextual embeddings and a self-attention mechanism to detect and classify sentiment, and shows the superiority of models leveraging contextual embeddings.
...

References

Showing 1-10 of 37 references

Skip-Thought Vectors

We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage.

Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention

A sentence encoding-based model for recognizing textual entailment that uses the sentence's first-stage representation to attend over the words appearing in the sentence itself, a mechanism the paper calls "Inner-Attention".

A Convolutional Neural Network for Modelling Sentences

A convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) is described that is adopted for the semantic modelling of sentences and induces a feature graph over the sentence that is capable of explicitly capturing short and long-range relations.

Discriminative Neural Sentence Modeling by Tree-Based Convolution

This paper proposes a tree-based convolutional neural network (TBCNN) for discriminative sentence modeling that outperforms previous state-of-the-art results, including existing neural networks and dedicated feature/rule engineering.

Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions

A novel machine learning framework based on recursive autoencoders for sentence-level prediction of sentiment label distributions that outperforms other state-of-the-art approaches on commonly used datasets, without using any pre-defined sentiment lexica or polarity shifting rules.

Not All Contexts Are Created Equal: Better Word Representations with Variable Attention

We introduce an extension to the bag-of-words model for learning word representations that take into account both syntactic and semantic properties within language. This is done by employing an attention model that finds, within the contextual words, the words that are relevant for each prediction.
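The mechanism this entry names, replacing CBOW's uniform context average with learned attention weights over the context words, is clear enough to illustrate. A toy numpy sketch under that reading; all parameter names here are hypothetical:

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(1)
vocab, dim, window = 50, 16, 4           # toy sizes

E = rng.standard_normal((vocab, dim))    # context word embeddings
word_bias = rng.standard_normal(vocab)   # per-word attention logit (hypothetical name)
pos_bias = rng.standard_normal(window)   # per-position attention logit (hypothetical name)

context_ids = np.array([3, 17, 8, 42])   # one context window around the target word

# Plain CBOW: uniform average of the context vectors.
cbow = E[context_ids].mean(axis=0)

# Variable attention: logits depend on which word sits at which position,
# so relevant context words get larger weight in the predicted vector.
a = softmax(word_bias[context_ids] + pos_bias)
attended = a @ E[context_ids]            # weighted sum replaces the uniform average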

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

A Sentiment Treebank with fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences is introduced, presenting new challenges for sentiment compositionality, along with the Recursive Neural Tensor Network.

Order-Embeddings of Images and Language

A general method for learning ordered representations is introduced, and it is shown that the resulting representations improve performance over current approaches for hypernym prediction and image-caption retrieval.

Convolutional Neural Networks for Sentence Classification

The CNN models discussed herein improve upon the state of the art on 4 out of 7 tasks, including sentiment analysis and question classification, and a simple modification to the architecture is proposed to allow the use of both task-specific and static vectors.
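The "task-specific and static vectors" phrase refers to the multichannel variant: two copies of the embedding table, one frozen during training ("static") and one fine-tuned ("non-static"), with each filter applied to both channels and the results added before the nonlinearity and max-over-time pooling. A minimal numpy forward-pass sketch under that reading, with random values in place of trained weights:

import numpy as np

rng = np.random.default_rng(2)
n, dim, width, n_filters = 9, 12, 3, 5     # toy sentence length / sizes

# Two channels over the same tokens: one stays frozen during training,
# the other is fine-tuned for the task. They start identical and diverge.
static = rng.standard_normal((n, dim))
nonstatic = static.copy()

filters = rng.standard_normal((n_filters, width, dim))

def feature_maps(channel):
    # One convolution response per filter and window position.
    windows = np.stack([channel[i:i + width] for i in range(n - width + 1)])
    return np.einsum('twd,fwd->tf', windows, filters)   # shape (n - width + 1, n_filters)

# Each filter is applied to both channels and the results are added before
# the nonlinearity; max-over-time pooling keeps one feature per filter.
pre = feature_maps(static) + feature_maps(nonstatic)
features = np.tanh(pre).max(axis=0)        # shape (n_filters,), fed to the classifier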

Dependency-based Convolutional Neural Networks for Sentence Embedding

This work proposes a tree-based convolutional neural network model which exploits various long-distance relationships between words, improves on the sequential baselines on all three sentiment and question classification tasks, and achieves the highest published accuracy on TREC.