Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
@article{Conneau2017SupervisedLO,
  title   = {Supervised Learning of Universal Sentence Representations from Natural Language Inference Data},
  author  = {Alexis Conneau and Douwe Kiela and Holger Schwenk and Lo{\"i}c Barrault and Antoine Bordes},
  journal = {ArXiv},
  year    = {2017},
  volume  = {abs/1705.02364}
}
Many modern NLP systems rely on word embeddings, previously trained in an unsupervised manner on large corpora, as base features. Efforts to obtain embeddings for larger chunks of text, such as sentences, have however been far less successful: several attempts at learning unsupervised sentence representations have not reached performance satisfactory enough to be widely adopted. In this paper, we show how universal sentence representations trained using the supervised data of the Stanford…
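The truncated abstract does not name the architecture; the paper's best-performing encoder is a bidirectional LSTM with max pooling over time, trained on SNLI with a classifier over the pair features [u, v, |u−v|, u∗v]. Below is a minimal PyTorch sketch of that setup; layer sizes and the MLP are illustrative, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class BiLSTMMaxEncoder(nn.Module):
    """Bidirectional LSTM sentence encoder with max pooling over time."""
    def __init__(self, vocab_size=100_000, embed_dim=300, hidden_dim=2048):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            bidirectional=True, batch_first=True)

    def forward(self, token_ids):                # (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))  # (batch, seq_len, 2*hidden)
        return h.max(dim=1).values               # element-wise max over time

class NLIClassifier(nn.Module):
    """Classifies a premise/hypothesis pair from [u, v, |u-v|, u*v]."""
    def __init__(self, encoder, sent_dim=4096, n_classes=3):
        super().__init__()
        self.encoder = encoder
        self.mlp = nn.Sequential(nn.Linear(4 * sent_dim, 512), nn.Tanh(),
                                 nn.Linear(512, n_classes))

    def forward(self, premise_ids, hypothesis_ids):
        u, v = self.encoder(premise_ids), self.encoder(hypothesis_ids)
        return self.mlp(torch.cat([u, v, (u - v).abs(), u * v], dim=1))
```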
1,537 Citations
Supervised Contextual Embeddings for Transfer Learning in Natural Language Processing Tasks
- Computer Science · ArXiv
- 2019
This work focuses on extracting representations from multiple pre-trained supervised models, enriching word embeddings with task- and domain-specific knowledge.
InferLite: Simple Universal Sentence Representations from Natural Language Inference Data
- Computer Science · EMNLP
- 2018
A lightweight version of InferSent, called InferLite, is proposed that uses no recurrent layers and operates on a collection of pre-trained word embeddings; a semantic hashing layer is also described that allows the model to learn generic binary codes for sentences.
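As a rough, hedged sketch of that idea (pool pre-trained embeddings with no recurrence, then a sigmoid layer whose outputs can be thresholded into binary codes); the class and layer choices below are assumptions for illustration, not InferLite's actual architecture:

```python
import torch
import torch.nn as nn

class RecurrenceFreeHashEncoder(nn.Module):
    """Illustrative only: max-pool frozen pre-trained word embeddings,
    then a sigmoid 'semantic hashing' layer thresholded into bits."""
    def __init__(self, pretrained_vectors, code_bits=256):
        super().__init__()
        self.embed = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)
        self.proj = nn.Linear(pretrained_vectors.size(1), code_bits)

    def forward(self, token_ids, binarize=False):
        pooled = self.embed(token_ids).max(dim=1).values  # no recurrent layers
        soft_code = torch.sigmoid(self.proj(pooled))      # values in (0, 1)
        return (soft_code > 0.5).float() if binarize else soft_code
```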
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations
- Computer Science · ACL/IJCNLP
- 2021
Inspired by recent advances in deep metric learning (DML), this work carefully designs a self-supervised objective for learning universal sentence embeddings that requires no labelled training data, closing the performance gap between unsupervised and supervised pretraining for universal sentence encoders.
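The core of such DML-style objectives is an InfoNCE-type contrastive loss in which spans drawn from the same document act as positives and other in-batch examples as negatives. A generic sketch (temperature and batching details are assumptions, not DeCLUTR's exact formulation):

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchors, positives, temperature=0.05):
    """anchors[i] and positives[i] come from the same document;
    every other row in the batch serves as a negative."""
    a = F.normalize(anchors, dim=1)
    p = F.normalize(positives, dim=1)
    logits = a @ p.t() / temperature        # (batch, batch) cosine similarities
    targets = torch.arange(a.size(0))       # the diagonal holds the true pairs
    return F.cross_entropy(logits, targets)
```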
Sentence embeddings in NLI with iterative refinement encoders
- Computer Science · Natural Language Engineering
- 2019
This work proposes a hierarchy of bidirectional LSTM and max pooling layers that implements an iterative refinement strategy, yielding state-of-the-art results on the SciTail dataset as well as strong results on the Stanford Natural Language Inference and Multi-Genre Natural Language Inference datasets.
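A hedged sketch of the hierarchy: each BiLSTM re-reads the embedded sentence, is initialized with the previous layer's final state (the refinement step), and contributes a max-pooled vector to the concatenated sentence embedding. Dimensions and the exact state handoff are illustrative, not the paper's precise configuration.

```python
import torch
import torch.nn as nn

class HierarchicalBiLSTMMax(nn.Module):
    def __init__(self, embed_dim=300, hidden_dim=600, n_layers=3):
        super().__init__()
        self.lstms = nn.ModuleList(
            nn.LSTM(embed_dim, hidden_dim, bidirectional=True, batch_first=True)
            for _ in range(n_layers))

    def forward(self, embedded):                # (batch, seq, embed_dim)
        pooled, state = [], None
        for lstm in self.lstms:
            out, state = lstm(embedded, state)  # refine using the previous state
            pooled.append(out.max(dim=1).values)
        return torch.cat(pooled, dim=1)         # concatenated per-layer poolings
```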
Mining Discourse Markers for Unsupervised Sentence Representation Learning
- Computer Science · NAACL
- 2019
This work proposes a method to automatically discover sentence pairs joined by relevant discourse markers and applies it to massive amounts of data, using the resulting pairs as supervision for learning transferable sentence embeddings.
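The mining step can be pictured as follows; the marker list and the clause-splitting pattern are simplified assumptions, not the paper's extraction rules:

```python
import re

MARKERS = ("but", "because", "although", "so", "then")
PATTERN = re.compile(r"^(.+?),\s+(" + "|".join(MARKERS) + r")\s+(.+)$", re.I)

def mine_pair(sentence):
    """Split 'S1, <marker> S2' into a labeled pair; the marker
    becomes the prediction target that supervises the encoder."""
    m = PATTERN.match(sentence)
    return (m.group(1), m.group(3), m.group(2).lower()) if m else None

print(mine_pair("The model is small, but it transfers well."))
# ('The model is small', 'it transfers well.', 'but')
```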
Training Effective Neural Sentence Encoders from Automatically Mined Paraphrases
- Computer Science
- 2022
A method for training effective language-specific sentence encoders without manually labeled data is proposed: a dataset of paraphrase pairs is automatically constructed from sentence-aligned bilingual text corpora and used to tune a Transformer language model with an additional recurrent pooling layer.
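One way to picture the mining step (an assumed heuristic for illustration, not necessarily the paper's exact procedure): distinct source-language sentences aligned to the same target sentence are treated as paraphrases of one another.

```python
from collections import defaultdict

def mine_paraphrase_pairs(bitext):
    """bitext: iterable of (source_sentence, target_sentence) pairs."""
    by_target = defaultdict(list)
    for src, tgt in bitext:
        if src not in by_target[tgt]:
            by_target[tgt].append(src)
    pairs = []
    for variants in by_target.values():      # same target, different sources
        pairs.extend(zip(variants, variants[1:]))
    return pairs

bitext = [("It is raining.", "Es regnet."),
          ("Rain is falling.", "Es regnet.")]
print(mine_paraphrase_pairs(bitext))  # [('It is raining.', 'Rain is falling.')]
```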
Unsupervised Learning of Sentence Representations Using Sequence Consistency
- Computer Science · ArXiv
- 2018
This work proposes ConsSent, a simple yet surprisingly powerful unsupervised method that learns such representations by enforcing consistency constraints on sequences of tokens: sentence encoders are trained to distinguish consistent from inconsistent examples.
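A hedged sketch of one way to construct an inconsistent example; the paper explores several perturbation and sequence-construction schemes, of which this token swap is just one:

```python
import random

def make_inconsistent(tokens, rng=random):
    """Swap two positions so the sequence is (very likely) no longer
    a well-formed sentence; the encoder is then trained to tell
    original sequences from corrupted ones."""
    corrupted = list(tokens)
    i, j = rng.sample(range(len(corrupted)), 2)
    corrupted[i], corrupted[j] = corrupted[j], corrupted[i]
    return corrupted

print(make_inconsistent("the cat sat on the mat".split()))
# e.g. ['the', 'mat', 'sat', 'on', 'the', 'cat']
```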
DisSent: Learning Sentence Representations from Explicit Discourse Relations
- Computer Science · ACL
- 2019
It is demonstrated that the automatically curated corpus allows a bidirectional LSTM sentence encoder to yield high-quality sentence embeddings and can serve as a supervised fine-tuning dataset for larger models such as BERT.
References
Showing 1-10 of 47 references
A large annotated corpus for learning natural language inference
- Computer Science · EMNLP
- 2015
The Stanford Natural Language Inference corpus is introduced: a new, freely available collection of labeled sentence pairs written by humans performing a novel grounded task based on image captioning. It allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.
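For concreteness, an SNLI record in its distributed JSONL form (fields abbreviated to the three most commonly used):

```python
import json

record = json.loads("""{
  "sentence1": "A soccer game with multiple males playing.",
  "sentence2": "Some men are playing a sport.",
  "gold_label": "entailment"
}""")
premise, hypothesis = record["sentence1"], record["sentence2"]
label = record["gold_label"]  # entailment / contradiction / neutral
```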
Learning Distributed Representations of Sentences from Unlabelled Data
- Computer Science · NAACL
- 2016
A systematic comparison of models that learn distributed phrase or sentence representations from unlabelled data finds that the optimal approach depends critically on the intended application.
Natural Language Processing (Almost) from Scratch
- Computer Science · J. Mach. Learn. Res.
- 2011
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity…
Towards Universal Paraphrastic Sentence Embeddings
- Computer Science · ICLR
- 2016
This work considers the problem of learning general-purpose, paraphrastic sentence embeddings based on supervision from the Paraphrase Database, and compares six compositional architectures, finding that the most complex architectures, such as long short-term memory (LSTM) recurrent neural networks, perform best on the in-domain data.
Skip-Thought Vectors
- Computer Science · NIPS
- 2015
We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the…
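The encoder-decoder objective can be sketched as follows: encode sentence s_i, then decode its neighbours s_{i-1} and s_{i+1}. GRUs and the neighbour-decoding objective match the original design; vocabulary and sizes are illustrative, and conditioning the decoders only through their initial state is a simplification.

```python
import torch.nn as nn

class SkipThoughtSketch(nn.Module):
    def __init__(self, vocab=20_000, embed_dim=300, hidden=1024):
        super().__init__()
        self.embed = nn.Embedding(vocab, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden, batch_first=True)
        self.dec_prev = nn.GRU(embed_dim, hidden, batch_first=True)
        self.dec_next = nn.GRU(embed_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, cur_ids, prev_ids, next_ids):
        _, h = self.encoder(self.embed(cur_ids))   # h is the sentence vector
        prev_logits = self.out(self.dec_prev(self.embed(prev_ids), h)[0])
        next_logits = self.out(self.dec_next(self.embed(next_ids), h)[0])
        return prev_logits, next_logits            # train with cross-entropy
```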
A unified architecture for natural language processing: deep neural networks with multitask learning
- Computer Science · ICML '08
- 2008
We describe a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic…
A Simple but Tough-to-Beat Baseline for Sentence Embeddings
- Computer Science · ICLR
- 2017
Learning Natural Language Inference using Bidirectional LSTM model and Inner-Attention
- Computer Science · ArXiv
- 2016
A sentence encoding-based model for recognizing textual entailment that uses a sentence's first-stage representation to attend over the words of the sentence itself, a mechanism the paper calls "Inner-Attention".
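A sketch of that mechanism: the mean of the BiLSTM states serves as the first-stage summary, which then attends over the states themselves. The bilinear scoring function is an assumption for illustration; the paper's exact attention parameterization may differ.

```python
import torch
import torch.nn as nn

class InnerAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.bilinear = nn.Linear(dim, dim, bias=False)

    def forward(self, states):                    # (batch, seq, dim) BiLSTM out
        query = states.mean(dim=1, keepdim=True)  # first-stage representation
        scores = torch.bmm(self.bilinear(states), query.transpose(1, 2))
        weights = torch.softmax(scores, dim=1)    # attention over own words
        return (weights * states).sum(dim=1)      # attended sentence vector
```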
Enriching Word Vectors with Subword Information
- Computer Science · Transactions of the Association for Computational Linguistics
- 2017
A new approach based on the skipgram model in which each word is represented as a bag of character n-grams and a word's vector is the sum of its n-gram representations; it achieves state-of-the-art performance on word similarity and analogy tasks.
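The subword decomposition is easy to make concrete; the boundary markers `<` and `>` and the 3-6 n-gram range follow the paper's description:

```python
def char_ngrams(word, n_min=3, n_max=6):
    """All character n-grams of the marked word, plus the word itself;
    the word's vector is the sum of its n-gram vectors, so rare and
    unseen words still receive representations."""
    marked = f"<{word}>"
    grams = {marked}
    for n in range(n_min, n_max + 1):
        grams.update(marked[i:i + n] for i in range(len(marked) - n + 1))
    return grams

print(sorted(char_ngrams("where"))[:4])  # e.g. ['<wh', '<whe', '<wher', ...]
```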
Distributed Representations of Sentences and Documents
- Computer Science · ICML
- 2014
Paragraph Vector is an unsupervised algorithm that learns fixed-length feature representations from variable-length pieces of text, such as sentences, paragraphs, and documents; its construction gives the algorithm the potential to overcome the weaknesses of bag-of-words models.
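Paragraph Vector is available as gensim's Doc2Vec; a minimal usage sketch (the toy corpus and hyperparameters are illustrative):

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

docs = [TaggedDocument(words=text.split(), tags=[i])
        for i, text in enumerate(["the cat sat on the mat",
                                  "dogs chase cats in the yard"])]
model = Doc2Vec(docs, vector_size=50, min_count=1, epochs=40)
vec = model.infer_vector("a cat on a mat".split())  # fixed-length vector
```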