A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference

@inproceedings{Williams2018ABC,
  title={A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference},
  author={Adina Williams and Nikita Nangia and Samuel R. Bowman},
  booktitle={NAACL},
  year={2018}
}

This paper introduces the Multi-Genre Natural Language Inference (MultiNLI) corpus, a dataset designed for use in the development and evaluation of machine learning models for sentence understanding. At 433k examples, this resource is one of the largest corpora available for natural language inference (a.k.a. recognizing textual entailment), improving upon available resources in both its coverage and difficulty. MultiNLI accomplishes this by offering data from ten distinct genres of written and spoken English.
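
As a quick orientation to the corpus's structure, here is a minimal sketch of loading and inspecting MultiNLI with the Hugging Face datasets library; the dataset ID, split names, and field names come from the Hugging Face Hub, not from the paper itself.

# Minimal sketch: inspecting MultiNLI via the Hugging Face Hub ("multi_nli").
from datasets import load_dataset

mnli = load_dataset("multi_nli")  # splits: train, validation_matched, validation_mismatched
example = mnli["train"][0]

# Each example pairs a premise with a hypothesis, labeled
# 0 = entailment, 1 = neutral, 2 = contradiction, plus a source-genre tag.
print(example["premise"])
print(example["hypothesis"])
print(example["label"], example["genre"])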

Citations

XNLI: Evaluating Cross-lingual Sentence Representations
TLDR
This work constructs an evaluation set for cross-lingual language understanding (XLU) by extending the development and test sets of the Multi-Genre Natural Language Inference Corpus to 14 languages, including low-resource languages such as Swahili and Urdu, and finds that XNLI represents a practical and challenging evaluation suite and that directly translating the test data yields the best performance among available baselines.
Explaining Simple Natural Language Inference
TLDR
The experiment reveals several problems in the annotation guidelines and various challenges of the NLI task itself, leading to recommendations for future annotation tasks, for NLI and possibly for other tasks.
Generating Token-Level Explanations for Natural Language Inference
TLDR
It is shown that it is possible to generate token-level explanations for NLI without the need for training data explicitly annotated for this purpose, using a simple LSTM architecture and evaluating both LIME and Anchor explanations for this task.
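
To make the token-level idea concrete, below is a hedged sketch using the LIME library's text explainer; the predict_proba function here is a hypothetical stand-in for a trained NLI classifier, not the paper's model.

# Sketch: token-level NLI explanations with LIME (not the paper's exact setup).
import numpy as np
from lime.lime_text import LimeTextExplainer

def predict_proba(texts):
    # Hypothetical stand-in for a trained NLI model: maps a list of
    # "premise [SEP] hypothesis" strings to an (n, 3) probability array.
    return np.full((len(texts), 3), 1.0 / 3.0)  # placeholder uniform scores

explainer = LimeTextExplainer(class_names=["entailment", "neutral", "contradiction"])
text = "A soccer game with multiple males playing. [SEP] Some men are playing a sport."
explanation = explainer.explain_instance(text, predict_proba, num_features=8)
print(explanation.as_list())  # (token, weight) pairs: the token-level explanation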
A Pragmatics-Centered Evaluation Framework for Natural Language Understanding
TLDR
It is shown that natural language inference, a widely used pretraining task, does not result in genuinely universal representations, which presents a new challenge for multi-task learning.
A New Dataset for Natural Language Inference from Code-mixed Conversations
TLDR
This paper presents the first dataset for code-mixed NLI, in which both the premises and hypotheses are in code-mixed Hindi-English; it uses data from Hindi movies (Bollywood) as premises and crowd-sources hypotheses from Hindi-English bilinguals.
XLDA: Cross-Lingual Data Augmentation for Natural Language Inference and Question Answering
While natural language processing systems often focus on a single language, multilingual transfer learning has the potential to improve performance, especially for low-resource languages. …
SciNLI: A Corpus for Natural Language Inference on Scientific Text
TLDR
SciNLI is a large dataset for NLI that captures the formality in scientific text and contains 107,412 sentence pairs extracted from scholarly papers on NLP and computational linguistics, making it well suited to serve as a benchmark for the evaluation of scientific NLU models.
Probing Multilingual Language Models for Discourse
TLDR
It is found that the XLM-RoBERTa family of models consistently shows the best performance, simultaneously being good monolingual models while degrading relatively little in a zero-shot setting.
Language Models for Lexical Inference in Context
TLDR
Three approaches based on pretrained language models, using handcrafted patterns expressing the semantics of lexical inference, outperform the previous state of the art and show the potential of pretrained LMs for lexical inference in context (LIiC).
Towards a Unified Natural Language Inference Framework to Evaluate Sentence Representations
TLDR
This work generates a large-scale NLI dataset by recasting 11 existing datasets from 7 different semantic tasks, and uses this dataset of approximately half a million context-hypothesis pairs to test how well sentence encoders capture distinct semantic phenomena that are necessary for general language understanding.
...

References

Showing 1-10 of 67 references
A large annotated corpus for learning natural language inference
TLDR
The Stanford Natural Language Inference corpus is introduced, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning, which allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
TLDR
It is shown how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks.
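
The encoder behind those results is a bidirectional LSTM with max pooling over time; the following is a minimal PyTorch sketch of that architecture, with illustrative dimensions rather than the paper's exact configuration.

# Minimal BiLSTM-max sentence encoder in the spirit of that work (toy sizes).
import torch
import torch.nn as nn

class BiLSTMMaxEncoder(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=300, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) -> sentence vectors: (batch, 2 * hidden_dim)
        hidden_states, _ = self.lstm(self.embed(token_ids))
        sentence_vec, _ = hidden_states.max(dim=1)  # max pooling over time
        return sentence_vec

encoder = BiLSTMMaxEncoder()
tokens = torch.randint(0, 10000, (2, 12))  # a toy batch of two token-ID sequences
print(encoder(tokens).shape)               # torch.Size([2, 1024])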
The American National Corpus: A Standardized Resource for American English
TLDR
This work has shown that corpus-based natural language processing has relied heavily on language samples representative of usage in a handful of limited and linguistically specialized domains.
Natural language inference
TLDR
This dissertation explores a range of approaches to NLI, beginning with methods which are robust but approximate and proceeding to progressively more precise approaches; it greatly extends past work in natural logic to incorporate both semantic exclusion and implicativity.
Learning Natural Language Inference with LSTM
TLDR
A special long short-term memory (LSTM) architecture for NLI that remembers important mismatches that are critical for predicting the contradiction or the neutral relationship label and achieves an accuracy of 86.1%, outperforming the state of the art.
Enhanced LSTM for Natural Language Inference
TLDR
This paper presents a new state-of-the-art result, achieving an accuracy of 88.6% on the Stanford Natural Language Inference dataset, and demonstrates that carefully designed sequential inference models based on chain LSTMs can outperform all previous models.
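
The core of such sequential inference models is a soft alignment between the two sentences' LSTM states, with each token attending over the other sentence; below is a hedged PyTorch sketch of that attention step alone (toy shapes, no encoder or training loop).

# Soft alignment between premise and hypothesis states, ESIM-style.
import torch
import torch.nn.functional as F

def soft_align(premise, hypothesis):
    # premise: (batch, m, d); hypothesis: (batch, n, d) LSTM hidden states.
    # Returns each sequence rewritten as attention-weighted sums of the other.
    scores = torch.bmm(premise, hypothesis.transpose(1, 2))               # (batch, m, n)
    aligned_premise = torch.bmm(F.softmax(scores, dim=2), hypothesis)     # (batch, m, d)
    aligned_hypothesis = torch.bmm(
        F.softmax(scores, dim=1).transpose(1, 2), premise)                # (batch, n, d)
    return aligned_premise, aligned_hypothesis

a = torch.randn(2, 7, 300)  # toy premise states
b = torch.randn(2, 5, 300)  # toy hypothesis states
pa, hb = soft_align(a, b)
print(pa.shape, hb.shape)   # torch.Size([2, 7, 300]) torch.Size([2, 5, 300])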
Building a Large Annotated Corpus of English: The Penn Treebank
TLDR
As a result of this grant, the researchers have now published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, which includes a fully hand-parsed version of the classic Brown corpus.
Recognising Textual Entailment with Logical Inference
TLDR
This work incorporates model building, a technique borrowed from automated reasoning, showing that it is a useful and robust method to approximate entailment; machine learning is used to combine these deep semantic analysis techniques with simple shallow word overlap.
A SICK cure for the evaluation of compositional distributional semantic models
TLDR
This work aims to help the research community working on compositional distributional semantic models (CDSMs) by providing SICK (Sentences Involving Compositional Knowledge), a large English benchmark tailored for them.
A Fast Unified Model for Parsing and Sentence Understanding
TLDR
The Stack-augmented Parser-Interpreter Neural Network (SPINN) combines parsing and interpretation within a single tree-sequence hybrid model by integrating tree-structured sentence interpretation into the linear sequential structure of a shift-reduce parser.
...