Exploring Lexical Irregularities in Hypothesis-Only Models of Natural Language Inference

  • Qingyuan Hu, Yi Zhang, Kanishka Misra, Julia Taylor Rayz
  • Published 26 September 2020
  • Computer Science
  • 2020 IEEE 19th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC)
Natural Language Inference (NLI), or Recognizing Textual Entailment (RTE), is the task of predicting the entailment relation between a pair of sentences (premise and hypothesis). This task has been described as “a valuable testing ground for the development of semantic representations” [1], and is a key component in natural language understanding evaluation benchmarks. Models that understand entailment should encode both the premise and the hypothesis. However, experiments by Poliak et al. [2… 

Tables from this paper

Hypothesis Only Baselines in Natural Language Inference

This approach, referred to as a hypothesis-only model, significantly outperforms a majority-class baseline across a number of NLI datasets, suggesting that statistical irregularities may allow a model to perform NLI in some datasets beyond what should be achievable without access to the context.

Learning Natural Language Inference with LSTM

A special long short-term memory (LSTM) architecture for NLI that remembers important mismatches that are critical for predicting the contradiction or the neutral relationship label and achieves an accuracy of 86.1%, outperforming the state of the art.

A large annotated corpus for learning natural language inference

The Stanford Natural Language Inference corpus is introduced, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning, which allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.

Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference

There is substantial room for improvement in NLI systems, and the HANS dataset, which contains many examples where these heuristics fail, can motivate and measure progress in this area.

A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference

The Multi-Genre Natural Language Inference corpus is introduced, a dataset designed for use in the development and evaluation of machine learning models for sentence understanding and shows that it represents a substantially more difficult task than does the Stanford NLI corpus.

An Inference-Based Approach to Recognizing Entailment

It is argued that forming semantic representations is a necessary first step towards the larger goal of machine reading, and worthy of further exploration.

A Phrase-Based Alignment Model for Natural Language Inference

The MANLI system is presented, a new NLI aligner designed to address the alignment problem, which uses a phrase-based alignment representation, exploits external lexical resources, and capitalizes on a new set of supervised training data.

SciTaiL: A Textual Entailment Dataset from Science Question Answering

A new dataset and model for textual entailment, derived from treating multiple-choice question-answering as an entailment problem, is presented, and it is demonstrated that one can improve accuracy on SCITAIL by 5% using a new neural model that exploits linguistic structure.

Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

It is shown how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks.

Inference is Everything: Recasting Semantic Resources into a Unified Evaluation Framework

A general strategy to automatically generate one or more sentential hypotheses based on an input sentence and pre-existing manual semantic annotations is presented, which enables us to probe a statistical RTE model’s performance on different aspects of semantics.