Annotation Artifacts in Natural Language Inference Data

@inproceedings{Gururangan2018AnnotationAI,
  title={Annotation Artifacts in Natural Language Inference Data},
  author={Suchin Gururangan and Swabha Swayamdipta and Omer Levy and Roy Schwartz and Samuel R. Bowman and Noah A. Smith},
  booktitle={NAACL-HLT},
  year={2018}
}
Large-scale datasets for natural language inference are created by presenting crowd workers with a sentence (premise), and asking them to generate three new sentences (hypotheses) that it entails, contradicts, or is logically neutral with respect to. We show that, in a significant portion of such data, this protocol leaves clues that make it possible to identify the label by looking only at the hypothesis, without observing the premise. Specifically, we show that a simple text categorization…

Citations

Semantic Scholar estimates that this publication has 62 citations based on the available data.
