News Headline Grouping as a Challenging NLU Task

Philippe Laban, Lucas Bandarkar, Marti A. Hearst
Recent progress in Natural Language Understanding (NLU) has seen the latest models surpass human performance on many standard tasks. These impressive results have led the community to introspect on dataset limitations and iterate on more nuanced challenges. In this paper, we introduce the task of HeadLine Grouping (HLG) and a corresponding dataset (HLGD) consisting of 20,056 pairs of news headlines, each labeled with a binary judgement as to whether the pair belongs within the same group…

Cross-Register Projection for Headline Part of Speech Tagging
This work automatically annotates news headlines with POS tags by projecting predicted tags from corresponding sentences in news bodies, trains a multi-domain POS tagger on both long-form and headline text, and shows that joint training on both registers improves over training on just one or naïvely concatenating training sets.
HeadlineCause: A Dataset of News Headlines for Detecting Causalities
This work presents HeadlineCause, a dataset for detecting implicit causal relations between pairs of news headlines, and presents a set of models and experiments that demonstrate the dataset's validity, including a multilingual XLM-RoBERTa based model for causality detection and a GPT-2 based model for predicting possible effects.


Language Models are Unsupervised Multitask Learners
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation
The STS Benchmark is introduced as a new shared training and evaluation set carefully selected from the corpus of English STS shared task data (2012-2017), providing insight into the limitations of existing models.
Cycle-Consistency for Robust Visual Question Answering
A model-agnostic framework is proposed that trains a model to not only answer a question, but also generate a question conditioned on the answer, such that the answer predicted for the generated question is the same as the ground truth answer to the original question.
GoodNewsEveryone: A Corpus of News Headlines Annotated with Emotions, Semantic Roles, and Reader Perception
A dataset of 5000 English news headlines annotated via crowdsourcing with their associated emotions, the corresponding emotion experiencers and textual cues, related emotion causes and targets, as well as the reader’s perception of the emotion of the headline is released.
The Sixth PASCAL Recognizing Textual Entailment Challenge
This paper presents the Sixth Recognizing Textual Entailment (RTE-6) challenge, in which the traditional Main Task was replaced by a new task, similar to the RTE-5 Search Pilot, where Textual Entailment is performed on a real corpus in the Update Summarization scenario.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
A new language representation model, BERT, is designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, and can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
A Neural Attention Model for Abstractive Sentence Summarization
This work proposes a fully data-driven approach to abstractive sentence summarization by utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence.
Extracting Lexically Divergent Paraphrases from Twitter
A new model suited to identifying paraphrases within short messages on Twitter, and a novel annotation methodology that has allowed us to crowdsource a paraphrase corpus from Twitter, are presented.
The Seventh PASCAL Recognizing Textual Entailment Challenge
This paper presents the Seventh Recognizing Textual Entailment (RTE-7) challenge, which replicated the exercise proposed in RTE-6, consisting of a Main Task, a Main subtask aimed at detecting novel information, and a KBP Validation Task, in which RTE systems had to validate the output of systems participating in the KBP Slot Filling Task.
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
The Multi-Genre Natural Language Inference corpus is introduced, a dataset designed for use in the development and evaluation of machine learning models for sentence understanding, and is shown to represent a substantially more difficult task than the Stanford NLI corpus.