“Going on a vacation” takes longer than “Going for a walk”: A Study of Temporal Commonsense Understanding

Ben Zhou, Daniel Khashabi, Qiang Ning, Dan Roth
Understanding time is crucial for understanding events expressed in natural language. Because people rarely say the obvious, it is often necessary to have commonsense knowledge about various temporal aspects of events, such as duration, frequency, and temporal order. However, this important problem has so far received limited attention. This paper systematically studies this temporal commonsense problem. Specifically, we define five classes of temporal commonsense, and use crowdsourcing to… 


TIMEDIAL: Temporal Commonsense Reasoning in Dialog

This paper presents the first study of pre-trained LMs' temporal reasoning capabilities in dialog, introducing a new task and a crowd-sourced English challenge set, TimeDial. It reveals that the models fail to reason correctly about dialog context, instead relying on shallow cues based on existing temporal patterns in context.

ForecastQA: Machine Comprehension of Temporal Text for Answering Forecasting Questions

This paper introduces ForecastQA, a new open-domain question-answering dataset consisting of 10k questions that require temporal reasoning, and presents baseline models for the dataset based on a pre-trained language model.

Mitigating Reporting Bias in Semi-supervised Temporal Commonsense Inference with Probabilistic Soft Logic

A novel neural-logic-based Soft Logic Enhanced Event Temporal Reasoning (SLEER) model is proposed for acquiring unbiased TCS knowledge, in which the complementary relationships among dimensions are explicitly represented as logic rules and modeled by t-norm fuzzy logics.

Commonsense Reasoning for Natural Language Processing

This tutorial provides researchers with the critical foundations and recent advances in commonsense representation and reasoning, in the hope of casting a brighter light on this promising area of future research.

A Meta-framework for Spatiotemporal Quantity Extraction from Text

This paper formulates the NLP problem of spatiotemporal quantity extraction, and proposes the first meta-framework for solving it, which contains a formalism that decomposes the problem into several information extraction tasks, a shareable crowdsourcing pipeline, and transformer-based baseline models.

Time-Aware Language Models as Temporal Knowledge Bases

This work proposes a simple technique for jointly modeling text with its timestamp that improves memorization of seen facts from the training time period, as well as calibration on predictions about unseen facts from future time periods. It also shows that models trained with temporal context can be efficiently "refreshed" as new data arrives.

TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions

TORQUE is introduced, a new English reading comprehension benchmark built on 3.2k news snippets with 21k human-generated questions querying temporal relationships, and results show that RoBERTa-large achieves an exact-match score of 51% on the test set of TORQUE, about 30% behind human performance.

Research Statement: Climbing the Generality Ladder in NLP

The overarching theme of this research is developing algorithms and theories that make natural language processing systems more general and generalizable, i.e., enabling them to adapt to and handle a broader space of challenges or situations.

Don't Blame the Annotator: Bias Already Starts in the Annotation Instructions

This work hypothesizes that annotators pick up on patterns in crowdsourcing instructions, which bias them to write similar examples that are then over-represented in the collected data. It studies this form of bias, termed instruction bias, in 14 recent NLU benchmarks, showing that instruction examples often exhibit concrete patterns.

Cross-Task Generalization via Natural Language Crowdsourcing Instructions

This work introduces NATURAL INSTRUCTIONS, a dataset of 61 distinct tasks, their human-authored instructions, and 193k task instances, and adopts generative pre-trained language models to encode task-specific instructions along with input and generate task output.

Event2Mind: Commonsense Inference on Events, Intents, and Reactions

It is demonstrated how commonsense inference on people’s intents and reactions can help unveil the implicit gender inequality prevalent in modern movie scripts.

Using Query Patterns to Learn the Duration of Events

This work describes and improves a supervised baseline that relies on event duration annotations, and shows how web queries for linguistic patterns can help learn the duration of events without labeled data, producing fine-grained duration judgments that surpass the supervised system.

Commonsense for Generative Multi-Hop Question Answering Tasks

This work focuses on a more challenging multi-hop generative task (NarrativeQA), which requires the model to reason, gather, and synthesize disjoint pieces of information within the context to generate an answer.

Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences

The dataset is the first to study multi-sentence inference at scale, with an open-ended set of question types that require reasoning skills, and finds human solvers to achieve an F1-score of 88.1%.

SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference

This paper introduces the task of grounded commonsense inference, unifying natural language inference and commonsense reasoning, and proposes Adversarial Filtering (AF), a novel procedure that constructs a de-biased dataset by iteratively training an ensemble of stylistic classifiers, and using them to filter the data.

Reasoning about Actions and State Changes by Injecting Commonsense Knowledge

This paper shows how the predicted effects of actions in the context of a paragraph can be improved in two ways: by incorporating global, commonsense constraints (e.g., a non-existent entity cannot be destroyed), and by biasing reading with preferences from large-scale corpora.

Verb Physics: Relative Physical Knowledge of Actions and Objects

An approach to infer relative physical knowledge of actions and objects along five dimensions (e.g., size, weight, and strength) from unstructured natural language text is presented.

A Multi-Axis Annotation Scheme for Event Temporal Relations

A new multi-axis modeling scheme is proposed to better capture the temporal structure of events. Having identified event end-points as a major source of confusion in annotation, the work proposes to annotate TempRels based on start-points only.

What Happens Next? Event Prediction Using a Compositional Neural Network Model

It is found that recent work on learning vector-space embeddings to capture word meaning can be effectively applied to this task, including simple incorporation of a verb's arguments in the representation by vector addition.

A Structured Learning Approach to Temporal Relation Extraction

It is suggested that it is important to take dependencies into account while learning to identify temporal relations between events, and a structured learning approach is proposed to address this challenge.