Visual Abductive Reasoning

Chen Liang, Wenguan Wang, Tianfei Zhou, Yi Yang
Abductive reasoning seeks the likeliest possible explanation for partial observations. Although abduction is frequently employed in human daily reasoning, it is rarely explored in the computer vision literature. In this paper, we propose a new task and dataset, Visual Abductive Reasoning (VAR), for examining the abductive reasoning ability of machine intelligence in everyday visual situations. Given an incomplete set of visual events, AI systems are required to not only describe what is observed, but…


The Abduction of Sherlock Holmes: A Dataset for Visual Abductive Reasoning
This work presents Sherlock, an annotated corpus of 103K images for testing machine capacity for abductive reasoning beyond literal image contents, and collects 363K (clue, inference) pairs, which form a first-of-its-kind abductive visual reasoning dataset.
Abductive Commonsense Reasoning
This study introduces a challenge dataset, ART, that consists of over 20k commonsense narrative contexts and 200k explanations, and conceptualizes two new tasks -- Abductive NLI: a multiple-choice question answering task for choosing the more likely explanation, and Abduction NLG: a conditional generation task for explaining given observations in natural language.
Transformation Driven Visual Reasoning
This work proposes a novel transformation-driven visual reasoning task in which the target is to infer the corresponding single-step or multi-step transformation, represented as a triplet or a sequence of triplets, respectively; this setting is intended to boost the development of machine visual reasoning.
From Recognition to Cognition: Visual Commonsense Reasoning
To move towards cognition-level understanding, a new reasoning engine is presented, Recognition to Cognition Networks (R2C), that models the necessary layered inferences for grounding, contextualization, and reasoning.
Visual abduction in Anthropology and Archaeology
The role of abductive reasoning in science has recently received much attention in the domains of Artificial Intelligence and Cognitive Science. Abduction, as characterized by the philosopher Charles Peirce…
VisualCOMET: Reasoning About the Dynamic Context of a Still Image
This work proposes VisualCOMET, a novel framework of visual commonsense reasoning tasks to predict events that might have happened before, events that might happen next, and the intents of the people at present, and introduces the first large-scale repository of Visual Commonsense Graphs, allowing for tighter integration between images and text.
Anticipating Visual Representations from Unlabeled Video
This work presents a framework that capitalizes on temporal structure in unlabeled video to learn to anticipate human actions and objects, and applies recognition algorithms to the predicted representations in order to anticipate future objects and actions.
What Is More Likely to Happen Next? Video-and-Language Future Event Prediction
This work collects a new dataset, named Video-and-Language Event Prediction (VLEP), with 28,726 future event prediction examples (along with their rationales) from 10,234 diverse TV Show and YouTube Lifestyle Vlog video clips, and presents a strong baseline incorporating information from video, dialogue, and commonsense knowledge.
Counterfactual Story Reasoning and Generation
This paper proposes Counterfactual Story Rewriting: given an original story and an intervening counterfactual event, the task is to minimally revise the story to make it compatible with the given counterfactual event.
Explanation and Abductive Inference
Everyday cognition reveals a sophisticated capacity to seek, generate, and evaluate explanations for the social and physical worlds around us. Why are we so driven to explain, and what accounts for…