VisualCOMET: Reasoning About the Dynamic Context of a Still Image

@inproceedings{Park2020VisualCOMETRA,
  title={VisualCOMET: Reasoning About the Dynamic Context of a Still Image},
  author={J. S. Park and Chandra Bhagavatula and R. Mottaghi and A. Farhadi and Yejin Choi},
  booktitle={ECCV},
  year={2020}
}
Even from a single frame of a still image, people can reason about the dynamic story of the image before, after, and beyond the frame. For example, given an image of a man struggling to stay afloat in water, we can reason that the man fell into the water sometime in the past, the intent of that man at the moment is to stay alive, and he will need help in the near future or else he will get washed away. We propose VisualComet, the novel framework of visual commonsense reasoning tasks to predict…
Transformation Driven Visual Reasoning
Learning Contextual Causality from Time-consecutive Images
Understanding Few-Shot Commonsense Knowledge Models
Understanding in Artificial Intelligence
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation
Analyzing Commonsense Emergence in Few-shot Knowledge Models
