Newtonian Image Understanding: Unfolding the Dynamics of Objects in Static Images

Abstract

In this paper, we study the challenging problem of predicting the dynamics of objects in static images. Given a query object in an image, our goal is to provide a physical understanding of the object in terms of the forces acting upon it and its long-term motion in response to those forces. Direct and explicit estimation of the forces and the motion of objects from a single image is extremely challenging. We define intermediate physical abstractions called Newtonian scenarios and introduce the Newtonian Neural Network (N3), which learns to map a single image to a state in a Newtonian scenario. Our evaluations show that our method can reliably predict the dynamics of a query object from a single image. In addition, our approach can provide physical reasoning that supports the predicted dynamics in terms of velocity and force vectors. To spur research in this direction, we compiled the Visual Newtonian Dynamics (VIND) dataset, which includes more than 6000 videos aligned with Newtonian scenarios represented using game engines, and more than 4500 still images with their ground-truth dynamics.
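The abstract frames the core prediction step as a classification problem: mapping an image (via learned features) to one of a fixed set of Newtonian scenario states. The sketch below is a minimal, hypothetical illustration of that idea only, not the paper's N3 architecture: a linear softmax head over a CNN-style feature vector. The scenario count, feature dimension, and random weights are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N_SCENARIOS = 12     # illustrative count; the paper defines its own fixed set of Newtonian scenarios
FEATURE_DIM = 4096   # hypothetical size of a CNN image descriptor

# Hypothetical linear classification head (random weights stand in for learned ones).
W = rng.standard_normal((N_SCENARIOS, FEATURE_DIM)) * 0.01
b = np.zeros(N_SCENARIOS)

def predict_scenario(feat):
    """Return a probability distribution over Newtonian scenario states."""
    logits = W @ feat + b
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return exp / exp.sum()

# Toy usage: a random "image feature" in place of a real CNN descriptor.
feat = rng.standard_normal(FEATURE_DIM)
probs = predict_scenario(feat)
predicted_state = int(probs.argmax())
```

In the actual system, the predicted scenario state would then be used to borrow force and velocity vectors from the matched game-engine simulation; this sketch only covers the image-to-state mapping step.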

DOI: 10.1109/CVPR.2016.383


Cite this paper

@article{Mottaghi2016NewtonianIU,
  title={Newtonian Image Understanding: Unfolding the Dynamics of Objects in Static Images},
  author={Roozbeh Mottaghi and Hessam Bagherinezhad and Mohammad Rastegari and Ali Farhadi},
  journal={2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2016},
  pages={3521-3529}
}