Visual Semantic Planning Using Deep Successor Representations

  • Yuke Zhu, Daniel Gordon, Eric Kolve, D. Fox, Li Fei-Fei, A. Gupta, R. Mottaghi, Ali Farhadi
  • Published 2017
  • Computer Science
  • 2017 IEEE International Conference on Computer Vision (ICCV)
  • A crucial capability of real-world intelligent agents is their ability to plan a sequence of actions to achieve their goals in the visual world. In this work, we address the problem of visual semantic planning: the task of predicting a sequence of actions from visual observations that transform a dynamic environment from an initial state to a goal state. Doing so entails knowledge about objects and their affordances, as well as actions and their preconditions and effects. We propose learning…

    Embodied Question Answering
    • 222
    • Highly Influenced
    Scene Memory Transformer for Embodied Agents in Long-Horizon Tasks
    • 26
    What Should I Do Now? Marrying Reinforcement Learning and Symbolic Planning
    • 5
    Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout
    • 43
    Learning to guide task and motion planning using score-space representation
    • 23


    Publications referenced by this paper.
    Target-driven visual navigation in indoor scenes using deep reinforcement learning
    • 656
    Continuous control with deep reinforcement learning
    • 3,529
    End-to-End Training of Deep Visuomotor Policies
    • 1,799
    The Curious Robot: Learning Visual Representations via Physical Interactions
    • 128
    Actions ~ Transformations
    • 159
    Learning to Act by Predicting the Future
    • 185
    Inferring The Latent Structure of Human Decision-Making from Raw Visual Inputs
    • 34
    Human-level control through deep reinforcement learning
    • 9,811
    • Highly Influential