Corpus ID: 224803679

Object Permanence Through Audio-Visual Representations

Fanjun Bu, Chien-Ming Huang
As robots perform manipulation tasks and interact with objects, it is probable that they accidentally drop objects that subsequently bounce out of their visual fields (e.g., due to an inadequate grasp of an unfamiliar object). To enable robots to recover from such errors, we draw upon the concept of object permanence---objects remain in existence even when they are not being sensed (e.g., seen) directly. In particular, we developed a multimodal neural network model---using a partial, observed…
1 Citation

The Boombox: Visual Reconstruction from Acoustic Vibrations


A Deep Convolutional Neural Network Model for Sense of Agency and Object Permanence in Robots
  • Claus Lang, G. Schillaci, V. Hafner
  • Computer Science
  • 2018 Joint IEEE 8th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob)
  • 2018
First steps towards the development of the sense of object permanence in robots
On the sense of agency and of object permanence in robots
Mental imagery for a conversational robot
  • D. Roy, K. Hsiao, N. Mavridis
  • Computer Science, Medicine
  • IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics)
  • 2004
AVOT: Audio-Visual Object Tracking of Multiple Objects for Robotics
  • Justin Wilson, M. Lin
  • Computer Science
  • 2020 IEEE International Conference on Robotics and Automation (ICRA)
  • 2020
Coupling perception and simulation: steps towards conversational robotics
  • K. Hsiao, N. Mavridis, D. Roy
  • Computer Science
  • Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453)
  • 2003
Robotic Grasping of Novel Objects using Vision
Audio-Visual SLAM towards Human Tracking and Human-Robot Interaction in Indoor Environments
Audio Visual Attention Models in the Mobile Robots Navigation
Variational Bayesian Inference for Audio-Visual Tracking of Multiple Speakers