Making Sense of Vision and Touch: Learning Multimodal Representations for Contact-Rich Tasks

  • M. Lee, Yuke Zhu, Peter Zachares, Matthew Tan, K. Srinivasan, S. Savarese, Feifei Li, Animesh Garg, Jeannette Bohg
  • Published 2020
  • Computer Science
  • IEEE Transactions on Robotics
  • Contact-rich manipulation tasks in unstructured environments often require both haptic and visual feedback. It is nontrivial to manually design a robot controller that combines these modalities, which have very different characteristics. While deep reinforcement learning has shown success in learning control policies for high-dimensional inputs, these algorithms are generally intractable to train directly on real robots due to sample complexity. In this article, we use self-supervision to learn…

