Scene Editing as Teleoperation: A Case Study in 6DoF Kit Assembly

  • Shubh Agrawal, Yulong Li, Jen-Shuo Liu, Steven K. Feiner, Shuran Song
  • Published 9 October 2021
  • Computer Science
  • 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
Studies in robot teleoperation have centered on action specifications, from continuous joint control to discrete end-effector pose control. However, these “robot-centric” interfaces often require skilled operators with extensive robotics expertise. To make teleoperation accessible to nonexpert users, we propose the framework “Scene Editing as Teleoperation” (SEaT), where the key idea is to transform the traditional “robot-centric” interface into a “scene-centric” interface: instead of…

Kit-Net: Self-Supervised Learning to Kit Novel 3D Objects into Novel 3D Cavities

Kit-Net is a framework for kitting previously unseen 3D objects into cavities, given depth images of both the target cavity and an object held by a gripper in an unknown initial orientation. It uses self-supervised deep learning and data augmentation to train a convolutional neural network.

Transporter Networks: Rearranging the Visual World for Robotic Manipulation

The Transporter Network is a simple model architecture that rearranges deep features to infer spatial displacements from visual input, which can parameterize robot actions. It learns faster and generalizes better than a variety of end-to-end baselines, including policies that use ground-truth object poses.

NASA-Task Load Index (NASA-TLX); 20 Years Later

The goal was to summarize the environments in which NASA-TLX has been applied, the types of activities the raters performed, other variables that were measured that did (or did not) covary, methodological issues, and lessons learned.

ReorientBot: Learning Object Reorientation for Specific-Posed Placement

This work presents a vision-based manipulation system, ReorientBot, which consists of visual scene understanding with pose estimation and volumetric reconstruction using an onboard RGB-D camera, learned waypoint selection for successful and efficient motion generation for reorientation, and traditional motion planning to generate a collision-free trajectory from the selected waypoints.

UMPNet: Universal Manipulation Policy Network for Articulated Objects

A novel Arrow-of-Time action attribute is introduced that indicates whether an action will change the object state back to the past or forward into the future, enabling both effective state exploration and goal-conditioned manipulation.

Act the Part: Learning Interaction Strategies for Articulated Object Part Discovery

Act the Part is introduced to learn how to interact with articulated objects to discover and segment their pieces by coupling action selection and motion segmentation, and is able to isolate structures to make perceptual part recovery possible without semantic labels.

SSCNav: Confidence-Aware Semantic Scene Completion for Visual Semantic Navigation

This paper introduces SSCNav, an algorithm that explicitly models scene priors using a confidence-aware semantic scene completion module to complete the scene and guide the agent's navigation planning, and demonstrates that the proposed scene completion module improves the efficiency of the downstream navigation policies.

Learning 3D Dynamic Scene Representations for Robot Manipulation

This paper introduces 3D Dynamic Scene Representation (DSR), a 3D volumetric scene representation that simultaneously discovers, tracks, reconstructs objects, and predicts their dynamics while capturing all three properties, and proposes DSR-Net, which learns to aggregate visual observations over multiple interactions to gradually build and refine DSR.

Rearrangement: A Challenge for Embodied AI

A framework for research and evaluation in Embodied AI is described, based on a canonical task: Rearrangement, that can focus the development of new techniques and serve as a source of trained models that can be transferred to other settings.

Leveraging depth data in remote robot teleoperation interfaces for general object manipulation

The results of this study show that point-and-click outperforms both free positioning and constrained positioning by significantly increasing the number of tasks completed, significantly reducing task failures and grasping errors, and significantly reducing the number of user interactions required to specify poses.