Corpus ID: 235683469

SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo

Authors: Thomas Kollar, Michael Laskey, Kevin Stone, Brijen Thananjeyan, Mark Tjersland

Abstract: Robot manipulation of unknown objects in unstructured environments is a challenging problem due to the variety of shapes, materials, arrangements, and lighting conditions. Even with large-scale real-world data collection, robust perception and manipulation of transparent and reflective objects across various lighting conditions remains challenging. To address these challenges, we propose an approach to performing sim-to-real transfer of robotic perception. The underlying model, SimNet, is…

Papers citing this work

CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation

This paper presents a simple one-stage approach to jointly predict 3D shape and estimate 6D pose and size in a bounding-box-free manner, and significantly outperforms all shape completion and categorical 6D pose and size estimation baselines on the multi-object ShapeNet and NOCS datasets, respectively.

ShAPO: Implicit Representations for Multi-Object Shape, Appearance, and Pose Optimization

This work presents ShAPO, a method for joint multi-object detection, 3D textured reconstruction, and 6D object pose and size estimation, which significantly outperforms all baselines on the NOCS dataset with an 8% absolute improvement in mAP for 6D pose estimation.

All You Need is LUV: Unsupervised Collection of Labeled Images using Invisible UV Fluorescent Indicators

This paper proposes Labels from UltraViolet (LUV), a novel framework that enables rapid, labeled data collection in real manipulation environments without human labeling, and shows that LUV provides labels consistent with human annotations on unpainted test images.

Solution of World Robot Challenge 2020 Partner Robot Challenge (Real Space)

This paper demonstrates the practicality of domestic service robots in a well-developed environment and proposes a motion synthesis method that simultaneously performs movement and posture transition to achieve high-speed motion.

Learning to Localize, Grasp, and Hand Over Unmodified Surgical Needles

This work proposes HOUSTON: Handover of Unmodified, Surgical, Tool-Obstructed Needles, a problem and algorithm that uses a learned active sensing policy with a stereo camera to iteratively localize and align the needle into a visible and accessible pose for the other gripper.

Autonomously Untangling Long Cables

Cables are ubiquitous in many settings, but are prone to self-occlusions and knots, making them difficult to perceive and manipulate. The challenge often increases with cable length: long cables…

Learning Rope Manipulation Policies Using Dense Object Descriptors Trained on Synthetic Depth Data

This work presents an approach that learns point-pair correspondences between initial and goal rope configurations, which implicitly encodes geometric structure, entirely in simulation from synthetic depth images, and demonstrates that the learned representation — dense depth object descriptors (DDODs) — can be used to manipulate a real rope into a variety of different arrangements.

Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects

This network is the first deep network trained only on synthetic data that is able to achieve state-of-the-art performance on 6-DoF object pose estimation and demonstrates a real-time system estimating object poses with sufficient accuracy for real-world semantic grasping of known household objects in clutter by a real robot.

KeyPose: Multi-View 3D Labeling and Keypoint Estimation for Transparent Objects

This paper establishes an easy method for capturing and labeling 3D keypoints on desktop objects with an RGB camera, and develops a deep neural network, called KeyPose, that learns to accurately predict object poses from stereo input using 3D keypoints, and works even for transparent objects.

Unknown Object Segmentation from Stereo Images

This work proposes a novel object instance segmentation approach that does not require any prior semantic or geometric information about the objects, and uses the original stereo image pair to infer both the object instances and a dense depth map.

ClearGrasp: 3D Shape Estimation of Transparent Objects for Manipulation

ClearGrasp is substantially better than monocular depth estimation baselines and is capable of generalizing to real-world images and novel objects and can be applied out-of-the-box to improve grasping algorithms’ performance on transparent objects.

Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation

Dense Object Nets, which build on recent developments in self-supervised dense descriptor learning, are presented as a consistent object representation for visual understanding and manipulation, and it is demonstrated that they can be trained quickly for a wide variety of previously unseen and potentially non-rigid objects.

The Best of Both Modes: Separately Leveraging RGB and Depth for Unseen Object Instance Segmentation

A novel method is proposed that separately leverages synthetic RGB and synthetic depth for unseen object instance segmentation, and is able to learn from synthetic RGB-D data where the RGB is non-photorealistic.

kPAM: KeyPoint Affordances for Category-Level Robotic Manipulation

A novel formulation of category-level manipulation that uses semantic 3D keypoints as the object representation enables a simple and interpretable specification of the manipulation target as geometric costs and constraints on the keypoints, which flexibly generalizes existing pose-based manipulation methods.

CAD²RL: Real Single-Image Flight without a Single Real Image

This paper proposes a learning method called CAD²RL, which can be used to perform collision-free indoor flight in the real world while being trained entirely on 3D CAD models, and shows that it can train a policy that generalizes to the real world without requiring the simulator to be particularly realistic or high-fidelity.

Camera-to-Robot Pose Estimation from a Single Image

This work presents an approach for estimating the pose of an external camera with respect to a robot from a single RGB image of the robot; because it computes the camera extrinsics from a single frame, it opens the possibility of online calibration.