Corpus ID: 199000781

The Best of Both Modes: Separately Leveraging RGB and Depth for Unseen Object Instance Segmentation

@inproceedings{Xie2019TheBO,
  title={The Best of Both Modes: Separately Leveraging RGB and Depth for Unseen Object Instance Segmentation},
  author={Christopher Xie and Yu Xiang and Arsalan Mousavian and Dieter Fox},
  booktitle={CoRL},
  year={2019}
}
In order to function in unstructured environments, robots need the ability to recognize unseen novel objects. We take a step in this direction by tackling the problem of segmenting unseen object instances in tabletop environments. However, the type of large-scale real-world dataset required for this task typically does not exist for most robotic settings, which motivates the use of synthetic data. We propose a novel method that separately leverages synthetic RGB and synthetic depth for unseen…
Unseen Object Amodal Instance Segmentation via Hierarchical Occlusion Modeling
TLDR
A Hierarchical Occlusion Modeling (HOM) scheme designed to reason about the occlusion by assigning a hierarchy to a feature fusion and prediction order is proposed.
Learning to Better Segment Objects from Unseen Classes with Unlabeled Videos
TLDR
This paper introduces a Bayesian method specifically designed to automatically create a high-quality training set, which significantly boosts the performance of segmenting objects of unseen classes and could open the door for open-world instance segmentation using abundant Internet videos.
End-to-end Trainable Deep Neural Network for Robotic Grasp Detection and Semantic Segmentation from RGB
In this work, we introduce a novel, end-to-end trainable CNN-based architecture to deliver high quality results for grasp detection suitable for a parallel-plate gripper, and semantic segmentation.
Unknown Object Segmentation from Stereo Images
TLDR
This work proposes a novel object instance segmentation approach that does not require any semantic or geometric information of the objects beforehand, and employs a transformer-based architecture that maps directly from the pair of input images to the object instances.
RICE: Refining Instance Masks in Cluttered Environments with Graph Neural Networks
TLDR
A novel framework that refines the output of such methods by utilizing a graph-based representation of instance masks is proposed, and an application is demonstrated that uses uncertainty estimates generated by the method to guide a manipulator, leading to efficient understanding of cluttered scenes.
SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo
TLDR
SimNet can be used to perform end-to-end manipulation of unknown objects in both "easy" and "hard" scenarios using the authors' fleet of Toyota HSR robots in four home environments and significantly outperforms a baseline that uses a structured light RGB-D sensor.
ZePHyR: Zero-shot Pose Hypothesis Rating
TLDR
A novel method for zero-shot object pose estimation in clutter that achieves zero-shot generalization by rating hypotheses as a function of unordered point differences and allows users to estimate the pose of novel objects without requiring any retraining.
Self-Supervised Audio-Visual Feature Learning for Single-Modal Incremental Terrain Type Clustering
TLDR
A novel framework using the multi-modal variational autoencoder and the Gaussian mixture model clustering algorithm on image data and audio data for terrain type clustering by forcing the features to be closer together in the feature space is presented.
Towards Object-generic 6D Pose Estimation
Pose estimation is a basic module in many robot manipulation pipelines. Estimating the pose of objects in the environment can be useful for grasping, motion planning, or manipulation. However, …
Scalable, physics-aware 6D pose estimation for robot manipulation
Abstract of the dissertation "Scalable, Physics-aware 6D Pose Estimation for Robot Manipulation" by Chaitanya Mitash. Dissertation directors: Abdeslam Boularias, Kostas E. Bekris. Robot manipulation often depend…

References

Deep Object Pose Estimation for Semantic Robotic Grasping of Household Objects
TLDR
This network is the first deep network trained only on synthetic data that is able to achieve state-of-the-art performance on 6-DoF object pose estimation and demonstrates a real-time system estimating object poses with sufficient accuracy for real-world semantic grasping of known household objects in clutter by a real robot.
Segmenting Unknown 3D Objects from Real Depth Images using Mask R-CNN Trained on Synthetic Data
TLDR
A method for automated dataset generation is presented, and a variant of Mask R-CNN is trained with domain randomization on the generated dataset to perform category-agnostic instance segmentation without any hand-labeled data. The model is deployed in an instance-specific grasping pipeline to demonstrate its usefulness in a robotics application.
Towards Segmenting Anything That Moves
TLDR
This work proposes a simple learning-based approach for spatio-temporal grouping that leverages motion cues from optical flow as a bottom-up signal for separating objects from each other, and shows that this model matches top-down methods on common categories, while significantly outperforming both top-down and bottom-up methods on never-before-seen categories.
Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth
TLDR
This work proposes a new clustering loss function for proposal-free instance segmentation that pulls the spatial embeddings of pixels belonging to the same instance together and jointly learns an instance-specific clustering bandwidth, maximizing the intersection-over-union of the resulting instance mask.
SceneCut: Joint Geometric and Object Segmentation for Indoor Scenes
TLDR
SceneCut's joint reasoning over scene semantics and geometry allows a robot to detect and segment object instances in complex scenes where modern deep learning-based methods either fail to separate object instances, or fail to detect objects that were not seen during training.
Semantic Instance Segmentation with a Discriminative Loss Function
TLDR
This work proposes an approach of combining an off-the-shelf network with a principled loss function inspired by a metric learning objective that encourages a convolutional network to produce a representation of the image that can easily be clustered into instances with a simple post-processing step.
ShapeMask: Learning to Segment Novel Objects by Refining Shape Priors
TLDR
ShapeMask is introduced, which learns the intermediate concept of object shape to address the problem of generalization in instance segmentation to novel categories and significantly outperforms the state-of-the-art when learning across categories.
Learning to Refine Object Segments
TLDR
This work proposes to augment feedforward nets for object segmentation with a novel top-down refinement approach that is capable of efficiently generating high-fidelity object masks and is 50% faster than the original DeepMask network.
EasyLabel: A Semi-Automatic Pixel-wise Object Annotation Tool for Creating Robotic RGB-D Datasets
TLDR
This paper presents the EasyLabel tool for easily acquiring high-quality ground truth annotation of objects at pixel-level in densely cluttered scenes and reveals the usefulness of EasyLabel and OCID to better understand the challenges that robots face in the real world.
Adapting Deep Visuomotor Representations with Weak Pairwise Constraints
TLDR
This work proposes a novel domain adaptation approach for robot perception that adapts visual representations learned on a large easy-to-obtain source dataset to a target real-world domain, without requiring expensive manual data annotation of real-world data before policy search.