Corpus ID: 207870931

Closing the Reality Gap with Unsupervised Sim-to-Real Image Translation for Semantic Segmentation in Robot Soccer

  title={Closing the Reality Gap with Unsupervised Sim-to-Real Image Translation for Semantic Segmentation in Robot Soccer},
  author={Jan Blumenkamp and Andrea S. Baude and Tim Laue},
Deep learning approaches have become the standard solution to many problems in computer vision and robotics, but obtaining proper and sufficient training data is often a problem, as human labor is often error prone, time consuming and expensive. Solutions based on simulation have become more popular in recent years, but the gap between simulation and reality is still a major issue. In this paper, we introduce a novel model for augmenting synthetic image data through unsupervised image-to-image… Expand

Figures and Tables from this paper

Soccer Field Boundary Detection Using Convolutional Neural Networks
Detecting the field boundary is often one of the first steps in the vision pipeline of soccer robots. Conventional methods make use of a (possibly adaptive) green classifier, selection of boundaryExpand


Large-Scale Stochastic Scene Generation and Semantic Annotation for Deep Convolutional Neural Network Training in the RoboCup SPL
A framework for stochastic scene generation, rendering and automatic creation of semantically annotated ground truth masks is proposed for RoboCup standard platform league and demonstrated compelling classification accuracy on real-world data in a multi-class setting. Expand
Augmented Reality meets Deep Learning
This work proposes an alternative paradigm which combines real and synthetic data for learning semantic instance segmentation models, and demonstrates the utility of the proposed approach for training a state-of-the-art high-capacity deep model for semantic instances segmentation. Expand
The SYNTHIA Dataset: A Large Collection of Synthetic Images for Semantic Segmentation of Urban Scenes
This paper generates a synthetic collection of diverse urban images, named SYNTHIA, with automatically generated class annotations, and conducts experiments with DCNNs that show how the inclusion of SYnTHIA in the training stage significantly improves performance on the semantic segmentation task. Expand
Training Deep Networks with Synthetic Data: Bridging the Reality Gap by Domain Randomization
This work presents a system for training deep neural networks for object detection using synthetic images that relies upon the technique of domain randomization, in which the parameters of the simulator are randomized in non-realistic ways to force the neural network to learn the essential features of the object of interest. Expand
Real-Time Scene Understanding Using Deep Neural Networks for RoboCup SPL
An end-to-end neural network solution to scene understanding for robot soccer and RoboDNN, a C++ neural network library designed for fast inference on the Nao robots are presented. Expand
Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping
This work study how randomized simulated environments and domain adaptation methods can be extended to train a grasping system to grasp novel objects from raw monocular RGB images, including a novel extension of pixel-level domain adaptation that is term the GraspGAN. Expand
Deep Learning for Semantic Segmentation on Minimal Hardware
This work proposes an approach conceptually different from those taken previously based on semantic segmentation that is being able to process full VGA images in real-time on a low-power mobile processor and is applicable on a minimal mobile hardware. Expand
Playing for Data: Ground Truth from Computer Games
It is shown that associations between image patches can be reconstructed from the communication between the game and the graphics hardware, which enables rapid propagation of semantic labels within and across images synthesized by the game, with no access to the source code or the content. Expand
Learning to Drive from Simulation without Real World Labels
This work presents a method for transferring a vision-based lane following driving policy from simulation to operation on a rural road without any real-world labels and assesses the driving performance using both open-loop regression metrics, and closed-loop performance operating an autonomous vehicle on rural and urban roads. Expand
Full-Resolution Residual Networks for Semantic Segmentation in Street Scenes
This work proposes a novel ResNet-like architecture that exhibits strong localization and recognition performance, and combines multi-scale context with pixel-level accuracy by using two processing streams within the network. Expand