Virtual-to-Real: Learning to Control in Visual Semantic Segmentation

@inproceedings{Hong2018VirtualtoRealLT,
  title={Virtual-to-Real: Learning to Control in Visual Semantic Segmentation},
  author={Zhang-Wei Hong and Yuming Chen and Shih-Yang Su and Tzu-Yun Shann and Yi-Hsiang Chang and Hsuan-Kung Yang and Brian Hsi-Lin Ho and Chih-Chieh Tu and Yueh-Chuan Chang and Tsu-Ching Hsiao and Hsin-Wei Hsiao and Sih-Pin Lai and Chun-Yi Lee},
  booktitle={IJCAI},
  year={2018}
}
Collecting training data from the physical world is usually time-consuming and even dangerous for fragile robots, and thus, recent advances in robot learning advocate the use of simulators as the training platform. Unfortunately, the reality gap between synthetic and real visual data prohibits direct migration of the models trained in virtual worlds to the real world. This paper proposes a modular architecture for tackling the virtual-to-real problem. The proposed architecture separates the…
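The modular idea described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the `perceive` stub stands in for a real segmentation network (such as ENet, listed in the references), and the linear policy, class count, and action set are invented for the example. The point it illustrates is that once both simulated and real frames are reduced to the same semantic label space, a control policy trained in simulation can be reused unchanged.

```python
import numpy as np

N_CLASSES = 4   # hypothetical label set, e.g. road, obstacle, target, background
N_ACTIONS = 3   # hypothetical action set, e.g. forward, turn-left, turn-right

def perceive(rgb_frame):
    """Stand-in perception module: map an HxWx3 RGB frame to an HxW map of
    class labels. A real system would run a segmentation network here;
    this stub just bins pixel intensity into N_CLASSES levels."""
    intensity = rgb_frame.mean(axis=-1)
    return (intensity // (256 // N_CLASSES)).astype(int).clip(0, N_CLASSES - 1)

def control_policy(seg_map, weights):
    """Stand-in control module: score each action from the class-frequency
    vector of the segmentation map and pick the highest-scoring one."""
    freqs = np.bincount(seg_map.ravel(), minlength=N_CLASSES) / seg_map.size
    return int(np.argmax(weights @ freqs))

rng = np.random.default_rng(0)
weights = rng.standard_normal((N_ACTIONS, N_CLASSES))  # "trained" in simulation

sim_frame = rng.integers(0, 256, size=(8, 8, 3))   # frame from the simulator
real_frame = rng.integers(0, 256, size=(8, 8, 3))  # frame from a real camera

# The same policy consumes segmentations from either domain unchanged.
a_sim = control_policy(perceive(sim_frame), weights)
a_real = control_policy(perceive(real_frame), weights)
```

Because the policy never sees raw pixels, only the segmentation interface has to be made domain-invariant, which is the core of the virtual-to-real decoupling.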
Citations

Visual Representations for Semantic Target Driven Navigation
This work proposes using semantic segmentation and detection masks, obtained by state-of-the-art computer vision algorithms, as observations, and uses a deep network to learn navigation policies on top of representations that capture spatial layout and semantic contextual cues.
Learning to Navigate from Simulation via Spatial and Semantic Information Synthesis
A visual information pyramid (VIP) model is proposed to systematically investigate a practical environment representation; results suggest that this representation behaves best in both simulated and real-world scenarios.
VIVID: Virtual Environment for Visual Deep Learning
A new Virtual Environment for Visual Deep Learning (VIVID) is presented, which offers large-scale, diversified indoor and outdoor scenes and leverages an advanced human skeleton system that enables the simulation of numerous complex human actions.
Semantic Tracklets: An Object-Centric Representation for Visual Multi-Agent Reinforcement Learning
This work explicitly constructs an object-centric intermediate representation, referred to as 'semantic tracklets', to characterize the states of an environment, and is the first to successfully learn a strategy for five players in the GFootball environment using only visual data.
Learning to Drive from Simulation without Real World Labels
This work presents a method for transferring a vision-based lane-following driving policy from simulation to operation on a rural road without any real-world labels, and assesses driving performance using both open-loop regression metrics and closed-loop performance operating an autonomous vehicle on rural and urban roads.
Grasping Unknown Objects by Coupling Deep Reinforcement Learning, Generative Adversarial Networks, and Visual Servoing
A novel approach for transferring a deep reinforcement learning (DRL) grasping agent from simulation to a real robot, without fine-tuning in the real world, using a CycleGAN to close the reality gap between the simulated and real environments.
CRAVES: Controlling Robotic Arm With a Vision-Based Economic System
This work designs a semi-supervised approach for low-cost arms that are equipped with no sensors, so that all decisions are made upon visual recognition (e.g., real-time 3D pose estimation), and applies it to real-world images after domain adaptation.
Reward-driven U-Net training for obstacle avoidance drone
This study proposes a new framework in which a supervised segmentation network is trained, in a reward-driven manner, with labels made by an actor-critic network; the U-Net-based network infers the next moving direction from a sequence of input images.
Zero-shot Sim-to-Real Transfer with Modular Priors
This work proposes a novel framework that effectively decouples RL for high-level decision making from low-level perception and control, and shows that this method can learn effective policies within mere minutes of highly simplified simulation.
Towards Accurate Task Accomplishment with Low-Cost Robotic Arms
This work designs a semi-supervised approach for low-cost arms that are equipped with no sensors, so that all decisions are made upon visual recognition (e.g., real-time 3D pose estimation), and applies an iterative algorithm for optimization.

References

Showing 1-10 of 55 references
Sim-to-Real Robot Learning from Pixels with Progressive Nets
This work proposes using progressive networks to bridge the reality gap and transfer learned policies from simulation to the real world, and presents an early demonstration of this approach with a number of experiments in the domain of robot manipulation that focus on bridging the reality gap.
Target-driven visual navigation in indoor scenes using deep reinforcement learning
This paper proposes an actor-critic model whose policy is a function of the goal as well as the current state, which allows better generalization, and proposes the AI2-THOR framework, which provides an environment with high-quality 3D scenes and a physics engine.
Sim-to-Real Transfer of Accurate Grasping with Eye-In-Hand Observations and Continuous Control
This work decomposes the end-to-end system into a vision module and a closed-loop controller module that is robust to discrepancies between the dynamic models of the simulated and real robots, and achieves a 90% success rate in grasping a tiny sphere with a real robot.
Domain randomization for transferring deep neural networks from simulation to the real world
This paper explores domain randomization, a simple technique for training models on simulated images that transfer to real images by randomizing rendering in the simulator, and achieves the first successful transfer of a deep neural network trained only on simulated RGB images to the real world for the purpose of robotic control.
(CAD)$^2$RL: Real Single-Image Flight without a Single Real Image
This paper proposes a learning method, called CAD$^2$RL, that can be used to perform collision-free indoor flight in the real world while being trained entirely on 3D CAD models, and shows that it can train a policy that generalizes to the real world without requiring the simulator to be particularly realistic or high-fidelity.
Deep visual foresight for planning robot motion
This work develops a method for combining deep action-conditioned video prediction models with model-predictive control that uses entirely unlabeled training data, enabling a real robot to perform nonprehensile manipulation (pushing objects) and to handle novel objects not seen during training.
Adversarial discriminative sim-to-real transfer of visuo-motor policies
An adversarial discriminative sim-to-real transfer approach is presented to reduce the amount of labeled real data required for visuo-motor policies transferred to real environments, achieving a 97.8% success rate and 1.8 cm control accuracy.
Cognitive Mapping and Planning for Visual Navigation
The Cognitive Mapper and Planner is based on a unified joint architecture for mapping and planning, such that the mapping is driven by the needs of the task, and on a spatial memory with the ability to plan given an incomplete set of observations about the world.
3D Simulation for Robot Arm Control with Deep Q-Learning
This work presents an approach that uses 3D simulations to train a 7-DOF robotic arm in a control task without any prior knowledge, and presents preliminary results on the direct transfer of policies to a real robot without any further training.
ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation
A novel deep neural network architecture named ENet (efficient neural network), created specifically for tasks requiring low-latency operation, which is up to 18 times faster, requires 75% fewer FLOPs, has 79% fewer parameters, and provides similar or better accuracy than existing models.