S4RL: Surprisingly Simple Self-Supervision for Offline Reinforcement Learning
This work proposes a Surprisingly Simple Self-Supervision algorithm (S4RL), which utilizes data augmentations on states to learn value functions that are better at generalizing and extrapolating when deployed in the environment.
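The augmentation idea above can be sketched minimally: perturb each observed state with zero-mean Gaussian noise and average the value estimate over the perturbed copies. This is an illustrative sketch, not the authors' implementation; the noise scale `sigma` and the number of augmentations `n_aug` are assumed values.

```python
import numpy as np

def augment_states(states, sigma=0.003, n_aug=4, rng=None):
    """Return n_aug copies of `states` with zero-mean Gaussian noise added.

    Zero-mean noise is one of the simple state perturbations studied in
    S4RL; sigma and n_aug here are illustrative, not the paper's values.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    return [states + rng.normal(0.0, sigma, size=states.shape)
            for _ in range(n_aug)]

def averaged_q(q_fn, states, actions, sigma=0.003, n_aug=4):
    """Average a Q-estimate over augmented copies of the state,
    smoothing the learned value function locally around observed data."""
    return np.mean(
        [q_fn(s, actions) for s in augment_states(states, sigma, n_aug)],
        axis=0,
    )
```

In an offline RL loop, `averaged_q` would replace the raw Q-estimate when computing bootstrap targets, so the value function is regularized toward consistency in a small neighborhood of each dataset state.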
Learning latent actions to control assistive robots
This work achieves intuitive, user-friendly control of assistive robots by embedding the robot's high-dimensional actions into a low-dimensional, human-controllable latent action space, using a personalized alignment model between joystick inputs and latent actions.
Articulated Object Interaction in Unknown Scenes with Whole-Body Mobile Manipulation
This paper proposes a two-stage architecture for autonomous interaction with large articulated objects in unknown environments and demonstrates that the proposed approach achieves better performance than commonly used control methods in mobile manipulation.
Diversity inducing Information Bottleneck in Model Ensembles
Although deep learning models have achieved state-of-the-art performance on a number of vision tasks, generalization over high-dimensional multi-modal data and reliable predictive uncertainty …
Value Iteration in Continuous Actions, States and Time
The cFVI algorithm enables dynamic programming for continuous states and actions with a known dynamics model and is more robust to changes in the dynamics despite using only a deterministic model and without explicitly incorporating robustness in the optimization.
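cFVI extends dynamic programming to continuous states and actions; the underlying principle is the Bellman optimality backup, which can be sketched in its classic tabular form (the MDP below is a hypothetical example, not from the paper):

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Tabular value iteration.

    P: (A, S, S) array, P[a, s, s'] = transition probability.
    R: (A, S) array, R[a, s] = immediate reward for action a in state s.
    Returns the optimal state-value function V of shape (S,).
    """
    _, S, _ = P.shape
    V = np.zeros(S)
    while True:
        # Bellman optimality backup: for each action, reward plus
        # discounted expected value of the successor state; then max.
        Q = R + gamma * (P @ V)   # shape (A, S)
        V_new = Q.max(axis=0)
        if np.abs(V_new - V).max() < tol:
            return V_new
        V = V_new
```

The continuous-state, continuous-action setting of cFVI replaces the table with a function approximator and the max over a finite action set with an analytic maximization under the known dynamics model.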
Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning
This work considers a fundamental hurdle affecting both value-based and policy-gradient approaches: the exponential blow-up of the action space with the number of agents. It proposes a novel tensorised formulation of the Bellman equation, giving rise to Tesseract, which views the Q-function as a tensor whose modes correspond to the action spaces of the different agents.
GLiDE: Generalizable Quadrupedal Locomotion in Diverse Environments with a Centroidal Model
This work explores how RL can be effectively used with a centroidal model to generate robust control policies for quadrupedal locomotion, and shows the potential of the method by demonstrating stepping-stone locomotion, two-legged in-place balance, balance-beam locomotion, and sim-to-real transfer without further adaptations.
Transferring Dexterous Manipulation from GPU Simulation to a Remote Real-World TriFinger
We present a system for learning a challenging dexterous manipulation task: moving a cube to an arbitrary 6-DoF pose with only three fingers, trained with NVIDIA's IsaacGym simulator. We show …
LASER: Learning a Latent Action Space for Efficient Reinforcement Learning
LASER is trained as a variational encoder-decoder model to map raw actions into a disentangled latent action space while maintaining action reconstruction and latent space dynamic consistency.
Robust Value Iteration for Continuous Control Tasks
This paper uses dynamic programming to compute the optimal value function on the compact state domain, incorporating adversarial perturbations of the system dynamics, and shows that robust value iteration is more robust than a deep reinforcement learning algorithm and than the non-robust version of the algorithm.