PRECOG: PREdiction Conditioned on Goals in Visual Multi-Agent Settings
- Nicholas Rhinehart, Rowan McAllister, Kris Kitani, S. Levine
- Computer Science, IEEE/CVF International Conference on Computer…
- 3 May 2019
A probabilistic forecasting model of future interactions between a variable number of agents that performs both standard forecasting and the novel task of conditional forecasting, which reasons about how all agents will likely respond to the goal of a controlled agent.
R2P2: A ReparameteRized Pushforward Policy for Diverse, Precise Generative Path Forecasting
A method that forecasts a vehicle’s ego-motion as a distribution over spatiotemporal paths, conditioned on features embedded in an overhead map, and obtains expressions for the cross-entropy metrics that can be efficiently evaluated and differentiated, enabling stochastic-gradient optimization.
First-Person Activity Forecasting with Online Inverse Reinforcement Learning
- Nicholas Rhinehart, Kris M. Kitani
- Computer Science, IEEE International Conference on Computer Vision…
- 22 December 2016
DARKO learns and forecasts from first-person visual observations of the user’s daily behaviors via an Online Inverse Reinforcement Learning (IRL) approach; the paper shows that DARKO forecasts goals better than competing methods in both noisy and ideal settings.
N2N Learning: Network to Network Compression via Policy Gradient Reinforcement Learning
This paper introduces a principled method for learning reduced network architectures in a data-driven way using reinforcement learning; the method can achieve compression rates of more than 10x for models such as ResNet-34 while maintaining similar performance to the input 'teacher' network.
Directed-Info GAIL: Learning Hierarchical Policies from Unsegmented Demonstrations using Directed Information
This work discovers the interaction between sub-tasks from their resulting state-action trajectory sequences using a directed graphical model and proposes a new algorithm based on the generative adversarial imitation learning framework which automatically learns sub-task policies from unsegmented demonstrations.
Inverting the Pose Forecasting Pipeline with SPF2: Sequential Pointcloud Forecasting for Sequential Pose Forecasting
- Xinshuo Weng, Jianren Wang, S. Levine, Kris Kitani, Nicholas Rhinehart
- Computer Science, CoRL
- 9 November 2020
This work proposes to first forecast 3D sensor data and then detect/track objects on the predicted point cloud sequences to obtain future poses, i.e., a forecast-then-detect pipeline, and shows that SPFNet is effective for the SPF task, and that pose forecasting performance improves with the addition of unlabeled data.
Deep Imitative Models for Flexible Inference, Planning, and Control
This paper proposes Imitative Models, probabilistic predictive models of desirable behavior able to plan interpretable expert-like trajectories to achieve specified goals, and derives families of flexible goal objectives that can be used to successfully direct behavior.
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning
- Avi Singh, Huihan Liu, G. Zhou, Albert Yu, Nicholas Rhinehart, S. Levine
- Computer Science, ICLR
- 19 November 2020
This paper proposes a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials from a wide range of previously seen tasks, and shows how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
Conservative Safety Critics for Exploration
- Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, S. Levine, F. Shkurti, Animesh Garg
- Computer Science, ICLR
- 27 October 2020
This paper theoretically characterizes the tradeoff between safety and policy improvement, shows that the safety constraints are satisfied with high probability during training, derives provable convergence guarantees for the approach, and demonstrates its efficacy on a suite of challenging navigation, manipulation, and locomotion tasks.
Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?
- Angelos Filos, P. Tigas, Rowan McAllister, Nicholas Rhinehart, S. Levine, Y. Gal
- Computer Science, ICML
- 26 June 2020
An epistemic uncertainty-aware planning method, called robust imitative planning (RIP), that can detect and recover from some distribution shifts, reducing overconfident and catastrophic extrapolations in out-of-distribution (OOD) scenes.