• Publications
  • Influence
Maximum Entropy Inverse Reinforcement Learning
A probabilistic approach based on the principle of maximum entropy that provides a well-defined, globally normalized distribution over decision sequences, while providing the same performance guarantees as existing methods is developed. Expand
A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
This paper proposes a new iterative algorithm, which trains a stationary deterministic policy, that can be seen as a no regret algorithm in an online learning setting and demonstrates that this new approach outperforms previous approaches on two challenging imitation learning problems and a benchmark sequence labeling problem. Expand
CHOMP: Gradient optimization techniques for efficient motion planning
This paper presents CHOMP, a novel method for continuous path refinement that uses covariant gradient techniques to improve the quality of sampled trajectories and relax the collision-free feasibility prerequisite on input paths required by those strategies. Expand
Maximum margin planning
This work learns mappings from features to cost so an optimal policy in an MDP with these cost mimics the expert's behavior, and demonstrates a simple, provably efficient approach to structured maximum margin learning, based on the subgradient method, that leverages existing fast algorithms for inference. Expand
Reinforcement learning in robotics: A survey
This article attempts to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots by highlighting both key challenges in robot reinforcement learning as well as notable successes. Expand
CHOMP: Covariant Hamiltonian optimization for motion planning
In this paper, we present CHOMP (covariant Hamiltonian optimization for motion planning), a method for trajectory optimization invariant to reparametrization. CHOMP uses functional gradientExpand
Autonomous Driving in Urban Environments: Boss and the Urban Challenge
This dissertation aims to provide a history of web exceptionalism from 1989 to 2002, a period chosen in order to explore its roots as well as specific cases up to and including the year in which descriptions of “Web 2.0” began to circulate. Expand
Modeling purposeful adaptive behavior with the principle of maximum causal entropy
The principle of maximum causal entropy is introduced, a general technique for applying information theory to decision-theoretic, game-the theoretical, and control settings where relevant information is sequentially revealed over time. Expand
Activity Forecasting
The unified model uses state-of-the-art semantic scene understanding combined with ideas from optimal control theory to achieve accurate activity forecasting and shows how the same techniques can improve the results of tracking algorithms by leveraging information about likely goals and trajectories. Expand
Efficient Reductions for Imitation Learning
This work proposes two alternative algorithms for imitation learning where training occurs over several episodes of interaction and shows that this leads to stronger performance guarantees and improved performance on two challenging problems: training a learner to play a 3D racing game and Mario Bros. Expand