Publications
Backplay: "Man muss immer umkehren"
TLDR
The approach, Backplay, uses a single demonstration to construct a curriculum for a given task; the work analytically characterizes the types of environments in which Backplay improves training speed and compares favorably to other competitive methods known to improve sample efficiency.
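The core idea is a reverse curriculum built from the demonstration: early in training, episodes reset near the end of the demonstrated trajectory (close to the goal), and the reset point slides back toward the start as training progresses. Below is a minimal Python sketch of that sampling step; the linear window schedule and the names demo_states/progress are illustrative, not the paper's exact procedure.

import random

def backplay_start_state(demo_states, progress):
    """Sample an episode start state from a single demonstration.

    demo_states: states of the demo, ordered from initial state to goal.
    progress:    float in [0, 1]; 0 = start of training, 1 = end.

    Early in training we reset close to the goal; as training progresses
    the admissible window slides back toward the demonstration's start,
    forming a reverse curriculum ("one must always invert").
    """
    horizon = len(demo_states)
    earliest = int(round((1.0 - progress) * (horizon - 1)))
    return random.choice(demo_states[earliest:])

# Toy usage with integer "states"; state 99 stands in for the goal.
demo = list(range(100))
for progress in (0.0, 0.5, 1.0):
    print(progress, backplay_start_state(demo, progress))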
Multi-Agent Reinforcement Learning: A Report on Challenges and Approaches
TLDR
The report concludes with recent advances in the Decentralized Actor, Centralized Critic paradigm for training multi-agent systems, which builds on an extension of MDPs called Decentralized Partially Observable MDPs (Dec-POMDPs) and has seen renewed interest lately.
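As a rough illustration of that paradigm, the PyTorch sketch below (with made-up dimensions) pairs per-agent actors that see only their local observations with a single critic that sees the joint observation at training time. Concrete instantiations such as MADDPG or COMA also feed joint actions to the critic and differ in many details; this is only a schematic.

import torch
import torch.nn as nn

class Actor(nn.Module):
    """Decentralized actor: conditions only on its own local observation."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                                 nn.Linear(64, act_dim))

    def forward(self, obs):
        return torch.softmax(self.net(obs), dim=-1)

class CentralizedCritic(nn.Module):
    """Centralized critic: conditions on the joint observation of all agents,
    which is available during training but not at execution time."""
    def __init__(self, joint_obs_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(joint_obs_dim, 64), nn.Tanh(),
                                 nn.Linear(64, 1))

    def forward(self, joint_obs):
        return self.net(joint_obs)

# Hypothetical shapes: 3 agents, 8-dim local observations, 4 actions each.
n_agents, obs_dim, act_dim = 3, 8, 4
actors = [Actor(obs_dim, act_dim) for _ in range(n_agents)]
critic = CentralizedCritic(n_agents * obs_dim)

local_obs = torch.randn(n_agents, obs_dim)
policies = [actor(o) for actor, o in zip(actors, local_obs)]  # decentralized
value = critic(local_obs.reshape(1, -1))                      # centralized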
Variational Auto-Regressive Gaussian Processes for Continual Learning
TLDR
Experiments on standard continual learning benchmarks demonstrate the ability of VAR-GPs to perform well on new tasks without compromising performance on old ones, yielding results competitive with state-of-the-art methods.
First-Order Preconditioning via Hypergradient Descent
TLDR
This work introduces first-order preconditioning (FOP), a fast, scalable approach that generalizes previous work on hypergradient descent and improves the performance of standard deep learning optimizers on visual classification and reinforcement learning tasks with minimal computational overhead.
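To make "hypergradient descent" concrete, here is a small NumPy sketch that learns a diagonal preconditioner online: since theta_t = theta_{t-1} - diag(p) * g_{t-1}, the gradient of the loss at theta_t with respect to p_i is -g_t[i] * g_{t-1}[i], so p is nudged by the elementwise product of consecutive gradients. FOP in the paper learns more general preconditioning transforms; the toy quadratic, step sizes, and variable names here are illustrative only.

import numpy as np

def quad_grad(theta, A, b):
    """Gradient of the quadratic loss 0.5 * theta^T A theta - b^T theta."""
    return A @ theta - b

# Badly conditioned toy problem (made up for illustration).
rng = np.random.default_rng(0)
A = np.diag([1.0, 100.0])
b = np.array([1.0, 1.0])

theta = rng.normal(size=2)
p = np.full(2, 1e-2)        # diagonal preconditioner, learned online
beta = 1e-4                 # hypergradient step size
prev_grad = np.zeros(2)

for t in range(500):
    g = quad_grad(theta, A, b)
    # Hypergradient of the loss w.r.t. the preconditioner entries:
    # dL/dp_i = -g_t[i] * g_{t-1}[i]; descending on p adds g * prev_grad.
    p = p + beta * g * prev_grad
    theta = theta - p * g   # preconditioned gradient step
    prev_grad = g

print(theta)                # approaches the minimizer A^{-1} b = [1, 0.01]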
When are Iterative Gaussian Processes Reliably Accurate?
TLDR
This work investigates conjugate gradient (CG) tolerance, preconditioner rank, and Lanczos decompositions, and shows that L-BFGS-B is a compelling optimizer for iterative GPs, achieving convergence with fewer gradient updates.
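As a toy illustration of handing a GP's marginal likelihood to L-BFGS-B, the sketch below fits RBF-kernel hyperparameters on synthetic 1-D data with SciPy's optimizer (finite-difference gradients). Note that this small example uses an exact Cholesky solve, whereas the iterative GPs studied in the paper estimate the solve and log-determinant with CG and Lanczos; the data, initial values, and bounds are made up.

import numpy as np
from scipy.optimize import minimize

def nll(log_params, X, y):
    """Negative log marginal likelihood (up to an additive constant) of a GP
    with an RBF kernel; log_params = log [lengthscale, signal var, noise var]."""
    ell, sf2, sn2 = np.exp(log_params)
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = sf2 * np.exp(-0.5 * d2 / ell**2) + sn2 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L)))

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=50)

res = minimize(nll, x0=np.log([1.0, 1.0, 0.1]), args=(X, y),
               method="L-BFGS-B", bounds=[(-6, 6)] * 3)
print(np.exp(res.x))  # learned lengthscale, signal variance, noise variance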
SKIing on Simplices: Kernel Interpolation on the Permutohedral Lattice for Scalable Gaussian Processes
TLDR
This work develops a connection between SKI and the permutohedral lattice used for high-dimensional fast bilateral filtering, and provides a CUDA implementation of Simplex-GP, which enables significant GPU acceleration of matrix-vector multiplication (MVM) based inference.
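To show what MVM-based inference means for SKI-style methods, here is a NumPy sketch of the approximation K ~ W K_grid W^T on a 1-D grid: W is a sparse interpolation matrix, so a product K v never forms the dense n x n kernel. Real SKI additionally exploits Toeplitz/Kronecker structure in K_grid, and Simplex-GP replaces the rectangular grid with the permutohedral lattice so each point interpolates from only d + 1 lattice neighbors; the data, grid size, and helper names here are made up.

import numpy as np
from scipy.sparse import csr_matrix

def interp_weights(x, grid):
    """Sparse linear-interpolation matrix W such that x-values ~ W @ grid (1-D)."""
    idx = np.clip(np.searchsorted(grid, x) - 1, 0, len(grid) - 2)
    frac = (x - grid[idx]) / (grid[idx + 1] - grid[idx])
    rows = np.repeat(np.arange(len(x)), 2)
    cols = np.stack([idx, idx + 1], axis=1).ravel()
    vals = np.stack([1 - frac, frac], axis=1).ravel()
    return csr_matrix((vals, (rows, cols)), shape=(len(x), len(grid)))

def rbf(a, b, ell=0.5):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 2000))
grid = np.linspace(0, 1, 100)

W = interp_weights(x, grid)
K_grid = rbf(grid, grid)              # m x m (structured in real SKI)
v = rng.normal(size=len(x))

mvm_ski = W @ (K_grid @ (W.T @ v))    # approximate K v, never forms K
mvm_exact = rbf(x, x) @ v             # exact K v, for comparison only
print(np.max(np.abs(mvm_ski - mvm_exact)))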
A Simple and Fast Baseline for Tuning Large XGBoost Models
TLDR
It is shown that uniform subsampling makes for a simple yet fast baseline to speed up the tuning of large XGBoost models using multi-fidelity hyperparameter optimization with data subsets as the fidelity dimension.
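A minimal sketch of that baseline in Python, with made-up data and hyperparameter ranges: candidate configurations are first evaluated on small uniform subsamples of the training set (the cheap fidelity), and only the better half is promoted to larger subsets, successive-halving style. This illustrates the idea rather than reproducing the paper's exact tuning setup.

import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=20000, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# Random hyperparameter configurations to be tuned.
configs = [{"max_depth": int(rng.integers(2, 10)),
            "learning_rate": float(10 ** rng.uniform(-2, 0)),
            "n_estimators": 100} for _ in range(8)]

for fraction in (0.1, 0.3, 1.0):                          # fidelity = data fraction
    n = int(fraction * len(X_tr))
    idx = rng.choice(len(X_tr), size=n, replace=False)    # uniform subsample
    scores = []
    for cfg in configs:
        model = xgb.XGBClassifier(**cfg)
        model.fit(X_tr[idx], y_tr[idx])
        scores.append(accuracy_score(y_val, model.predict(X_val)))
    # Promote the better half of the configurations to the next fidelity.
    keep = np.argsort(scores)[::-1][: max(1, len(configs) // 2)]
    configs = [configs[i] for i in keep]

print("best config:", configs[0])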