A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning
TLDR
In this paper, we propose a new iterative algorithm, which trains a stationary deterministic policy, that can be seen as a no regret algorithm in an online learning setting.
  • 1,445
  • 274
Relational learning via collective matrix factorization
TLDR
We propose a collective matrix factorization model: we simultaneously factor several matrices, sharing parameters among factors when an entity participates in multiple relations.
  • 919
  • 141
Stable Function Approximation in Dynamic Programming
TLDR
We provide a proof of convergence for a wide class of temporal difference methods involving function approximators such as k-nearest-neighbor, and show experimentally that these methods can be useful.
  • 570
  • 59
ARA*: Anytime A* with Provable Bounds on Sub-Optimality
TLDR
We propose an anytime heuristic search, ARA*, which tunes its performance bound based on available search time.
  • 631
  • 58
Anytime Point-Based Approximations for Large POMDPs
TLDR
The Partially Observable Markov Decision Process has long been recognized as a rich framework for real-world planning and control problems.
  • 349
  • 38
Automatic Database Management System Tuning Through Large-scale Machine Learning
TLDR
We use a combination of supervised and unsupervised machine learning methods to (1) select the most impactful knobs, (2) map unseen database workloads to previous workloads from which we can transfer experience, and (3) recommend knob settings.
  • 228
  • 29
Anytime Dynamic A*: An Anytime, Replanning Algorithm
TLDR
We present a graph-based planning and replanning algorithm able to produce bounded suboptimal solutions in an anytime fashion.
  • 553
  • 27
Decentralized estimation and control of graph connectivity in mobile sensor networks
TLDR
The ability of a robot team to reconfigure itself is useful in many applications: for metamorphic robots to change shape, for swarm motion towards a goal, for biological systems to avoid predators, or for mobile buoys to clean up oil spills.
  • 184
  • 25
Adversarial Multiple Source Domain Adaptation
TLDR
We propose multisource domain adversarial networks (MDAN) that approach domain adaptation by optimizing task-adaptive generalization bounds.
  • 141
  • 25
Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees
TLDR
We introduce a new algorithm, Bounded RTDP, which produces partial policies with strong performance guarantees while only touching a fraction of the state space, even on problems where other algorithms would have to visit the full state space.
  • 150
  • 24