Corpus ID: 245502950

The Statistical Complexity of Interactive Decision Making

Dylan J. Foster, Sham M. Kakade, Jian Qian, Alexander Rakhlin
A fundamental challenge in interactive learning and decision making, ranging from bandit problems to reinforcement learning, is to provide sample-efficient, adaptive learning algorithms that achieve near-optimal regret. This question is analogous to the classical problem of optimal (supervised) statistical learning, where there are well-known complexity measures (e.g., VC dimension and Rademacher complexity) that govern the statistical complexity of learning. However, characterizing the… 
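The abstract is cut off before naming the paper's central quantity. For context, the complexity measure it introduces, the Decision-Estimation Coefficient (DEC), is commonly stated as follows (a sketch in the paper's notation, with model class $\mathcal{M}$, reference model $\bar{M}$, decision space $\Pi$, and squared Hellinger distance $D_{\mathrm{H}}^2$; exact conventions vary across versions):

```latex
\operatorname{dec}_{\gamma}(\mathcal{M}, \bar{M})
  = \inf_{p \in \Delta(\Pi)} \sup_{M \in \mathcal{M}}
    \mathbb{E}_{\pi \sim p}\Big[
      f^{M}(\pi_{M}) - f^{M}(\pi)
      - \gamma \cdot D_{\mathrm{H}}^{2}\big(M(\pi), \bar{M}(\pi)\big)
    \Big]
```

Here $f^{M}(\pi_M) - f^{M}(\pi)$ is the suboptimality of decision $\pi$ under model $M$, and the $\gamma$-weighted Hellinger term rewards decisions that distinguish $M$ from the reference model $\bar{M}$; the paper shows this quantity governs the minimax regret of interactive decision making.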


Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes
This approach yields the first computationally efficient algorithm to achieve optimal $d$ dependence in linear MDPs, even in the single-reward PAC setting, and it is shown that the exploration procedure can also be applied to obtain “well-conditioned” covariates in linear MDPs.
Computational-Statistical Gaps in Reinforcement Learning
This work presents the first computational lower bound for RL with linear function approximation: unless NP=RP, no randomized polynomial time algorithm exists for deterministic transition MDPs with a constant number of actions and linear optimal value functions.
Provable Reinforcement Learning with a Short-Term Memory
This paper proposes to study a new subclass of POMDPs, whose latent states can be decoded by the most recent history of a short length m, and establishes a set of upper and lower bounds on the sample complexity for learning near-optimal policies for this class of problems in both tabular and rich-observation settings.
Non-Linear Reinforcement Learning in Large Action Spaces: Structural Conditions and Sample-efficiency of Posterior Sampling
This work presents the first result for non-linear function approximation which holds for general action spaces under a linear embeddability condition, which generalizes all linear and finite action settings.
Deep Reinforcement Learning: Opportunities and Challenges
This article gives a brief introduction to reinforcement learning (RL) and its relationship with deep learning, machine learning, and AI, and attempts to answer two questions: “Why has RL not been widely adopted in practice yet?” and “When is RL helpful?”
Minimax Regret for Partial Monitoring: Infinite Outcomes and Rustichini's Regret
We show that a version of the generalised information ratio of Lattimore and György (2020) determines the asymptotic minimax regret for all finite-action partial monitoring games provided that (a)…
Exploiting the Curvature of Feasible Sets for Faster Projection-Free Online Learning
In this paper, we develop new efficient projection-free algorithms for Online Convex Optimization (OCO). Online Gradient Descent (OGD) is an example of a classical OCO algorithm that guarantees the…
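The snippet above cuts off mid-sentence. As context for the OGD baseline it references, here is a minimal sketch of projected Online Gradient Descent; the function names and the $\ell_2$-ball feasible set are illustrative assumptions, not taken from the cited paper:

```python
import numpy as np

def project_l2_ball(w, radius=1.0):
    # Euclidean projection onto the l2 ball of the given radius:
    # rescale w only if it lies outside the ball.
    norm = np.linalg.norm(w)
    return w if norm <= radius else w * (radius / norm)

def ogd(grads, eta=0.1, dim=2, radius=1.0):
    # Online Gradient Descent: w_{t+1} = Proj(w_t - eta * g_t(w_t)).
    # `grads` is a sequence of callables returning the round-t loss gradient.
    w = np.zeros(dim)
    iterates = []
    for g in grads:
        iterates.append(w.copy())
        w = project_l2_ball(w - eta * g(w), radius)
    return iterates
```

With gradients pulling toward a point outside the ball (e.g. losses $f_t(w) = \tfrac12\|w - z\|^2$ with $\|z\| > 1$), the iterates settle on the ball's boundary. The projection step is what "projection-free" methods avoid, since for general feasible sets it can be far more expensive than the linear-optimization oracles those methods use instead.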
Pessimism for Offline Linear Contextual Bandits using $\ell_p$ Confidence Sets
We present a family $\{\hat{\pi}_p\}_{p \ge 1}$ of pessimistic learning rules for offline learning of linear contextual bandits, relying on confidence sets with respect to different $\ell_p$ norms, where…
Smoothed Online Learning is as Easy as Statistical Learning
A lower bound on the oracle complexity of any proper learning algorithm is proved, which matches the oracle-efficient upper bounds up to a polynomial factor, demonstrating the existence of a statistical-computational gap in smoothed online learning.
The Complexity of Markov Equilibrium in Stochastic Games
We show that computing approximate stationary Markov coarse correlated equilibria (CCE) in general-sum stochastic games is computationally intractable, even when there are two players, the game is…