# The Statistical Complexity of Interactive Decision Making

@article{Foster2021TheSC, title={The Statistical Complexity of Interactive Decision Making}, author={Dylan J. Foster and Sham M. Kakade and Jian Qian and Alexander Rakhlin}, journal={ArXiv}, year={2021}, volume={abs/2112.13487} }

A fundamental challenge in interactive learning and decision making, ranging from bandit problems to reinforcement learning, is to provide sample-efficient, adaptive learning algorithms that achieve near-optimal regret. This question is analogous to the classical problem of optimal (supervised) statistical learning, where there are well-known complexity measures (e.g., VC dimension and Rademacher complexity) that govern the statistical complexity of learning. However, characterizing the…

## 11 Citations

Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes

- Computer Science
- ArXiv
- 2022

This approach is the first computationally efficient algorithm to achieve optimal d dependence in linear MDPs, even in the single-reward PAC setting, and it is shown that this exploration procedure can also be applied to solve the problem of obtaining “well-conditioned” covariates in linear MDPs.

Computational-Statistical Gaps in Reinforcement Learning

- Computer Science, Mathematics
- ArXiv
- 2022

This work presents the first computational lower bound for RL with linear function approximation: unless NP=RP, no randomized polynomial time algorithm exists for deterministic transition MDPs with a constant number of actions and linear optimal value functions.

Provable Reinforcement Learning with a Short-Term Memory

- Computer Science
- ArXiv
- 2022

This paper proposes to study a new subclass of POMDPs, whose latent states can be decoded by the most recent history of a short length m, and establishes a set of upper and lower bounds on the sample complexity for learning near-optimal policies for this class of problems in both tabular and rich-observation settings.

Non-Linear Reinforcement Learning in Large Action Spaces: Structural Conditions and Sample-efficiency of Posterior Sampling

- Computer Science, Mathematics
- ArXiv
- 2022

This work presents the first result for non-linear function approximation which holds for general action spaces under a linear embeddability condition, which generalizes all linear and finite action settings.

Deep Reinforcement Learning: Opportunities and Challenges

- Computer Science
- ArXiv
- 2022

This article gives a brief introduction to reinforcement learning (RL) and its relationship with deep learning, machine learning, and AI, and attempts to answer two questions: “Why has RL not been widely adopted in practice yet?” and “When is RL helpful?”

Minimax Regret for Partial Monitoring: Infinite Outcomes and Rustichini's Regret

- Computer Science
- ArXiv
- 2022

We show that a version of the generalised information ratio of Lattimore and György (2020) determines the asymptotic minimax regret for all finite-action partial monitoring games provided that (a)…

Exploiting the Curvature of Feasible Sets for Faster Projection-Free Online Learning

- Computer Science
- 2022

In this paper, we develop new efficient projection-free algorithms for Online Convex Optimization (OCO). Online Gradient Descent (OGD) is an example of a classical OCO algorithm that guarantees the…

Pessimism for Offline Linear Contextual Bandits using $\ell_p$ Confidence Sets

- Computer Science
- 2022

We present a family $\{\hat{\pi}_p\}_{p \geq 1}$ of pessimistic learning rules for offline learning of linear contextual bandits, relying on confidence sets with respect to different $\ell_p$ norms, where…

Smoothed Online Learning is as Easy as Statistical Learning

- Computer Science
- ArXiv
- 2022

A lower bound on the oracle complexity of any proper learning algorithm is proved, which matches the oracle-efficient upper bounds up to a polynomial factor, thus demonstrating the existence of a statistical-computational gap in smoothed online learning.

The Complexity of Markov Equilibrium in Stochastic Games

- Computer Science
- ArXiv
- 2022

We show that computing approximate stationary Markov coarse correlated equilibria (CCE) in general-sum stochastic games is computationally intractable, even when there are two players, the game is…