## One Citation

On the Linear Convergence of Natural Policy Gradient Algorithm

- Computer Science
- 2021 60th IEEE Conference on Decision and Control (CDC)
- 2021

Improved finite-time convergence bounds are presented, and it is shown that the Natural Policy Gradient, which forms the basis of several popular RL algorithms, has a geometric (also known as linear) asymptotic convergence rate.

## References

Essentials of Stochastic Processes

- Mathematics
- 2006

Basic concepts.- Additive processes.- Stationary processes.- Markov processes.- Diffusion.- Postscript.

Theory of point estimation

- Mathematics
- 1950

This paper presents a meta-analysis of large-sample theory and its applications in the context of discrete-time reinforcement learning, which aims to clarify the role of reinforcement learning in the reinforcement-gauging process.

Efficient and Adaptive Estimation for Semiparametric Models

- Mathematics
- 1993

Introduction.- Asymptotic Inference for (Finite-Dimensional) Parametric Models.- Information Bounds for Euclidean Parameters in Infinite-Dimensional Models.- Euclidean Parameters: Further Examples.-…

Computational Implications of Reducing Data to Sufficient Statistics

- Computer Science
- ArXiv
- 2014

It is shown that reducing data to sufficient statistics can change a computationally tractable estimation problem into an intractable one.

Structure adaptive approach for dimension reduction

- Mathematics
- 2001

We propose a new method of effective dimension reduction for a multiindex model which is based on iterative improvement of the family of average derivative estimates. The procedure is computationally…

Semiparametric and Nonparametric Methods in Econometrics

- Mathematics, Economics
- 2007

Single-Index Models.- Nonparametric Additive Models and Semiparametric Partially Linear Models.- Binary-Response Models.- Statistical Inverse Problems.- Transformation Models.

Mathematical Foundations of Infinite-Dimensional Statistical Models

- Mathematics, Computer Science
- 2015

This chapter discusses nonparametric statistical models, function spaces and approximation theory, and the minimax paradigm, which aims to provide a model for adaptive inference of likelihood-based procedures.

Acceleration of stochastic approximation by averaging

- Computer Science
- 1992

Convergence with probability one is proved for a variety of classical optimization and identification problems and it is demonstrated for these problems that the proposed algorithm achieves the highest possible rate of convergence.

High-Dimensional Probability: An Introduction with Applications in Data Science

- Mathematics
- 2020

© 2018, Cambridge University Press. Let us summarize our findings. A random projection of a set T in R^n onto an m-dimensional subspace approximately preserves the geometry of T if m ≳ d(T). For...

Optimal detection of sparse principal components in high dimension

- Computer Science
- 2012

The minimax optimal test is based on a sparse eigenvalue statistic, and a computationally efficient alternative test using convex relaxations is described, which is proved to detect sparse principal components at near optimal detection levels and performs well on simulated datasets.