Provably Efficient Reinforcement Learning with Linear Function Approximation
This paper proves that an optimistic modification of Least-Squares Value Iteration (LSVI) achieves Õ(√(d³H³T)) regret, where d is the ambient dimension of the feature space, H is the length of each episode, and T is the total number of steps; the bound is independent of the number of states and actions.
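The optimistic LSVI update described above can be sketched as a ridge regression on linear features plus an elliptical exploration bonus. This is a minimal illustrative sketch, not the paper's full episodic algorithm; the function name, toy data, and parameter defaults are assumptions.

```python
import numpy as np

def lsvi_ucb_step(phi, rewards, next_values, beta, lam=1.0):
    """One ridge-regression update of optimistic LSVI (a sketch).

    phi:         (n, d) feature matrix of visited (state, action) pairs
    rewards:     (n,) observed rewards
    next_values: (n,) estimated values of successor states
    beta:        exploration-bonus coefficient (scales the UCB term)
    """
    n, d = phi.shape
    Lambda = lam * np.eye(d) + phi.T @ phi          # regularized Gram matrix
    w = np.linalg.solve(Lambda, phi.T @ (rewards + next_values))

    def q_optimistic(feature):
        # Linear Q-value plus an elliptical confidence bonus.
        bonus = beta * np.sqrt(feature @ np.linalg.solve(Lambda, feature))
        return feature @ w + bonus

    return q_optimistic

# toy usage: d = 2 features, n = 3 observed transitions
phi = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
q = lsvi_ucb_step(phi, rewards=np.array([1.0, 0.0, 0.5]),
                  next_values=np.zeros(3), beta=0.1)
val = q(np.array([1.0, 0.0]))
```

The bonus term is what makes the value estimate optimistic: it is largest in directions of feature space the data have covered least.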
A Strictly Contractive Peaceman-Rachford Splitting Method for Convex Programming
This paper applies the Peaceman-Rachford splitting method (PRSM) to a convex minimization model with linear constraints and a separable objective function, attaches an underdetermined relaxation factor to the PRSM dual updates to guarantee strict contraction of its iterative sequence, and thereby proposes a strictly contractive PRSM.
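The strictly contractive PRSM iteration can be sketched on a toy two-block problem where both subproblems have closed forms. The function name, toy objective, and parameter defaults below are illustrative assumptions; the relaxation factor alpha in (0, 1) is applied to both dual updates, as in the strictly contractive variant.

```python
import numpy as np

def sc_prsm(p, q, alpha=0.9, beta=1.0, iters=200):
    """Strictly contractive PRSM on a toy problem (a sketch):
        min 0.5*(x - p)^2 + 0.5*(z - q)^2   s.t.  x - z = 0
    alpha in (0, 1) is the relaxation factor on the dual updates
    that makes the iterative sequence strictly contractive.
    """
    x = z = lam = 0.0
    for _ in range(iters):
        # x-subproblem: argmin 0.5*(x-p)^2 - lam*(x-z) + 0.5*beta*(x-z)^2
        x = (p + lam + beta * z) / (1.0 + beta)
        lam -= alpha * beta * (x - z)          # first (relaxed) dual update
        # z-subproblem: argmin 0.5*(z-q)^2 - lam*(x-z) + 0.5*beta*(x-z)^2
        z = (q - lam + beta * x) / (1.0 + beta)
        lam -= alpha * beta * (x - z)          # second (relaxed) dual update
    return x, z

x, z = sc_prsm(p=1.0, q=3.0)   # optimum of the toy problem: x = z = 2.0
```

Unlike ADMM, PRSM updates the multiplier after each of the two block updates; taking alpha strictly below 1 is what restores strict contraction.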
A Theoretical Analysis of Deep Q-Learning
This work makes the first attempt to theoretically understand the deep Q-network (DQN) algorithm from both algorithmic and statistical perspectives, and proposes the Minimax-DQN algorithm for two-player zero-sum Markov games.
These results show that the final estimator attains an oracle statistical property due to the use of a nonconvex penalty, and improve upon existing results by providing a more refined sample complexity bound as well as an exact support recovery result for the final estimator.
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
This analysis establishes the first global optimality and convergence guarantees for neural policy gradient methods by relating the suboptimality of the stationary points to the representation power of the neural actor and critic classes, and proving the global optimality of all stationary points under mild regularity conditions.
Multi-Agent Reinforcement Learning via Double Averaging Primal-Dual Optimization
This paper proposes a double averaging scheme, where each agent iteratively performs averaging over both space and time to incorporate neighboring gradient information and local reward information, respectively, and proves that the proposed algorithm converges to the optimal solution at a global geometric rate.
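In the spirit of the averaging-over-space-and-time idea above, here is a minimal decentralized sketch. It uses standard gradient tracking on illustrative quadratic local objectives, not the authors' exact primal-dual algorithm; all names and the toy network are assumptions.

```python
import numpy as np

def gradient_tracking(C, W, lr=0.2, steps=500):
    """A decentralized sketch in the spirit of double averaging
    (standard gradient tracking, not the paper's exact algorithm):
    each agent mixes its iterate with its neighbors' (space) and
    maintains a running estimate of the network-wide gradient (time).

    Local objectives (an assumption for illustration):
        f_i(theta) = 0.5 * ||theta - C[i]||^2,
    so the global optimum is the mean of the rows of C.
    W is a doubly stochastic mixing matrix.
    """
    n, d = C.shape
    theta = np.zeros((n, d))
    g_old = theta - C                 # local gradients
    s = g_old.copy()                  # tracked global-gradient estimate
    for _ in range(steps):
        theta = W @ theta - lr * s    # mix with neighbors, then step
        g_new = theta - C
        s = W @ s + g_new - g_old     # update the gradient tracker
        g_old = g_new
    return theta

# three fully connected agents; optimum is the mean of the c_i
W = np.array([[0.5, 0.25, 0.25],
              [0.25, 0.5, 0.25],
              [0.25, 0.25, 0.5]])
C = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
theta = gradient_tracking(C, W)
```

The tracker s preserves the network-average gradient at every step, which is what allows geometric convergence to the exact consensus optimum rather than to a neighborhood of it.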
On Tighter Generalization Bound for Deep Neural Networks: CNNs, ResNets, and Beyond
A margin-based, data-dependent generalization error bound is established for a general family of deep neural networks in terms of the depth and width as well as the Jacobian of the networks, through a new characterization of the Lipschitz properties of the neural network family.
Sparse Generalized Eigenvalue Problem: Optimal Statistical Rates via Truncated Rayleigh Flow
Sparse generalized eigenvalue problem (GEP) plays a pivotal role in a large family of high-dimensional statistical models, including sparse Fisher's discriminant analysis and canonical correlation analysis.
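The truncated Rayleigh flow mentioned in the title can be sketched as gradient ascent on the generalized Rayleigh quotient followed by hard truncation to the k largest-magnitude entries. This is an illustrative sketch of the update's shape under assumed names and defaults, not the paper's exact procedure.

```python
import numpy as np

def rifle_step(x, A, B, k, eta=0.1):
    """One truncated Rayleigh-flow step (a sketch): ascend the
    generalized Rayleigh quotient x'Ax / x'Bx, then keep only the
    k largest-magnitude coordinates to enforce sparsity.
    """
    rho = (x @ A @ x) / (x @ B @ x)                 # current Rayleigh quotient
    x = x + (eta / rho) * (A @ x - rho * (B @ x))   # gradient ascent step
    keep = np.argsort(np.abs(x))[-k:]               # indices of top-k entries
    x_trunc = np.zeros_like(x)
    x_trunc[keep] = x[keep]
    return x_trunc / np.linalg.norm(x_trunc)

# toy GEP: leading sparse generalized eigenvector of (A, B) is e_1
A = np.diag([3.0, 1.0, 0.5, 0.1])
B = np.eye(4)
x = np.array([0.6, 0.5, 0.4, 0.3])
x = x / np.linalg.norm(x)
for _ in range(100):
    x = rifle_step(x, A, B, k=2)
```

The truncation keeps the iterate k-sparse throughout, while the flow step drives it toward the leading generalized eigenvector within that support.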
A Nonconvex Optimization Framework for Low Rank Matrix Estimation
It is proved that a broad class of nonconvex optimization algorithms, including alternating minimization and gradient-type methods, geometrically converge to the global optimum and exactly recover the true low rank matrices under standard conditions.
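One member of the nonconvex family named above, alternating minimization, can be sketched for the factorization model M ≈ U Vᵀ: each factor is updated by solving a least-squares subproblem that is convex given the other factor. The function name and toy data below are illustrative assumptions.

```python
import numpy as np

def alt_min(M, r, iters=50, seed=0):
    """Alternating minimization for low-rank estimation (a sketch):
    fit M ~ U @ V.T by alternately solving the two least-squares
    subproblems, each convex when the other factor is held fixed.
    """
    rng = np.random.default_rng(seed)
    n, m = M.shape
    U = rng.standard_normal((n, r))
    V = rng.standard_normal((m, r))
    for _ in range(iters):
        U = M @ V @ np.linalg.pinv(V.T @ V)    # argmin_U ||M - U V^T||_F
        V = M.T @ U @ np.linalg.pinv(U.T @ U)  # argmin_V ||M - U V^T||_F
    return U, V

# exact rank-2 toy matrix: the product U V^T recovers M exactly,
# though the individual factors are only identified up to an
# invertible linear transform
rng = np.random.default_rng(1)
M = rng.standard_normal((8, 2)) @ rng.standard_normal((2, 6))
U, V = alt_min(M, r=2)
```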
Symmetry, Saddle Points, and Global Geometry of Nonconvex Matrix Factorization
A general theory for studying the geometry of nonconvex objective functions with underlying symmetric structures is proposed and the locations of stationary points and the null space of the associated Hessian matrices are characterized via the lens of invariant groups.