Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
TLDR
We propose a new variant of PG methods for infinite-horizon problems that uses a random rollout horizon for the Monte-Carlo estimation of the policy gradient.
  • 45
  • 11
A Saddle Point Algorithm for Networked Online Convex Optimization
TLDR
An algorithm to learn optimal actions in convex distributed online problems is proposed to control the growth of global network regret.
  • 89
  • 8
On the Sample Complexity of Actor-Critic Method for Reinforcement Learning with Function Approximation
TLDR
Reinforcement learning, mathematically described by Markov Decision Problems, may be approached either through dynamic programming or policy search.
  • 15
  • 6
Proximity Without Consensus in Online Multiagent Optimization
TLDR
We consider stochastic optimization problems in multiagent settings, where a network of agents aims to learn parameters that are optimal in terms of a global convex objective, while giving preference to locally observed information.
  • 44
  • 4
D4L: Decentralized Dynamic Discriminative Dictionary Learning
TLDR
We consider discriminative dictionary learning in a distributed online setting, where a network of agents aims to learn, from sequential observations, statistical model parameters jointly with data-driven signal representations.
  • 36
  • 4
Proximity without consensus in online multi-agent optimization
TLDR
We consider stochastic optimization problems in multi-agent settings where a network of agents aims to learn decision variables which are optimal in terms of a global objective, while giving preference to locally and sequentially observed information.
  • 35
  • 4
Decentralized Prediction-Correction Methods for Networked Time-Varying Convex Optimization
TLDR
We develop algorithms that find and track the optimal solution trajectory of time-varying convex optimization problems that consist of local and network-related objectives.
  • 39
  • 3
Decentralized Online Learning With Kernels
TLDR
We consider multiagent stochastic optimization problems over reproducing kernel Hilbert spaces.
  • 26
  • 3
A Class of Prediction-Correction Methods for Time-Varying Convex Optimization
TLDR
We propose algorithms with a discrete time-sampling scheme to find and track the solution trajectory based on prediction and correction steps, while sampling the problem data at a constant rate of 1/h.
  • 69
  • 2
Parsimonious Online Learning with Kernels via sparse projections in function space
TLDR
We consider stochastic nonparametric regression problems in a reproducing kernel Hilbert space (RKHS), an extension of expected risk minimization to nonlinear function estimation.
  • 48
  • 2