- Dimitri P. Bertsekas, John N. Tsitsiklis
- Optimization and neural computation series
- 1996

- Christos H. Papadimitriou, John N. Tsitsiklis
- Math. Oper. Res.
- 1987

We present serial and parallel algorithms for solving a system of equations that arises from the discretization of the Hamilton-Jacobi equation associated to a trajectory optimization problem of the following type. A vehicle starts at a prespecified point $x_0$ and follows a unit speed trajectory $x(t)$ inside a region in $\mathbb{R}^m$, until an unspecified time $T$ that…

We discuss an old distributed algorithm for reaching consensus that has received a fair amount of recent attention. In this algorithm, a number of agents exchange their values asynchronously and form weighted averages with (possibly outdated) values possessed by their neighbors. We overview existing convergence results, and establish some new ones, for the…
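The weighted-averaging idea in this abstract can be illustrated with a minimal sketch. It is a simplified synchronous variant, not the asynchronous algorithm with outdated values that the abstract describes, and the four-agent ring, the weight matrix, and the names `consensus_step` and `run_consensus` are all assumptions made for illustration.

```python
def consensus_step(values, weights):
    """One synchronous round: each agent replaces its value with a
    weighted average of its own value and its neighbors' values."""
    n = len(values)
    return [sum(weights[i][j] * values[j] for j in range(n)) for i in range(n)]


def run_consensus(values, weights, num_steps):
    """Repeat the averaging round num_steps times and return the final values."""
    for _ in range(num_steps):
        values = consensus_step(values, weights)
    return values


# Hypothetical ring of 4 agents: each agent puts weight 0.5 on itself and
# 0.25 on each of its two neighbors. Rows and columns both sum to 1, so the
# agents agree on the average of the initial values.
weights = [
    [0.50, 0.25, 0.00, 0.25],
    [0.25, 0.50, 0.25, 0.00],
    [0.00, 0.25, 0.50, 0.25],
    [0.25, 0.00, 0.25, 0.50],
]
initial = [1.0, 3.0, 5.0, 7.0]
final = run_consensus(initial, weights, 200)
```

With these assumed weights every agent's value approaches 4.0, the average of the initial values. A weight matrix whose columns do not sum to 1 would still drive the agents to agreement, but in general on a different weighted combination of the initial values.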

- Vijay R. Konda, John N. Tsitsiklis
- NIPS
- 1999

We propose and analyze a class of actor-critic algorithms for simulation-based optimization of a Markov decision process over a parameterized family of randomized stationary policies. These are two-time-scale algorithms in which the critic uses TD learning with a linear approximation architecture and the actor is updated in an approximate gradient direction…

We discuss the temporal-difference learning algorithm, as applied to approximating the cost-to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze updates parameters of a linear function approximator online during a single endless trajectory of an irreducible aperiodic Markov chain with a finite or infinite state space. We…
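As a rough illustration of the setting in this abstract, the sketch below runs TD(0) with a linear approximator along a single simulated trajectory. It is not the paper's algorithm or analysis: the two-state chain, the costs, the feature vectors, the constant step size, and the function name `td0_linear` are assumptions chosen to keep the example small.

```python
import random


def td0_linear(features, costs, transition, discount, step_size, num_steps, seed=0):
    """TD(0) with a linear approximator J(s) ~ theta . features[s],
    updated online along one simulated trajectory of the chain."""
    rng = random.Random(seed)
    num_states = len(transition)
    theta = [0.0] * len(features[0])
    state = 0
    for _ in range(num_steps):
        # Sample the next state from the current row of the transition matrix.
        next_state = rng.choices(range(num_states), weights=transition[state])[0]
        value = sum(t * f for t, f in zip(theta, features[state]))
        next_value = sum(t * f for t, f in zip(theta, features[next_state]))
        # Temporal difference: one-step cost plus discounted next estimate,
        # minus the current estimate.
        td_error = costs[state] + discount * next_value - value
        theta = [t + step_size * td_error * f for t, f in zip(theta, features[state])]
        state = next_state
    return theta


# Hypothetical irreducible, aperiodic two-state chain.
transition = [[0.5, 0.5], [0.5, 0.5]]
costs = [0.0, 1.0]                    # cost incurred in each state
features = [[1.0, 0.0], [1.0, 1.0]]   # assumed feature vectors phi(s)
theta = td0_linear(features, costs, transition, discount=0.9,
                   step_size=0.002, num_steps=200_000)
```

For this assumed chain the exact discounted costs-to-go are 4.5 and 5.5, which the chosen features can represent with weights (4.5, 1.0), so `theta` should settle near those values up to noise from the constant step size.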

- John N. Tsitsiklis
- Machine Learning
- 1994

We provide some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. We then use these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establish its convergence under conditions more general than previously…
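For readers unfamiliar with Q-learning, a minimal tabular sketch follows. It only illustrates the basic update rule and says nothing about the convergence conditions studied in the paper. The two-state, two-action problem, the reward rule, the uniformly random exploration, the constant step size, and the function name `q_learning` are all assumptions.

```python
import random


def q_learning(num_steps, discount=0.9, step_size=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 2-state, 2-action problem.

    Action 0 stays in the current state and action 1 switches state.
    A reward of 1 is received whenever the next state is state 1.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # q[state][action]
    state = 0
    for _ in range(num_steps):
        # Explore uniformly at random so every state-action pair keeps being visited.
        action = rng.randrange(2)
        next_state = state if action == 0 else 1 - state
        reward = 1.0 if next_state == 1 else 0.0
        # Move the estimate toward reward plus discounted best next-state value.
        target = reward + discount * max(q[next_state])
        q[state][action] += step_size * (target - q[state][action])
        state = next_state
    return q


q = q_learning(num_steps=20_000)
```

In this assumed problem the optimal policy always moves to or stays in state 1, and the optimal values are 10 for those actions and 9 for the other two, so the learned table should approach `[[9, 10], [10, 9]]`.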