- Dimitri P. Bertsekas, John N. Tsitsiklis
- Optimization and neural computation series
- 1996

We present serial and parallel algorithms for solving a system of equations that arises from the discretization of the Hamilton-Jacobi equation associated to a trajectory optimization problem of the following type. A vehicle starts at a prespecified point x_0 and follows a unit speed trajectory x(t) inside a region in R^m until an unspecified time T that the…

- Christos H. Papadimitriou, John N. Tsitsiklis
- Math. Oper. Res.
- 1987

- Vijay R. Konda, John N. Tsitsiklis
- NIPS
- 1999

We propose and analyze a class of actor-critic algorithms for simulation-based optimization of a Markov decision process over a parameterized family of randomized stationary policies. These are two-timescale algorithms in which the critic uses TD learning with a linear approximation architecture and the actor is updated in an approximate gradient direction…

- John N. Tsitsiklis
- Machine Learning
- 1994

We provide some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. We then use these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establish its convergence under conditions more general than previously…

- Ramesh Johari, John N. Tsitsiklis
- Math. Oper. Res.
- 2004

We explore the properties of a congestion game where users of a congested resource anticipate the effect of their actions on the price of the resource. When users are sharing a single resource, we establish that the aggregate utility received by the users is at least 3/4 of the maximum possible aggregate utility. We also consider extensions to a network…

- John N. Tsitsiklis
- IEEE Trans. Communications
- 1993