John N. Tsitsiklis

Learn More
We present serial and parallel algorithms for solving a system of equations that arises from the discretization of the Hamilton-Jacobi equation associated to a trajectory optimization problem of the following type. A vehicle starts at a prespecified point zo and follows a unit speed trajectory ~ ( t ) inside a region in P. until an unspecified time T that(More)
We discuss an old distributed algorithm for reaching consensus that has received a fair amount of recent attention. In this algorithm, a number of agents exchange their values asynchronously and form weighted averages with (possibly outdated) values possessed by their neighbors. We overview existing convergence results, and establish some new ones, for the(More)
We propose and analyze a class of actor-critic algorithms for simulation-based optimization of a Markov decision process over a parameterized family of randomized stationary policies. These are two-time-scale algorithms in which the critic uses TD learning with a linear approximation architecture and the actor is updated in an approximate gradient direction(More)
We discuss the temporal-difference learning algorithm, as applied to approximating the cost-to-go function of an infinite-horizon discounted Markov chain. The algorithm we analyze updates parameters of a linear function approximator online during a single endless trajectory of an irreducible aperiodic Markov chain with a finite or infinite state space. We(More)