Vladislav Tadic

The asymptotic properties of temporal-difference learning algorithms with linear function approximation are analyzed in this paper. The analysis is carried out in the context of the approximation of a discounted cost-to-go function associated with an uncontrolled Markov chain with an uncountable finite-dimensional state-space. Under mild conditions, the …
The in vivo effects of sodium cyanide and its antidotes, sodium nitrite, sodium thiosulfate and 4-dimethylaminophenol (DMAP), as well as the alpha-adrenergic blocking agent phentolamine, on rat brain cytochrome oxidase were studied. The course of inhibition was time-dependent and a peak of 40% was attained between 15 and 20 min after the s.c. injection of …
The asymptotic properties of temporal-difference learning algorithms with linear function approximation are analyzed in this paper. The analysis is carried out in the context of the approximation of a discounted cost-to-go function associated with an uncontrolled Markov chain with an uncountable finite-dimensional state-space. Under very mild conditions, the …
The mean-square asymptotic behavior of constant-stepsize temporal-difference algorithms is analyzed in this paper. The analysis is carried out for the case of a linear (cost-to-go) function approximation and for the case of Markov chains with an uncountable state space. An asymptotic upper bound for the mean-square deviation of the algorithm iterations from …