We consider a class of discrete-time stochastic control systems with Borel state and action spaces and possibly unbounded costs. The processes evolve according to the equation $x_{t+1} = F(x_t, a_t, \xi_t)$, $t = 0, 1, \ldots$, where the $\xi_t$ are i.i.d. random vectors whose common distribution is unknown. Assuming observability of $\{\xi_t\}$, we use the empirical…
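
The recursion above can be illustrated with a minimal Python sketch. It is not the authors' method: the dynamics `F`, the feedback `policy`, and the Gaussian noise are hypothetical choices made only so the example runs. The helper `empirical_expectation` shows one natural use of observed disturbances, assumed here for illustration: averaging a function over the recorded samples, which amounts to integrating it against their empirical distribution.

```python
import random


def F(x, a, xi):
    # Hypothetical scalar linear dynamics, chosen only for illustration.
    return 0.5 * x + a + xi


def simulate(x0, policy, noise_sampler, horizon):
    """Roll x_{t+1} = F(x_t, a_t, xi_t) forward and record the observed xi_t."""
    x = x0
    states = [x0]
    observed = []
    for _ in range(horizon):
        a = policy(x)          # action chosen from the current state
        xi = noise_sampler()   # disturbance, assumed observable
        observed.append(xi)
        x = F(x, a, xi)
        states.append(x)
    return states, observed


def empirical_expectation(g, samples):
    """Average of g over the samples: the integral of g against
    the empirical distribution of the observed disturbances."""
    return sum(g(s) for s in samples) / len(samples)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed for reproducibility
    states, observed = simulate(
        x0=0.0,
        policy=lambda x: -0.1 * x,             # hypothetical feedback rule
        noise_sampler=lambda: rng.gauss(0, 1),  # true law, unknown to the controller
        horizon=1000,
    )
    # Estimate of E[xi] built only from the observed disturbances.
    print(empirical_expectation(lambda s: s, observed))
```

Because the $\xi_t$ are i.i.d., averages of this kind converge to the corresponding expectations under the unknown distribution as the number of observations grows, by the law of large numbers.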

