In the field of sequential decision making and reinforcement learning, it has been observed that good policies for most problems exhibit a significant amount of structure. In practice, this implies that when a learning agent discovers an action is better than any other in a given state, this action actually happens to also dominate in a certain…
We introduce the Optimal Sample Selection (OSS) meta-algorithm for solving discrete-time Optimal Control problems. This meta-algorithm maps the problem of finding a near-optimal closed-loop policy to the identification of a small set of one-step system transitions, leading to high-quality policies when used as input of a batch-mode Reinforcement Learning…
Recent work on Markov Decision Processes (MDPs) covers the use of continuous variables and resources, including time. This work is usually done in a framework of bounded resources and finite temporal horizon for which a total reward criterion is often appropriate. However, most of this work considers discrete effects on continuous variables while…
The allocation of visual attention is a key factor for humans operating complex systems under time pressure with multiple information sources. In some situations, attentional tunneling is likely to…
We introduce a new plan repair method for problems cast as Mixed Integer Programs. In order to tackle the inherent complexity of these NP-hard problems, our approach relies on the use of a Supervised Learning method for the offline construction of a predictor which takes the problem's parameters as input and infers values for the discrete optimization…
We introduce TiMDPpoly, an algorithm designed to solve planning problems with durative actions, under probabilistic uncertainty, in a non-stationary, continuous-time context. Mission planning for autonomous agents such as planetary rovers or unmanned aircraft often corresponds to such time-dependent planning problems. Modeling these problems can be cast…
Time is a crucial variable in planning and often requires special attention since it introduces a specific structure along with additional complexity, especially in the case of decision under uncertainty. In this paper, after reviewing and comparing MDP frameworks designed to deal with temporal problems, we focus on Generalized Semi-Markov Decision…