Markov decision processes (MDPs) offer a popular mathematical tool for planning and learning in the presence of uncertainty (Boutilier, Dean, & Hanks 1999). MDPs are a standard formalism for describing multi-stage decision making in probabilistic environments. The objective of the decision making is to maximize a cumulative measure of long-term performance, …
We present metrics for measuring state similarity in Markov decision processes (MDPs) with infinitely many states, including MDPs with continuous state spaces. Such metrics provide a stable quantitative analogue of the notion of bisimulation for MDPs, and are suitable for use in MDP approximation. We show that the optimal value function associated with a …
In recent years, various metrics have been developed for measuring the behavioural similarity of states in probabilistic transition systems [Desharnais et al.]. In the context of finite Markov decision processes, we have built on these metrics to provide a robust quantitative analogue of stochastic bisimulation [Ferns et al.]. In this paper, we seek to …
A popular approach to solving large probabilistic systems relies on aggregating states based on a measure of similarity. Many approaches in the literature are heuristic. A number of recent methods rely instead on metrics based on the notion of bisimulation, or behavioral equivalence between states (Givan et al., 2003; Ferns et al., 2004). An integral …
Approximation techniques for labelled Markov processes on continuous state spaces were developed by Desharnais, Gupta, Jagadeesan and Panangaden. However, it has not been clear whether this scheme could be used in practice since it involves inverting a stochastic kernel. We describe a Monte-Carlo-based implementation scheme for this approximation …
Computational models of neuromotor control require forward models of limb movement that can replicate the natural relationships between muscle activation and joint dynamics without the burdens of excessive anatomical detail. We present a model of a three-link biomechanical limb that emphasizes the dynamics of limb movement within a simplified …
We transfer a notion of quantitative bisimilarity for labelled Markov processes [1] to Markov decision processes with continuous state spaces. This notion takes the form of a pseudometric on the system states, cast in terms of the equivalence of a family of functional expressions evaluated on those states and interpreted as a real-valued modal logic. Our …