• Corpus ID: 14287250

Probabilistic inference as a model of planned behavior

Author: Marc Toussaint
Journal: Künstliche Intell.
The problem of planning and goal-directed behavior has been addressed in computer science for many years, typically based on classical concepts like Bellman’s optimality principle, dynamic programming, or Reinforcement Learning methods – but is this the only way to address the problem? Recently there has been growing interest in using probabilistic inference methods for decision making and planning. Such approaches are promising because they naturally extend to distributed state representations and…

A Unified View of Algorithms for Path Planning Using Probabilistic Inference on Factor Graphs

This work poses the path planning problem on a probabilistic factor graph, shows how the various algorithms translate into specific message composition rules, and provides a very general framework that includes the Sum-product, the Max-product, dynamic programming, and mixed Reward/Entropy criteria-based algorithms.
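To make the idea of "message composition rules" concrete, here is a minimal sketch, not the paper's formulation. It runs one backward recursion on a hypothetical 4-state chain and only swaps the operators used to combine messages over actions and over next states. The chain, the action names, the uniform action prior, and the function `backward` are all assumptions made for illustration.

```python
# Hypothetical 4-state chain; state 3 is an absorbing goal.
# P[a][s][s2] = probability of moving from s to s2 under action a.
P = {
    "stay": [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],
    "right": [[0.2, 0.8, 0, 0], [0, 0.2, 0.8, 0], [0, 0, 0.2, 0.8], [0, 0, 0, 1]],
}


def mean(xs):
    """Average over actions, i.e. a uniform action prior (an assumption)."""
    xs = list(xs)
    return sum(xs) / len(xs)


def backward(P, goal, horizon, over_actions, over_next):
    """Propagate a backward message from the goal for `horizon` steps.

    `over_actions` and `over_next` are the composition rules used to
    combine contributions across actions and across next states.
    """
    n = len(next(iter(P.values())))
    beta = [1.0 if s == goal else 0.0 for s in range(n)]
    for _ in range(horizon):
        beta = [
            over_actions(
                over_next(P[a][s][s2] * beta[s2] for s2 in range(n))
                for a in P
            )
            for s in range(n)
        ]
    return beta


# Sum-product: marginalize over actions and next states.
sum_product = backward(P, goal=3, horizon=3, over_actions=mean, over_next=sum)
# Max-product: probability of the single most likely action/state path.
max_product = backward(P, goal=3, horizon=3, over_actions=max, over_next=max)
# Dynamic programming: best action, expectation over next states.
dynamic_prog = backward(P, goal=3, horizon=3, over_actions=max, over_next=sum)
```

In this toy setting the three results differ only because of the operator pair passed to `backward`, which is the sense in which different planning algorithms can be read as different message rules on the same graph.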

Successor Representation Active Inference

It is demonstrated that active inference successor representations have significant advantages over current active inference agents in terms of planning horizon and computational cost and how the successor representation agent can generalize to changing reward functions such as variants of the expected free energy.
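As background for the claim about changing reward functions, the following is a minimal sketch of the standard successor representation for a fixed policy, not the active inference agent described above. The 3-state transition matrix, the discount factor, and the function names are hypothetical. The point it illustrates is that the matrix M is computed once from the dynamics, and values for a new reward vector then cost only a matrix-vector product.

```python
def successor_representation(P, gamma, iters=500):
    """Iterate M = I + gamma * P @ M, which converges to (I - gamma * P)^-1."""
    n = len(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    M = [row[:] for row in identity]
    for _ in range(iters):
        M = [
            [
                identity[i][j] + gamma * sum(P[i][k] * M[k][j] for k in range(n))
                for j in range(n)
            ]
            for i in range(n)
        ]
    return M


def values(M, reward):
    """State values for a reward vector: V = M @ reward."""
    return [sum(m * r for m, r in zip(row, reward)) for row in M]


# Hypothetical policy-induced chain 0 -> 1 -> 2, with state 2 absorbing.
P = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
M = successor_representation(P, gamma=0.5)

# The same M is reused for two different reward functions.
v_goal_at_2 = values(M, [0.0, 0.0, 1.0])
v_goal_at_1 = values(M, [0.0, 1.0, 0.0])
```

Switching from `v_goal_at_2` to `v_goal_at_1` requires no further planning over the dynamics, only a new product with the stored M.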

Planning and exploration in stochastic relational worlds

This thesis addresses planning and exploration in so-called stochastic relational worlds, which are characterized by two key attributes: they contain large numbers of objects whose properties and relationships can be manipulated, and the effects of actions are uncertain.

VIREL: A Variational Inference Framework for Reinforcement Learning

VIREL is proposed, a novel, theoretically grounded probabilistic inference framework for RL that utilises a parametrised action-value function to summarise future dynamics of the underlying MDP and it is shown that the actor-critic algorithm can be reduced to expectation-maximisation, with policy improvement equivalent to an E-step and policy evaluation to an M-step.

Solving Relational and First-Order Logical Markov Decision Processes: A Survey

This chapter surveys representations and techniques for Markov decision processes, reinforcement learning, and dynamic programming in worlds explicitly modeled in terms of objects and relations and discusses model-free – both value-based and policy-based – and model-based dynamic programming techniques.

A Multitask Representation Using Reusable Local Policy Templates

An approach to solving the multitask problem through decomposing the domain into a set of capabilities based on local contexts, which resemble the options of hierarchical reinforcement learning, but provide robust behaviours capable of achieving some subgoal with the associated guarantee of achieving at least a particular aspiration level of performance.

Problem Solving as Probabilistic Inference with Subgoaling: Explaining Human Successes and Pitfalls in the Tower of Hanoi

This study suggests that a probabilistic inference scheme enhanced with subgoals provides a comprehensive framework to study problem solving and its deficits.

Path Planning Using Probability Tensor Flows

Tensor messages in the state-action space, propagated bi-directionally on a Markov chain, provide crucial information to guide the agent's decisions and to model the agent's motion in potentially complex grids that include goals and obstacles.

Dynamic Movement Primitives (DMPs) encode a desired movement trajectory in terms of the attractor

This work showcases how DMPs can be reformulated as a probabilistic linear dynamical system with control inputs, and shows how inference makes it possible to measure the likelihood that a given motion primitive is being executed successfully.

A Sufficient Statistic for Influence in Structured Multiagent Environments

This paper formalizes influence-based abstraction (IBA), which facilitates the elimination of latent state factors without any loss in value, for a very general class of problems described as factored partially observable stochastic games (fPOSGs).



Probabilistic inference for solving discrete and continuous state Markov Decision Processes

An Expectation Maximization algorithm for computing optimal policies that actually optimizes the discounted expected future return for arbitrary reward functions and without assuming an ad hoc finite total time is presented.

Probabilistic inference for solving (PO) MDPs

The approach is based on an equivalence between maximization of the expected future return in the time-unlimited MDP and likelihood maximization in a related mixture of finite-time MDPs, which allows the use of expectation maximization (EM) for computing optimal policies, with arbitrary inference techniques in the E-step.
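The equivalence can be checked numerically on a small example. The sketch below is an illustration under stated assumptions rather than the paper's algorithm: it fixes a policy (so only a state transition matrix remains), uses state-dependent rewards in [0, 1] read as the probability of a binary "reward" event, and assumes a geometric prior over the MDP length T, namely (1 - gamma) * gamma**T. The matrix, rewards, and function names are hypothetical.

```python
def policy_value(P, r, gamma, iters=2000):
    """Discounted expected return per state via iterative policy evaluation."""
    n = len(P)
    V = [0.0] * n
    for _ in range(iters):
        V = [r[s] + gamma * sum(P[s][k] * V[k] for k in range(n)) for s in range(n)]
    return V


def mixture_likelihood(P, r, gamma, start, horizon=400):
    """Likelihood of the reward event in a mixture of finite-time chains.

    Each component runs for T steps and emits the reward event at its final
    state with probability r[state]; T has prior (1 - gamma) * gamma**T.
    The infinite mixture is truncated at `horizon`.
    """
    n = len(P)
    d = list(start)  # state distribution at time T
    total = 0.0
    for T in range(horizon + 1):
        prior_T = (1.0 - gamma) * gamma ** T
        total += prior_T * sum(d[s] * r[s] for s in range(n))
        d = [sum(d[s] * P[s][k] for s in range(n)) for k in range(n)]
    return total


# Hypothetical 3-state chain under a fixed policy.
P = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]
r = [0.0, 0.2, 1.0]
start = [1.0, 0.0, 0.0]
gamma = 0.9

V = policy_value(P, r, gamma)
expected_return = sum(p * v for p, v in zip(start, V))
likelihood = mixture_likelihood(P, r, gamma, start)
```

Under these assumptions `likelihood` agrees with `(1 - gamma) * expected_return` up to truncation error, so a policy that maximizes one also maximizes the other.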

Planning by Probabilistic Inference

A new approach is presented to the problem of planning under uncertainty in a probabilistic generative model involving actions and states, and the toolbox of inference techniques is brought to bear on the planning problem.

Goal-Based Imitation as Probabilistic Inference over Graphical Models

This paper shows that the problem of goal-based imitation can be formulated as one of inferring goals and selecting actions using a learned probabilistic graphical model of the environment, and describes algorithms for planning actions to achieve a goal state using probabilistic inference.

Decision-Theoretic Planning: Structural Assumptions and Computational Leverage

This paper presents an overview and synthesis of MDP-related methods, showing how they provide a unifying framework for modeling many classes of planning problems studied in AI, and describes structural properties of MDPs that, when exhibited by particular classes of problems, can be exploited in the construction of optimal or approximately optimal policies or plans.

Probabilistic inference for structured planning in robotics

A new approach to planning in robotics based on probabilistic inference is proposed that uses structured Dynamic Bayesian Networks to represent the scenario and efficient inference techniques (loopy belief propagation) to solve planning problems.

Anytime Point-Based Approximations for Large POMDPs

The point selection procedure is combined with point-based value backups to form an effective anytime POMDP algorithm called Point-Based Value Iteration (PBVI), and a theoretical analysis justifying the choice of belief selection technique is presented.

Policy Recognition in the Abstract Hidden Markov Model

This paper introduces the Abstract Hidden Markov Model (AHMM), a novel type of stochastic process, provides its dynamic Bayesian network (DBN) structure, analyses the properties of this network, and proposes a novel plan recognition framework based on the AHMM as the plan execution model.

Approximate inference for planning in stochastic relational worlds

This work proposes to convert learned noisy probabilistic relational rules into a structured dynamic Bayesian network representation and evaluates the effectiveness of this approach for online planning in a 3D simulated blocksworld with an articulated manipulator and realistic physics.

Hierarchical POMDP Controller Optimization by Likelihood Maximization

This paper first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy, and demonstrates that this approach scales better than previous techniques based on non-convex optimization.