Corpus ID: 202661127

Decoupling stochastic optimal control problems for efficient solution: insights from experiments across a wide range of noise regimes

@article{Mohamed2019DecouplingSO,
  title={Decoupling stochastic optimal control problems for efficient solution: insights from experiments across a wide range of noise regimes},
  author={Mohamed Naveed Gul Mohamed and Suman Chakravorty and Dylan A. Shell},
  journal={ArXiv},
  year={2019},
  volume={abs/1909.08585}
}
In this paper, we consider the problem of robotic planning under uncertainty. This problem may be posed as a stochastic optimal control problem, the solution of which is fundamentally intractable owing to the infamous "curse of dimensionality". Hence, we consider the extension of a "decoupling principle" recently proposed by some of the authors, wherein a nominal open-loop problem is solved first and a linear feedback is then designed around the resulting open-loop trajectory, a construction that was shown to be near-optimal… 
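
The decoupling recipe the abstract describes is concrete enough to sketch. The following is a minimal illustration under stated assumptions, not the paper's implementation: a nominal open-loop control sequence is rolled out on noise-free dynamics, a time-varying LQR feedback is designed around the resulting trajectory via the finite-horizon Riccati recursion, and the combined policy is executed under additive noise whose magnitude eps can be swept, in the spirit of the paper's range of noise regimes. The double-integrator dynamics, cost weights, and nominal controls are placeholders.

```python
import numpy as np

# Illustrative double integrator; the paper's systems are more complex.
dt, T = 0.1, 50
A = np.array([[1.0, dt], [0.0, 1.0]])    # linearized state transition
B = np.array([[0.0], [dt]])              # control input matrix
Q, R, Qf = np.eye(2), 0.1 * np.eye(1), 10.0 * np.eye(2)

# Step 1: nominal open-loop plan. A hand-picked control sequence stands
# in for the solution of the deterministic (noise-free) problem.
u_nom = [np.array([0.1]) for _ in range(T)]
x_nom = [np.zeros(2)]
for t in range(T):
    x_nom.append(A @ x_nom[t] + B @ u_nom[t])

# Step 2: linear feedback around the nominal trajectory, i.e. the
# time-varying LQR gains from the finite-horizon Riccati recursion.
P, K = Qf, [None] * T
for t in reversed(range(T)):
    K[t] = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + A.T @ P @ (A + B @ K[t])

# Step 3: closed-loop rollout under additive noise of magnitude eps; the
# decoupling result is that this combination is near-optimal for small eps.
eps, rng = 0.05, np.random.default_rng(0)
x = x_nom[0].copy()
for t in range(T):
    u = u_nom[t] + K[t] @ (x - x_nom[t])
    x = A @ x + B @ u + eps * rng.standard_normal(2)
```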

References

Showing 1-10 of 22 references
T-PFC: A Trajectory-Optimized Perturbation Feedback Control Approach
TLDR: This letter derives a decoupling principle between the open-loop plan and the closed-loop feedback gains, which leads to a deterministic perturbation-feedback solution to fully observable stochastic optimal control problems that is near-optimal.
Decoupled Data-Based Approach for Learning to Control Nonlinear Dynamical Systems
TLDR: A novel decoupled data-based control (D2C) algorithm is proposed that addresses the problem of learning the optimal control policy for a nonlinear stochastic dynamical system using a decoupled, "open-loop–closed-loop," approach.
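
A hedged companion sketch of the data-based ingredient this D2C summary highlights: local linear dynamics are estimated from rollout data by least squares, after which standard LQR feedback design applies as above. The true system, noise level, and sample counts below are toy stand-ins, not the paper's setup.

```python
import numpy as np

# Generate synthetic rollout data from a system unknown to the learner.
rng = np.random.default_rng(3)
X = rng.standard_normal((500, 2))            # visited states
U = rng.standard_normal((500, 1))            # applied controls
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B_true = np.array([[0.0], [0.1]])
Xn = X @ A_true.T + U @ B_true.T + 0.01 * rng.standard_normal((500, 2))

# Least-squares system identification: regress next states on (x, u).
Z = np.hstack([X, U])
W, *_ = np.linalg.lstsq(Z, Xn, rcond=None)
A_hat, B_hat = W[:2].T, W[2:].T              # recovered dynamics matrices
```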
Learning Neural Network Policies with Guided Policy Search under Unknown Dynamics
TLDR: A policy search method that uses iteratively refitted local linear models to optimize trajectory distributions for large, continuous problems, and can be used to learn complex neural network policies that successfully execute simulated robotic manipulation tasks in partially observed environments with numerous contact discontinuities and underactuation.
Stochastic Differential Dynamic Programming
TLDR: This work presents a generalization of the classic Differential Dynamic Programming algorithm that assumes the existence of state- and control-multiplicative process noise, and proceeds to derive the second-order expansion of the cost-to-go.
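
For reference, the machinery this entry generalizes is the standard DDP/iLQR second-order expansion of the local action-value around a nominal state-control pair (textbook form, not quoted from the paper):

$$Q(\delta x, \delta u) \approx Q_x^{\top}\delta x + Q_u^{\top}\delta u + \tfrac{1}{2}\,\delta x^{\top} Q_{xx}\,\delta x + \delta u^{\top} Q_{ux}\,\delta x + \tfrac{1}{2}\,\delta u^{\top} Q_{uu}\,\delta u, \qquad \delta u^{*} = -Q_{uu}^{-1}\bigl(Q_u + Q_{ux}\,\delta x\bigr).$$

As the summary notes, the stochastic variant modifies the coefficients of this expansion to account for state- and control-multiplicative process noise.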
Model-Free Trajectory Optimization for Reinforcement Learning
TLDR: A new model-free algorithm is proposed that backpropagates a local, quadratic, time-dependent Q-function, allowing the policy update to be derived in closed form and demonstrating improved performance over related trajectory optimization algorithms that linearize the dynamics.
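
A hedged sketch of this model-free counterpart: instead of differentiating a model, fit a local quadratic Q-function to sampled (state, control, return) triples by least squares; the maximizing control is then linear in the state, which is the closed-form policy update the summary refers to. All data below are synthetic placeholders.

```python
import numpy as np

# Synthetic samples standing in for one time step of rollout data.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))              # states
U = rng.standard_normal((200, 1))              # controls
q = -(X ** 2).sum(1) - 0.1 * (U ** 2).sum(1)   # toy returns

# Quadratic features in z = (x, u); least squares fits the local Q-model.
def phi(x, u):
    z = np.concatenate([x, u])
    iu = np.triu_indices(z.size)
    return np.concatenate([np.outer(z, z)[iu], z, [1.0]])

Phi = np.array([phi(x, u) for x, u in zip(X, U)])
w, *_ = np.linalg.lstsq(Phi, q, rcond=None)
# Unpacking w recovers Q_uu, Q_ux, and Q_u, from which the greedy update
# u*(x) = -Q_uu^{-1} (Q_u + Q_ux x) follows in closed form.
```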
Learning Complex Neural Network Policies with Trajectory Optimization
TLDR: This work introduces a policy search algorithm that can directly learn high-dimensional, general-purpose policies represented by neural networks, and can learn policies for complex tasks such as bipedal push recovery and walking on uneven terrain, while outperforming prior methods.
Formal models and algorithms for decentralized decision making under uncertainty
TLDR: Five different formal frameworks, three different optimal algorithms, as well as a series of approximation techniques are analyzed to provide interesting insights into the structure of decentralized problems, the expressiveness of the various models, and the relative advantages and limitations of the different solution techniques.
Iterative local dynamic programming (E. Todorov, Yuval Tassa; 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2009)
TLDR: iLDP can be considered a generalization of Differential Dynamic Programming, inasmuch as it uses general basis functions rather than quadratics to approximate the optimal value function and introduces a collocation method that dispenses with explicit differentiation of the cost and dynamics.
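
The representational point of this entry can be sketched briefly: approximate the value function with arbitrary basis functions, V(x) ≈ wᵀφ(x), and fit the weights at sampled collocation states rather than by differentiating the cost and dynamics. The basis, states, and targets below are illustrative placeholders, not iLDP itself.

```python
import numpy as np

def phi(x):
    # Illustrative non-quadratic basis: polynomials plus one RBF bump.
    return np.array([1.0, x[0], x[1], x[0] * x[1], x[0] ** 2, x[1] ** 2,
                     np.exp(-np.sum((x - 1.0) ** 2))])

rng = np.random.default_rng(2)
X = rng.standard_normal((100, 2))    # collocation states
v = np.sum(X ** 2, axis=1)           # stand-in Bellman backup values
Phi = np.array([phi(x) for x in X])
w, *_ = np.linalg.lstsq(Phi, v, rcond=None)  # fitted value weights
```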
Tube-based robust nonlinear model predictive control
This paper extends tube-based model predictive control of linear systems to achieve robust control of nonlinear systems subject to additive disturbances. A central or reference trajectory is determined by solving a nominal optimal control problem, and an ancillary controller keeps the state of the perturbed system inside a tube around this central trajectory.
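
To make the mechanism explicit, here is a hedged statement of the standard linear-system tube construction that this entry's paper extends to nonlinear systems: the applied control superposes ancillary feedback on the nominal input,

$$u = \bar{u}(t) + K\bigl(x - \bar{x}(t)\bigr),$$

where $(\bar{x}, \bar{u})$ solve a nominal problem whose constraints are tightened by a robust positively invariant set $\mathcal{S}$, guaranteeing $x(t) \in \bar{x}(t) \oplus \mathcal{S}$ for every admissible disturbance.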
Decentralized POMDPs
TLDR: This chapter presents an overview of the decentralized POMDP (Dec-POMDP) framework, and covers the forward heuristic search approach to solving Dec-POMDPs as well as the backward dynamic programming approach.
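
For readers new to the framework, a Dec-POMDP is conventionally specified by the tuple (standard notation, not quoted from this chapter):

$$\mathcal{M} = \langle I, S, \{A_i\}_{i \in I}, T, R, \{\Omega_i\}_{i \in I}, O, h \rangle,$$

with agent set $I$, states $S$, per-agent actions $A_i$ and observations $\Omega_i$, transition model $T(s' \mid s, \vec{a})$, joint reward $R(s, \vec{a})$, observation model $O(\vec{o} \mid s', \vec{a})$, and horizon $h$; the single shared reward is what distinguishes the model from a general partially observable stochastic game.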