Corpus ID: 3178672

Variance Adjusted Actor Critic Algorithms

@article{Tamar2013VarianceAA,
  title={Variance Adjusted Actor Critic Algorithms},
  author={Aviv Tamar and Shie Mannor},
  journal={ArXiv},
  year={2013},
  volume={abs/1310.3697}
}
We present an actor-critic framework for MDPs where the objective is the variance-adjusted expected return. Our critic uses linear function approximation, and we extend the concept of compatible features to the variance-adjusted setting. We present an episodic actor-critic algorithm and show that it converges almost surely to a locally optimal point of the objective function. 
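
To make the objective concrete, here is a minimal illustrative sketch of a variance-adjusted policy-gradient update in Python. It is not the paper's algorithm: it replaces the linear critic with compatible features by simple Monte Carlo moment estimates on a toy chain MDP, and every name below (MU, env_step, the chain itself) is a hypothetical stand-in chosen for the demo.

```python
import numpy as np

# Sketch of a variance-adjusted episodic policy-gradient update (illustrative only).
# Objective: J(theta) = E[R] - MU * Var[R], with R the episodic return.
# Since Var[R] = E[R^2] - E[R]^2, the likelihood-ratio gradient is
#   grad J = E[(R - MU * (R^2 - 2 * E[R] * R)) * sum_t grad log pi(a_t | s_t)],
# where E[R] is tracked by a fast-timescale running average (the "critic" side here).

rng = np.random.default_rng(0)

N_STATES, N_ACTIONS, HORIZON = 5, 2, 20
MU = 0.1         # variance-penalty coefficient (assumed value)
ALPHA_C = 0.05   # fast timescale: moment estimates
ALPHA_A = 0.005  # slow timescale: actor

def softmax_probs(theta, s):
    prefs = theta[s]
    z = np.exp(prefs - prefs.max())
    return z / z.sum()

def env_step(s, a):
    """Toy chain MDP: action 0 is a safe +1 step; action 1 is risky (+4 or -2)."""
    if a == 0:
        return min(s + 1, N_STATES - 1), 1.0
    return int(rng.integers(N_STATES)), float(rng.choice([4.0, -2.0]))

theta = np.zeros((N_STATES, N_ACTIONS))  # softmax actor parameters
J_hat = 0.0                              # running estimate of E[R]
B_hat = 0.0                              # running baseline for the adjusted return

for episode in range(5000):
    s, traj, R = 0, [], 0.0
    for _ in range(HORIZON):
        a = int(rng.choice(N_ACTIONS, p=softmax_probs(theta, s)))
        s_next, r = env_step(s, a)
        traj.append((s, a))
        R += r
        s = s_next

    J_hat += ALPHA_C * (R - J_hat)              # E[R] estimate (fast timescale)
    adj = R - MU * (R * R - 2.0 * J_hat * R)    # variance-adjusted return
    B_hat += ALPHA_C * (adj - B_hat)            # constant baseline (unbiased)

    for (s_t, a_t) in traj:                     # REINFORCE-style actor step
        grad_log = -softmax_probs(theta, s_t)
        grad_log[a_t] += 1.0                    # grad of log-softmax w.r.t. prefs
        theta[s_t] += ALPHA_A * (adj - B_hat) * grad_log
```

The two step sizes mimic the two-timescale structure common to actor-critic convergence arguments: the moment estimates move faster than the actor, so the policy-gradient step sees a nearly converged critic.
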

Citations

Variance Penalized On-Policy and Off-Policy Actor-Critic
Variance-constrained actor-critic algorithms for discounted and average reward MDPs
Risk-Sensitive Deep RL: Variance-Constrained Actor-Critic Provably Finds Globally Optimal Policy
Continuous-Time Mean-Variance Portfolio Selection: A Reinforcement Learning Framework
Continuous-Time Mean-Variance Portfolio Optimization via Reinforcement Learning
Reward Constrained Policy Optimization
Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
Directly Estimating the Variance of the λ-Return Using Temporal-Difference Methods
Stochastically Dominant Distributional Reinforcement Learning
