Bellman equation
Known as: Bellman-Equation, Bellman's optimality principle, Policy function
A Bellman equation, named after its discoverer, Richard Bellman, also known as a dynamic programming equation, is a necessary condition for… (Wikipedia)
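To make the idea concrete, here is a minimal value-iteration sketch on a toy two-state MDP. The transition probabilities and rewards are hypothetical numbers chosen purely for illustration; the loop repeatedly applies the Bellman optimality backup V(s) = max_a [R(s, a) + γ Σ_{s'} P(s'|s, a) V(s')] until the value function converges.

```python
GAMMA = 0.9  # discount factor

# Toy MDP (hypothetical values, for illustration only):
# P[s][a] = list of (next_state, probability); R[s][a] = immediate reward.
P = {
    0: {0: [(0, 0.5), (1, 0.5)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)],           1: [(1, 1.0)]},
}
R = {
    0: {0: 0.0, 1: 1.0},
    1: {0: 0.0, 1: 2.0},
}

def value_iteration(theta=1e-8):
    """Iterate the Bellman optimality equation to a fixed point."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Bellman backup: best one-step reward plus discounted future value.
            best = max(
                R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a])
                for a in P[s]
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:  # converged
            return V

V = value_iteration()
print(V)  # → {0: 19.0, 1: 20.0} (approximately)
```

For this toy MDP the fixed point is V(1) = 2/(1 − 0.9) = 20 and V(0) = 1 + 0.9·20 = 19, so the iteration recovers the closed-form solution of the Bellman equation.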
Related topics (35 relations)
Algebraic Riccati equation, Artificial neural network, Automatic basis function construction, Backward induction, …

Broader (2): Control theory, Dynamic programming
Papers overview
Semantic Scholar uses AI to extract papers important to this topic.
Highly Cited, 2015
Dueling Network Architectures for Deep Reinforcement Learning
Ziyun Wang, T. Schaul, Matteo Hessel, H. V. Hasselt, Marc Lanctot, Nando de Freitas
International Conference on Machine Learning, 2015. Corpus ID: 5389801
In recent years there have been many successes of using deep representations in reinforcement learning. Still, many of these…
Highly Cited, 2014
Deterministic Policy Gradient Algorithms
David Silver, Guy Lever, N. Heess, T. Degris, Daan Wierstra, Martin A. Riedmiller
International Conference on Machine Learning, 2014. Corpus ID: 13928442
In this paper we consider deterministic policy gradient algorithms for reinforcement learning with continuous actions. The…
Highly Cited, 2013
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih, K. Kavukcuoglu, +4 authors, Martin A. Riedmiller
arXiv.org, 2013. Corpus ID: 15238391
We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input…
Highly Cited, 2011
Sampling-based algorithms for optimal motion planning
S. Karaman, Emilio Frazzoli
Int. J. Robotics Res., 2011. Corpus ID: 14876957
During the last decade, sampling-based path planning algorithms, such as probabilistic roadmaps (PRM) and rapidly exploring…
Highly Cited, 1999
Policy Gradient Methods for Reinforcement Learning with Function Approximation
R. Sutton, David A. McAllester, Satinder Singh, Y. Mansour
Neural Information Processing Systems, 1999. Corpus ID: 1211821
Function approximation is essential to reinforcement learning, but the standard approach of approximating a value function and…
Highly Cited, 1997
Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations
M. Bardi, I. Capuzzo-Dolcetta
1997. Corpus ID: 117460677
Preface.- Basic notations.- Outline of the main ideas on a model problem.- Continuous viscosity solutions of Hamilton-Jacobi…
Highly Cited, 1997
Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation
R. Beard, G. Saridis, J. Wen
at - Automatisierungstechnik, 1997. Corpus ID: 18774400
Highly Cited, 1987
Aggregation and Linearity in the Provision of Intertemporal Incentives
Bengt R. Holmstrom, Paul R. Milgrom
1987. Corpus ID: 67784731
The authors develop two themes in the theory of incentive schemes. First, one need not always use all of the information…
Highly Cited, 1985
Mental Accounting and Consumer Choice
R. Thaler
Marketing Science (Providence, R.I.), 1985. Corpus ID: 6137713
A new model of consumer behavior is developed using a hybrid of cognitive psychology and microeconomics. The development of the…
Highly Cited, 1982
Solution of the Schrödinger equation by a spectral method
M. Feit, J. A. Fleck, A. Steiger
1982. Corpus ID: 14101845