Back to optimality: a formal framework to express the dynamics of learning optimal behavior

@article{Alonso2015BackTO,
  title={Back to optimality: a formal framework to express the dynamics of learning optimal behavior},
  author={Eduardo Alonso and Michael Fairbank and Esther Mondrag{\'o}n},
  journal={Adaptive Behavior},
  year={2015},
  volume={23},
  pages={206--215}
}
Whether animals behave optimally is an open question of great importance, both theoretically and in practice. Attempts to answer this question focus on two aspects of the optimization problem, the quantity to be optimized and the optimization process itself. In this paper, we assume the abstract concept of cost as the quantity to be minimized and propose a reinforcement learning algorithm, called Value-Gradient Learning (VGL), as a computational model of behavior optimality. We prove that… 

A reward-driven model of Darwinian fitness

TLDR
A model that, based on the principle of total energy balance, bridges the gap between Darwinian fitness theories and reward-driven theories of behaviour. Results show that it is possible to accommodate both the reward maximization principle underlying modern approaches in behavioural reinforcement learning and traditional fitness approaches.

References

An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time

TLDR
Value-Gradient Learning is extended into a new algorithm that is called VGL(λ), and equivalence of an instance of the new algorithm to Backpropagation Through Time for Control with a greedy policy is proved, enabling this variant of DHP to have guaranteed convergence.

Optimal Control Theory

TLDR
Of special interest in the context of this book is the material on the duality of optimal control and probabilistic inference; such duality suggests that neural information processing in sensory and motor areas may be more similar than currently thought.

Reinforcement Learning: An Introduction

TLDR
This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

Temporal Sequence Learning, Prediction, and Control: A Review of Different Models and Their Relation to Biological Mechanisms

TLDR
This review compares methods for temporal sequence learning (TSL) across the disciplines of machine control, classical conditioning, neuronal models for TSL, and spike-timing-dependent plasticity (STDP), and focuses on the degree to which reward-based and correlation-based learning are related.

Is Animal Learning Optimal

TLDR
Economics is a pretty successful social science and, in the 1970s, psychologists and behavioral ecologists had high hopes that its techniques could shed light on animal learning and behavior, but this hope had to be given up.

A primer on reinforcement learning in the brain: Psychological, computational, and neural perspectives

In the last 15 years, there has been a flourishing of research into the neural basis of reinforcement learning, drawing together insights and findings from psychology, computer science, and

Value-gradient learning

TLDR
An Adaptive Dynamic Programming algorithm, VGL(λ), for learning a critic function over a large continuous state space is described, along with its theoretical relationships to its precursor algorithms, Dual Heuristic Dynamic Programming and TD(λ), and the motivations for using it over them.

The divergence of reinforcement learning algorithms with value-iteration and function approximation

This paper gives specific divergence examples of value-iteration for several major Reinforcement Learning and Adaptive Dynamic Programming algorithms, when using a function approximator for the value

Regulation during challenge: A general model of learned performance under schedule constraint.

This article develops a general behavior-regulation model of learned performance related to the equilibrium approach of Timberlake (1980) and Timberlake and Allison (1974). The model is based on

Dynamic Programming

...