A Neural Substrate of Prediction and Reward

@article{Schultz1997ANS,
  title={A Neural Substrate of Prediction and Reward},
  author={Wolfram Schultz and Peter Dayan and P. Read Montague},
  journal={Science},
  year={1997},
  volume={275},
  pages={1593--1599}
}
The capacity to predict future events permits a creature to detect, model, and manipulate the causal structure of its interactions with its environment. Behavioral experiments suggest that learning is driven by changes in the expectations about future salient events such as rewards and punishments. Physiological work has recently complemented these studies by identifying dopaminergic neurons in the primate whose fluctuating output apparently signals changes or errors in the predictions of… 
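One common way to formalize a signal that reports errors in reward prediction is temporal-difference (TD) learning. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`run_trial`, `simulate`), the trial layout (a cue at step 5, a reward at step 15), and the parameters `alpha` and `gamma` are all hypothetical choices made for illustration. Each time step after cue onset has its own learned prediction, and steps before the cue carry no prediction.

```python
def run_trial(w, cue_step, reward_step, reward=1.0, alpha=0.1, gamma=1.0):
    """Run one trial, update the predictions w in place, return per-step errors."""
    n_steps = len(w)

    def value(t):
        # Assumption: no stimulus is present before the cue, so the prediction is 0.
        return w[t] if t >= cue_step else 0.0

    deltas = [0.0] * n_steps
    for t in range(1, n_steps):
        r = reward if t == reward_step else 0.0
        # Prediction error: reward received plus the new prediction,
        # minus the prediction held one step earlier.
        delta = r + gamma * value(t) - value(t - 1)
        deltas[t] = delta
        if t - 1 >= cue_step:
            # Nudge the earlier prediction toward what actually followed.
            w[t - 1] += alpha * delta
    return deltas


def simulate(n_trials=500, n_steps=20, cue_step=5, reward_step=15):
    """Return errors on the first trial, the last trial, and an omission probe."""
    w = [0.0] * n_steps
    first = run_trial(w, cue_step, reward_step)
    last = first
    for _ in range(n_trials - 1):
        last = run_trial(w, cue_step, reward_step)
    # Probe trial with the reward withheld; alpha=0 leaves w unchanged.
    omitted = run_trial(w, cue_step, reward_step, reward=0.0, alpha=0.0)
    return first, last, omitted


if __name__ == "__main__":
    first, last, omitted = simulate()
    print("first trial, error at reward:", round(first[15], 3))
    print("last trial, error at cue:", round(last[5], 3))
    print("last trial, error at reward:", round(last[15], 3))
    print("omitted reward, error at reward time:", round(omitted[15], 3))
```

In this toy setting the error starts out at the time of the reward, moves to the cue as the prediction is learned, and turns negative at the expected reward time when the reward is withheld. This holds only under the assumptions above, in particular that the cue itself arrives unpredicted.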
Reward Responses of Dopamine Neurons: A Biological Reinforcement Signal
TLDR
Neurophysiological experiments have revealed that neurons in the mammalian midbrain using the neurotransmitter dopamine process information about rewards and reward-predicting stimuli in a manner very similar to the teaching signal of TD models.
Model-based predictions for dopamine
A biologically inspired neuronal model of reward prediction error computation
The neurocomputational model described here proposes that the two dimensions involved in the computation of reward prediction errors, i.e., magnitude and time, could be computed separately and later combined
Learning to represent reward structure: A key to adapting to complex environments
Dopamine reward prediction-error signalling: a two-component response
  • W. Schultz
  • Psychology, Biology
    Nature Reviews Neuroscience
  • 2016
TLDR
It is suggested that an initial brief, unselective and highly sensitive increase in activity unspecifically detects a wide range of environmental stimuli, then quickly evolves into the main response component, which reflects subjective reward value and utility.
The computational neurobiology of learning and reward
  • N. Daw, K. Doya
  • Biology, Psychology
    Current Opinion in Neurobiology
  • 2006
Dopamine: generalization and bonuses
A computational substrate for incentive salience
Dopamine, Reward Prediction Error, and Economics
The neurotransmitter dopamine has been found to play a crucial role in choice, learning, and belief formation. The best-developed current theory of dopaminergic function is the "reward prediction

References

SHOWING 1-10 OF 114 REFERENCES
Toward a modern theory of adaptive networks: expectation and prediction.
TLDR
The adaptive element presented learns to increase its response rate in anticipation of increased stimulation, producing a conditioned response before the occurrence of the unconditioned stimulus, and is in strong agreement with the behavioral data regarding the effects of stimulus context.
Importance of unpredictability for reward responses in primate dopamine neurons.
TLDR
Dopamine neurons responded to liquid reward during learning but not when task performance was established, at which time the neuronal response occurred to the conditioned, reward-predicting, movement-triggering stimulus.
A framework for mesencephalic dopamine systems based on predictive Hebbian learning
TLDR
A theoretical framework is developed that shows how mesencephalic dopamine systems could distribute to their targets a signal that represents information about future expectations and shows that, through a simple influence on synaptic plasticity, fluctuations in dopamine release can act to change the predictions in an appropriate manner.
Preferential activation of midbrain dopamine neurons by appetitive rather than aversive stimuli
TLDR
Dopamine neurons preferentially report environmental stimuli with appetitive rather than aversive motivational value, and primary and conditioned non-noxious aversive stimuli either failed to activate dopamine neurons or induced weaker responses than appetitive stimuli.
Building action repertoires: memory and learning functions of the basal ganglia
  • A. Graybiel
  • Biology, Psychology
    Current Opinion in Neurobiology
  • 1995
Bee foraging in uncertain environments using predictive hebbian learning
TLDR
A model of bee foraging in uncertain environments is constructed, based on a neuron with widespread projections to odour processing regions of the honeybee brain and a predictive form of Hebbian synaptic plasticity, showing how neuromodulatory influences can be used to bias actions and control synaptic plasticity in a way that goes beyond standard correlational mechanisms.
A Theory of Attention: Variations in the Associability of Stimuli with Reinforcement
TLDR
Overshadowing and blocking are better explained by the choice of an appropriate rule for changing α, such that α decreases to stimuli that signal no change from the probability of reinforcement predicted by other stimuli.