Publications
Q-learning
TLDR
This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989), showing that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely.
A Neural Substrate of Prediction and Reward
TLDR
Findings in this work indicate that dopaminergic neurons in the primate whose fluctuating output apparently signals changes or errors in the predictions of future salient and rewarding events can be understood through quantitative theories of adaptive optimizing control.
Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems
TLDR
This text introduces the basic mathematical and computational methods of theoretical neuroscience and presents applications in a variety of areas including vision, sensory-motor integration, development, learning, and memory.
Technical Note: Q-Learning
TLDR
A convergence theorem is presented and proved, showing that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely.
Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control
TLDR
This work considers dual-action choice systems from a normative perspective, and suggests a Bayesian principle of arbitration between them according to uncertainty, so each controller is deployed when it should be most accurate.
Cortical substrates for exploratory decisions in humans
TLDR
It is shown, in a gambling task, that human subjects' choices can be characterized by a computationally well-regarded strategy for addressing the explore/exploit dilemma, and a model of action selection under uncertainty that involves switching between exploratory and exploitative behavioural modes is suggested.
Dissociable Roles of Ventral and Dorsal Striatum in Instrumental Conditioning
TLDR
Human participants were scanned with functional magnetic resonance imaging while they engaged in instrumental conditioning; the results suggest partly dissociable contributions of the ventral and dorsal striatum to the critic and the actor, respectively.
Model-based influences on humans’ choices and striatal prediction errors
TLDR
A multistep decision task is used to challenge the notion of a separate model-free learner, and the results suggest a more integrated computational architecture for high-level human decision-making.
A framework for mesencephalic dopamine systems based on predictive Hebbian learning
TLDR
A theoretical framework is developed that shows how mesencephalic dopamine systems could distribute to their targets a signal that represents information about future expectations and shows that, through a simple influence on synaptic plasticity, fluctuations in dopamine release can act to change the predictions in an appropriate manner.
Uncertainty, Neuromodulation, and Attention
TLDR
This formulation is consistent with a wealth of physiological, pharmacological, and behavioral data implicating acetylcholine and norepinephrine in specific aspects of a range of cognitive processes and suggests a class of attentional cueing tasks that involve both neuromodulators and how their interactions may be part-antagonistic, part-synergistic.