A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback

@article{Legenstein2008ALT,
  title={A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback},
  author={R. Legenstein and Dejan Pecevski and W. Maass},
  journal={PLoS Computational Biology},
  year={2008},
  volume={4}
}
Reward-modulated spike-timing-dependent plasticity (STDP) has recently emerged as a candidate for a learning rule that could explain how behaviorally relevant adaptive changes in complex networks of spiking neurons could be achieved in a self-organizing manner through local synaptic plasticity. However, the capabilities and limitations of this learning rule could so far only be tested through computer simulations. This article provides tools for an analytic treatment of reward-modulated STDP…
Theoretical Analysis of Learning with Reward-Modulated Spike-Timing-Dependent Plasticity
TLDR
Tools are provided to predict under which conditions reward-modulated STDP will be able to achieve a desired learning effect, and to produce a theoretical explanation and a computer model for a fundamental experimental finding on biofeedback in monkeys.
Functional Requirements for Reward-Modulated Spike-Timing-Dependent Plasticity
TLDR
It is shown that reward-modulated spike-timing-dependent plasticity is capable of learning motor trajectories with a temporal resolution of tens of milliseconds. The relation to temporal difference learning, the relevance of block-based learning paradigms, and the limitations of learning with a critic are also discussed.
RM-SORN: a reward-modulated self-organizing recurrent neural network
TLDR
A neural network model that learns through the interaction of Intrinsic Plasticity and reward-modulated Spike-Timing-Dependent Plasticity, and that achieves performance comparable to networks trained with supervised learning while using simple, biologically motivated plasticity rules and rewarding strategies.
A Spiking Network Model of Decision Making Employing Rewarded STDP
TLDR
It is predicted that successful learning requires stabilizing mechanisms that allow neurons to balance their input and output synapses as well as synaptic noise.
Reinforcement Learning Using a Continuous Time Actor-Critic Framework with Spiking Neurons
TLDR
In simulations, this model can solve a Morris water-maze-like navigation task in a number of trials consistent with reported animal performance, and the analytically derived learning rule is consistent with experimental evidence for dopamine-modulated spike-timing-dependent plasticity.
Multi-layer network utilizing rewarded spike time dependent plasticity to learn a foraging task
TLDR
The study predicts a critical set of properties of the spiking neuronal network with STDP that was sufficient to solve a complex foraging task involving pattern classification and decision making.
Reward-modulated spike-timing-dependent plasticity with a dynamic spike timing rule and inhibitory plasticity
The viability of spike-timing-dependent plasticity (STDP) to explain learning processes is controversial, although recent developments of reward-modulated STDP (RM-STDP) models provide a plausible…
Spatio-temporal event association using reward-modulated spike-time-dependent plasticity
TLDR
A network is able to associate a pair of events with a temporal delay in a dynamic setting as a result of learning, and convergence is measured by synchronicity of activities in associated neuronal groups.
Acetylcholine-modulated plasticity in reward-driven navigation: a computational study
TLDR
It is found that acetylcholine, by allowing learning from negative outcomes, enhances exploration over the action space, and sequentially neuromodulated Spike-Timing-Dependent Plasticity (STDP) yields flexible learning, surpassing the performance of other reward-modulated plasticity rules.
Sequential neuromodulation of Hebbian plasticity offers mechanism for effective reward-based navigation
TLDR
It is demonstrated that sequential neuromodulation of STDP by acetylcholine and dopamine offers an efficacious model of reward-based navigation, and also provides a possible mechanism for aligning the time scales of cellular and behavioral learning.

References

SHOWING 1-10 OF 57 REFERENCES
Reinforcement Learning Through Modulation of Spike-Timing-Dependent Synaptic Plasticity
  • R. Florian
  • Psychology, Computer Science
  • Neural Computation
  • 2007
TLDR
It is shown that the modulation of STDP by a global reward signal leads to reinforcement learning, and learning rules involving reward-modulated spike-timing-dependent synaptic and intrinsic plasticity are derived analytically; these rules may be used for training generic artificial spiking neural networks, regardless of the neural model used.
Reinforcement learning with modulated spike timing dependent synaptic plasticity.
TLDR
This model offers a novel and biologically plausible implementation of reinforcement learning that is capable of training a neural population to produce a very wide range of possible mappings between synaptic input and spiking output.
Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule
TLDR
A recently introduced policy learning algorithm from machine learning is applied to networks of spiking neurons to derive a spike-time-dependent plasticity rule that ensures convergence to a local optimum of the expected average reward.
Solving the distal reward problem through linkage of STDP and dopamine signaling
TLDR
This study emphasizes the importance of precise firing patterns in brain dynamics and suggests how a global diffusive reinforcement signal in the form of extracellular DA can selectively influence the right synapses at the right time.
Optimal Spike-Timing-Dependent Plasticity for Precise Action Potential Firing in Supervised Learning
TLDR
A supervised learning paradigm is used to derive a synaptic update rule that optimizes, by gradient ascent, the likelihood of postsynaptic firing at one or several desired firing times; the optimal strategy of up- and downregulating synaptic efficacies is found to depend on the relative timing between presynaptic spike arrival and desired postsynaptic firing.
Competitive Hebbian learning through spike-timing-dependent synaptic plasticity
TLDR
In modeling studies, it is found that this form of synaptic modification can automatically balance synaptic strengths to make postsynaptic firing irregular but more sensitive to presynaptic spike timing.
The tempotron: a neuron that learns spike timing–based decisions
TLDR
This work proposes a new, biologically plausible supervised synaptic learning rule that enables neurons to efficiently learn a broad range of decision rules, even when information is embedded in the spatiotemporal structure of spike patterns rather than in mean firing rates.
A cellular mechanism of reward-related learning
TLDR
It is proposed that stimulation of the substantia nigra when the lever is pressed induces a similar potentiation of cortical inputs to the striatum, positively reinforcing the learning of the behaviour by the rats.
Spike-Timing-Dependent Plasticity in Balanced Random Networks
TLDR
A novel STDP update rule is proposed, with a multiplicative dependence on the synaptic weight for depression and a power-law dependence for potentiation, and it is shown that this rule, when implemented in large, balanced networks of realistic connectivity and sparseness, is compatible with the asynchronous irregular activity regime.
Spike Timing-Dependent Synaptic Depression in the In Vivo Barrel Cortex of the Rat
TLDR
It is demonstrated that spike timing-dependent synaptic depression occurs in S1 in vivo, and is therefore a plausible plasticity mechanism in the sensory cortex.