Short-term plasticity as cause–effect hypothesis testing in distal reward learning

@article{Soltoggio2014ShorttermPA,
  title={Short-term plasticity as cause–effect hypothesis testing in distal reward learning},
  author={Andrea Soltoggio},
  journal={Biological Cybernetics},
  year={2014},
  volume={109},
  pages={75--94}
}
  • A. Soltoggio
  • Published 4 February 2014
  • Psychology
  • Biological Cybernetics
Asynchrony, overlaps, and delays in sensory–motor signals introduce ambiguity as to which stimuli, actions, and rewards are causally related. Only the repetition of reward episodes helps distinguish true cause–effect relationships from coincidental occurrences. In the model proposed here, a novel plasticity rule employs short- and long-term changes to evaluate hypotheses on cause–effect relationships. Transient weights represent hypotheses that are consolidated in long-term memory only when… 
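The mechanism the abstract describes — transient weights as cause–effect hypotheses, consolidated into long-term memory only when reward episodes repeat — can be sketched in a few lines. This is an illustrative toy, not the paper's actual rule: the network size, the decay and consolidation constants, and the immediate (rather than delayed) reward are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_OUT = 8, 2     # illustrative network size (assumed)
DECAY = 0.8            # per-step decay of transient weights (assumed value)
CONSOLIDATE = 0.1      # fraction of transient weight consolidated on reward (assumed)

w_long = np.zeros((N_IN, N_OUT))   # long-term memory: validated cause-effect relations
w_short = np.zeros((N_IN, N_OUT))  # transient weights: outstanding hypotheses

def plasticity_step(pre, post, reward):
    """One update: coincident activity proposes hypotheses; reward consolidates them."""
    global w_short, w_long
    w_short += np.outer(pre, post)       # coincidence -> new cause-effect hypothesis
    if reward:                           # reward validates whatever hypotheses
        w_long += CONSOLIDATE * w_short  # are still pending in short-term memory
    w_short *= DECAY                     # unconfirmed hypotheses fade away

for _ in range(300):
    pre = (rng.random(N_IN) < 0.3).astype(float)
    post = (rng.random(N_OUT) < 0.3).astype(float)
    # Illustrative contingency: the pair (input 0, output 0) causes reward
    plasticity_step(pre, post, reward=bool(pre[0] and post[0]))
```

Because only the rewarded pair is reliably present in short-term memory at every reward event, repetition lets its long-term weight outgrow those of coincidental pairs. The sketch delivers reward immediately; the paper's point is that the same scheme tolerates delays, since transient weights persist for several steps before fading.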
Solving the Distal Reward Problem with Rare Correlations
TLDR
The hypothesis that short-term synaptic plasticity may implement eligibility traces and thereby serve as a selection mechanism in promoting candidate synapses for long-term storage is suggested.
Editorial: Neural plasticity for rich and uncertain robotic information streams
TLDR
This Research Topic in Frontiers in Neurorobotics explored fundamental properties and dynamics of neural learning systems that are naturally immersed in a rich information flow and addressed key topics in this emerging and important field of research.
A Spiking Neural Model for Stable Reinforcement of Synapses Based on Multiple Distal Rewards
In this letter, a novel critic-like algorithm was developed to extend the synaptic plasticity rule described in Florian (2007) and Izhikevich (2007) in order to solve the problem of learning multiple…
Neural plasticity for rich and uncertain robotic information streams
TLDR
Realistic and more complete models of plasticity must account for delayed rewards, noisy and ambiguous data, emerging and novel input features during online learning, in order to extract autonomously relationships and features in time and space.
Fully Autonomous Real-Time Autoencoder-Augmented Hebbian Learning through the Collection of Novel Experiences
TLDR
This investigation shows how RAAHN can learn to navigate from scratch entirely on its own, without an autopilot, and becomes a powerful new Hebbian-centered approach to learning from sparse reinforcement with broad potential applications.
Online Fast Adaptation and Knowledge Accumulation: a New Approach to Continual Learning
TLDR
It is empirically shown that Continual-MAML, an online extension of the popular MAML algorithm, is better suited to the new scenario of OSAKA, where an agent must quickly solve new (out-of-distribution) tasks, while also requiring fast remembering.
Online Fast Adaptation and Knowledge Accumulation (OSAKA): a New Approach to Continual Learning
TLDR
It is shown in an empirical study that Continual-MAML, an online extension of the popular MAML algorithm, is better suited to the new scenario than the aforementioned methodologies, including standard continual learning and meta-learning approaches.

References

Showing 1-10 of 135 references
Solving the Distal Reward Problem with Rare Correlations
TLDR
The hypothesis that short-term synaptic plasticity may implement eligibility traces and thereby serve as a selection mechanism in promoting candidate synapses for long-term storage is suggested.
Solving the distal reward problem through linkage of STDP and dopamine signaling
TLDR
This study emphasizes the importance of precise firing patterns in brain dynamics and suggests how a global diffusive reinforcement signal in the form of extracellular DA can selectively influence the right synapses at the right time.
Functional Requirements for Reward-Modulated Spike-Timing-Dependent Plasticity
TLDR
It is shown that reward-modulated spike-timing-dependent plasticity is capable of learning motor trajectories with a temporal resolution of tens of milliseconds and the relation to temporal difference learning, the relevance of block-based learning paradigms, and the limitations of learning with a critic are discussed.
Synaptic and spiking dynamics underlying reward reversal in the orbitofrontal cortex.
TLDR
A model at the integrate-and-fire neuronal level of the synaptic and spiking mechanisms which can hold an expectation of a reward rule in working memory, and can reverse the reward rule if expected rewards are not obtained is described.
Rare Neural Correlations Implement Robotic Conditioning with Delayed Rewards and Disturbances
TLDR
The present study demonstrates the use of rare neural correlations in making correct associations between rewards and previous cues or actions, and models realistic conditioning in humans and animals and implements similar behaviors in neuro-robotic platforms.
An Embodied Model of Learning, Plasticity, and Reward
TLDR
A neural network model of the dopaminergic system based on observed anatomical and physiological properties of the primate midbrain is described and discussed, leading to the conditioning of reward-related behaviors through the interaction between the robot and its surroundings.
The dynamics of operant conditioning.
TLDR
A minimal set of principles based on short-term and long-term memory mechanisms that can explain the major static and dynamic properties of operant behavior in both single-choice and multiresponse situations are proposed.
Spatio-Temporal Credit Assignment in Neuronal Population Learning
TLDR
This work presents a model of plasticity induction for reinforcement learning in a population of leaky integrate and fire neurons which is based on a cascade of synaptic memory traces and argues that, due to their comparative robustness, synaptic plasticity cascades are attractive basic models of reinforcementlearning in the brain.
What is reinforced by phasic dopamine signals?