Reinforcement learning and its connections with neuroscience and psychology

Ajay Subramanian, Sharad Chitlangia, Veeky Baths
Neural Networks: The Official Journal of the International Neural Network Society


A Novel Multi-Factor Three-Step Feature Selection and Deep Learning Framework for Regional GDP Prediction: Evidence from China

A novel three-step feature selection and deep learning framework for regional GDP prediction achieves better forecasting performance than 14 benchmark models, with MAPE values below 5% in all cases.

Smart Scheduling of Electric Vehicles Based on Reinforcement Learning

This work creates and trains a neural network that uses a history of previous situations to identify the optimal charging station and time interval for recharging; the authors plan to train the DQN model on real-world data.

Using Deep Reinforcement Learning for Zero Defect Smart Forging

A digital twin-based optimization strategy is developed for the heating process of a forging line, automating the development of an optimal control policy that adjusts the power of the cooling coils in an induction oven based on temperature data observed from pyrometers.

AI-Olympics: A Multi-Agent Benchmark for Evaluation of Active Inference

This work introduces the AI-Olympics benchmark, a 2D physics simulator built from scratch along with multiple partially observed multi-agent game scenarios, and explores whether such internal abstraction can help generalize decision-making across multiple tasks.

A Distributional Perspective on Reinforcement Learning

This paper argues for the fundamental importance of the value distribution: the distribution of the random return received by a reinforcement learning agent, and designs a new algorithm which applies Bellman's equation to the learning of approximate value distributions.
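The value-distribution idea in this summary can be sketched with a sample-based distributional Bellman backup (a minimal illustration only; the atom count, reward, and discount below are placeholders, not the paper's actual algorithm):

```python
import numpy as np

# Hedged sketch: instead of a scalar Q(s, a), each state-action pair
# holds a set of sample "atoms" approximating the distribution of the
# random return Z(s, a). All values here are illustrative.

rng = np.random.default_rng(0)
n_atoms, gamma = 51, 0.99

# Sampled atoms of the successor return distribution Z(s', a*).
z_next = rng.normal(loc=10.0, scale=2.0, size=n_atoms)

reward = 1.0

# Distributional Bellman operator: Z(s, a) =_D r + gamma * Z(s', a*),
# applied atom-wise: each successor-return sample is discounted and
# shifted by the reward, yielding samples of the updated distribution.
z_target = reward + gamma * z_next

# Taking the expectation recovers the ordinary scalar Q-value backup.
q_target = z_target.mean()
```

The point of the sketch is that the backup acts on the whole distribution, while its mean still satisfies the familiar scalar Bellman equation.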

Reinforcement Learning

  • P. Stone, Encyclopedia of Machine Learning and Data Mining, 2010

Reinforcement Learning: An Introduction

This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, ranging from the field's intellectual foundations to the most recent developments and applications.

Behavioral and Neural Changes after Gains and Losses of Conditioned Reinforcers

Primate medial frontal cortex might mediate the behavioral effects of conditioned reinforcers and their losses: many neurons in the dorsomedial frontal cortex signaled the gain or loss resulting from choosing a particular action, as well as the changes in the animal's behavior that followed such gains or losses.

Neural Correlates of Biased Competition in Premotor Cortex

Single-unit recordings in dorsal premotor cortex while a monkey performed two delayed center-out reaching tasks suggest that decisions between actions are made through a biased competition taking place within a sensorimotor map of potential actions.

Role of Striatum in Updating Values of Chosen Actions

The results suggest that the striatum not only represents the values associated with potential choices before the animal's choice of action, but may also update the value of the chosen action once its outcome is revealed.

Neurons in the orbitofrontal cortex encode economic value

Neurons in the orbitofrontal cortex (OFC) encode the value of offered and chosen goods during economic choice, suggesting that economic choice is essentially choice between goods rather than choice between actions.

A Neural Substrate of Prediction and Reward

Findings in this work indicate that dopaminergic neurons in the primate, whose fluctuating output apparently signals changes or errors in the predictions of future salient and rewarding events, can be understood through quantitative theories of adaptive optimizing control.
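The prediction-error account in this summary corresponds to the temporal-difference (TD) error of those adaptive-control theories. A minimal sketch (state names, reward, discount, and learning rate are illustrative placeholders, not from the paper):

```python
# Hedged sketch: TD(0) learning of a cue's predictive value.
# The TD error delta is the quantity the summarized work relates to
# dopaminergic firing: large when reward is unexpected, shrinking
# as the cue comes to predict the reward.

gamma, alpha = 0.9, 0.1
V = {"cue": 0.0, "reward_state": 0.0}  # learned value estimates

for _ in range(100):
    # Transition: cue -> reward_state, delivering reward 1.0.
    delta = 1.0 + gamma * V["reward_state"] - V["cue"]  # TD error
    V["cue"] += alpha * delta                           # value update

# After learning, the error at reward time has largely vanished,
# mirroring the reported shift of dopaminergic responses away from
# fully predicted rewards.
final_delta = 1.0 + gamma * V["reward_state"] - V["cue"]
```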

A framework for mesencephalic dopamine systems based on predictive Hebbian learning

A theoretical framework is developed that shows how mesencephalic dopamine systems could distribute to their targets a signal representing information about future expectations, and that, through a simple influence on synaptic plasticity, fluctuations in dopamine release can act to change those predictions in an appropriate manner.
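The predictive Hebbian mechanism described here can be sketched as a three-factor learning rule in which a global prediction-error signal (modeling dopamine release) gates Hebbian weight updates. The feature vector, learning rate, and reward below are illustrative placeholders, not the paper's model:

```python
import numpy as np

# Hedged sketch: a scalar prediction-error signal broadcast to all
# synapses gates plasticity, so weights converge until the prediction
# matches the delivered reward.

alpha = 0.05
x = np.array([1.0, 0.0, 1.0])  # presynaptic stimulus features
w = np.zeros(3)                # synaptic weights; prediction = w . x

for _ in range(200):
    prediction = w @ x
    reward = 1.0
    delta = reward - prediction   # dopamine-like prediction error
    # Three-factor rule: presynaptic activity x, gated by global delta.
    w += alpha * delta * x

# The prediction converges toward the delivered reward; synapses with
# no presynaptic activity (x = 0) are left unchanged.
```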