Using goal-driven deep learning models to understand sensory cortex

@article{Yamins2016UsingGD,
  title={Using goal-driven deep learning models to understand sensory cortex},
  author={Daniel Yamins and James J. DiCarlo},
  journal={Nature Neuroscience},
  year={2016},
  volume={19},
  pages={356-365}
}
Fueled by innovation in the computer vision and artificial intelligence communities, recent developments in computational neuroscience have used goal-driven hierarchical convolutional neural networks (HCNNs) to make strides in modeling neural single-unit and population responses in higher visual cortical areas. In this Perspective, we review the recent progress in a broader modeling context and describe some of the key technical innovations that have supported it. We then outline how the goal… 
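
To make the goal-driven modeling recipe reviewed here concrete, the sketch below fits a regularized linear readout from one layer of a task-optimized network to recorded neural responses and scores cross-validated predictivity. It is a minimal illustration only: the feature and response arrays are synthetic stand-ins, and the dimensions and ridge penalties are arbitrary choices, not those used in the Perspective or the studies it reviews.

```python
# Sketch of goal-driven neural predictivity: map model-layer features to recorded
# responses with a cross-validated linear readout, then score held-out predictions.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n_images, n_features, n_neurons = 500, 256, 40

features = rng.standard_normal((n_images, n_features))          # stand-in layer activations per image
true_map = rng.standard_normal((n_features, n_neurons)) * 0.1   # hypothetical ground-truth mapping
responses = features @ true_map + rng.standard_normal((n_images, n_neurons))  # stand-in "recorded" rates

predictivity = []
for n in range(n_neurons):
    model = RidgeCV(alphas=np.logspace(-2, 4, 13))               # regularized linear readout per neuron
    pred = cross_val_predict(model, features, responses[:, n], cv=5)
    predictivity.append(np.corrcoef(pred, responses[:, n])[0, 1])

print(f"median cross-validated predictivity: {np.median(predictivity):.2f}")
```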

How can deep learning advance computational modeling of sensory information processing?

TLDR
It is discussed how DNNs are amenable to new model-comparison techniques that support stronger conclusions about the computational mechanisms underlying sensory information processing.

A Residual Neural-Network Model to Predict Visual Cortex Measurements

TLDR
The advantage of using a residual neural network of only 20 layers for this task is that the earlier stages of the network can be trained more easily, which allows more layers to be added at those earlier stages.
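
A minimal residual block in PyTorch illustrates the skip connection that makes the earlier stages of such a network easier to train; the channel count and layer choices below are illustrative, not those of the cited model.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)   # identity shortcut: gradients flow directly to earlier stages

block = ResidualBlock(64)
print(block(torch.randn(1, 64, 32, 32)).shape)   # torch.Size([1, 64, 32, 32])
```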

Learning object representations in deep neural networks

TLDR
This thesis investigates the role of training algorithms (supervised versus unsupervised) on the representational similarity between computational models and brain data from human inferior temporal cortex, and shows that one implementation of unsupervised contrastive learning yields more brain-like representations than the selected supervised learning method.
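
For readers unfamiliar with the unsupervised objective being compared, the snippet below sketches a simplified InfoNCE-style contrastive loss: two augmented views of the same image are pulled together in embedding space while other images in the batch are pushed apart. The temperature, batch size, and symmetric formulation are illustrative assumptions, not the thesis's exact implementation.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """z1, z2: (batch, dim) embeddings of two augmented views of the same images."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature        # cosine similarities between all cross-view pairs
    targets = torch.arange(z1.size(0))        # the positive pair for image i is its own other view
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

loss = contrastive_loss(torch.randn(32, 128), torch.randn(32, 128))
print(loss.item())
```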

A shallow residual neural network to predict the visual cortex response

TLDR
The advantage of using a shallow residual neural network for this task is that the earlier stages of the network can be trained accurately, which allows more layers to be added at those earlier stages and improves the prediction of visual brain activity.

Deep Reinforcement Learning Models Predict Visual Responses in the Brain: A Preliminary Result

TLDR
This work uses reinforcement learning to train neural network models to play a 3D computer game and finds that these reinforcement learning models achieve neural response prediction accuracy in early visual areas at levels comparable to those achieved by a supervised neural network model.

Guiding visual attention in deep convolutional neural networks based on human eye movements

TLDR
This study uses human eye-tracking data to directly modify training examples and thereby guide the models' visual attention during object recognition in natural images either toward or away from the focus of human fixations, and demonstrates that the proposed guided-focus manipulation works as intended and that non-human-like models focus on image parts significantly dissimilar to those attended by humans.
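
The sketch below illustrates the general manipulation described in this summary: a smoothed fixation-density map built from human gaze points weights an image so that regions far from (or, if inverted, near) human fixations are attenuated before training. The Gaussian width, the toward/away switch, and the simple multiplicative masking are illustrative assumptions, not the study's exact protocol.

```python
import numpy as np

def fixation_mask(shape, fixations, sigma=20.0, toward=True):
    """shape: (H, W); fixations: list of (row, col) gaze points."""
    h, w = shape
    rows, cols = np.mgrid[0:h, 0:w]
    density = np.zeros(shape, dtype=float)
    for r, c in fixations:                    # Gaussian blob centered on each fixation
        density += np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2 * sigma ** 2))
    density /= density.max()
    return density if toward else 1.0 - density

image = np.random.rand(128, 128, 3)           # stand-in natural image
mask = fixation_mask(image.shape[:2], fixations=[(40, 60), (90, 30)], toward=True)
guided = image * mask[..., None]              # attenuate pixels away from human fixations
```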

Deep learning and the Global Workspace Theory

Generalization in data-driven models of primary visual cortex

TLDR
This model, with its novel readout, sets a new state of the art for neural response prediction in mouse visual cortex from natural images, generalizes between animals, and captures characteristic cortical features better than current task-driven pre-training approaches such as VGG16.
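
Data-driven models of this kind typically pair a shared convolutional "core" with a per-neuron readout. The code below is a generic factorized readout sketch (separate spatial and feature weights for each neuron), not the cited model's specific readout; the core, image size, and neuron count are placeholders.

```python
import torch
import torch.nn as nn

class FactorizedReadout(nn.Module):
    def __init__(self, channels: int, height: int, width: int, n_neurons: int):
        super().__init__()
        self.spatial = nn.Parameter(torch.randn(n_neurons, height, width) * 0.01)  # where each neuron looks
        self.features = nn.Parameter(torch.randn(n_neurons, channels) * 0.01)      # what it is tuned to
        self.bias = nn.Parameter(torch.zeros(n_neurons))

    def forward(self, core_out):                        # core_out: (batch, C, H, W)
        pooled = torch.einsum('bchw,nhw->bnc', core_out, self.spatial)
        return torch.einsum('bnc,nc->bn', pooled, self.features) + self.bias

core = nn.Sequential(nn.Conv2d(1, 16, 9), nn.ELU())     # stand-in for a shared nonlinear core
readout = FactorizedReadout(16, 56, 56, n_neurons=100)
rates = readout(core(torch.randn(8, 1, 64, 64)))        # predicted responses: (8, 100)
print(rates.shape)
```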
...

References

SHOWING 1-10 OF 74 REFERENCES

Human-level control through deep reinforcement learning

TLDR
This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
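
The sketch below shows the deep Q-learning update at the heart of this line of work: a network maps raw observations to action values and is regressed toward a bootstrapped one-step target computed with a slowly updated target network. The network sizes, discount factor, and random transition batch are illustrative stand-ins for the original agent and its replay memory.

```python
import torch
import torch.nn as nn

n_actions, gamma = 4, 0.99
q_net = nn.Sequential(nn.Linear(84, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net = nn.Sequential(nn.Linear(84, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net.load_state_dict(q_net.state_dict())           # frozen copy used for the TD target

obs = torch.randn(32, 84)                                 # batch of observations (stand-in for frames)
actions = torch.randint(0, n_actions, (32,))
rewards = torch.randn(32)
next_obs = torch.randn(32, 84)
done = torch.zeros(32)

with torch.no_grad():                                     # bootstrapped one-step target
    target = rewards + gamma * (1 - done) * target_net(next_obs).max(dim=1).values
q_taken = q_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
loss = nn.functional.smooth_l1_loss(q_taken, target)
loss.backward()
```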

Performance-optimized hierarchical models predict neural responses in higher visual cortex

TLDR
This work uses computational techniques to identify a high-performing neural network model that matches human performance on challenging object categorization tasks and shows that performance optimization—applied in a biologically appropriate model class—can be used to build quantitative predictive models of neural processing.

Deep Neural Networks Reveal a Gradient in the Complexity of Neural Representations across the Ventral Stream

TLDR
It is quantitatively shown that there indeed exists an explicit gradient for feature complexity in the ventral pathway of the human brain, and this provides strong support for the hypothesis that object categorization is a guiding principle in the functional organization of the primate ventral stream.

Hierarchical Modular Optimization of Convolutional Networks Achieves Representations Similar to Macaque IT and Human Ventral Stream

TLDR
This work constructs models of the ventral stream using a novel optimization procedure for category-level object recognition problems and produces RDMs resembling both macaque IT and the human ventral stream, supporting the long-standing functional hypothesis that the ventral visual stream is a hierarchically arranged series of processing stages optimized for visual object recognition.
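
The RDM comparison used in this line of work can be sketched as follows: a representational dissimilarity matrix (one minus the Pearson correlation between response patterns for each image pair) is computed for model features and for neural data, and the two matrices' off-diagonal entries are compared with a rank correlation. The arrays below are synthetic placeholders for model units and recorded IT sites.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
model_features = rng.standard_normal((60, 512))    # 60 images x model units (stand-in)
it_responses = rng.standard_normal((60, 168))      # 60 images x recorded IT sites (stand-in)

rdm_model = squareform(pdist(model_features, metric='correlation'))
rdm_it = squareform(pdist(it_responses, metric='correlation'))

iu = np.triu_indices(60, k=1)                       # compare off-diagonal entries only
rho, _ = spearmanr(rdm_model[iu], rdm_it[iu])
print(f"model-IT RDM similarity (Spearman rho): {rho:.2f}")
```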

Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition

TLDR
These evaluations show that, unlike previous bio-inspired models, the latest DNNs rival the representational performance of IT cortex on this visual object recognition task; the work also proposes an extension of “kernel analysis” that measures generalization accuracy as a function of representational complexity.

Hierarchical models of object recognition in cortex

TLDR
A new hierarchical model consistent with physiological data from inferotemporal cortex that accounts for this complex visual task and makes testable predictions is described.
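
Hierarchical models of this kind alternate "simple"-cell template matching, which builds selectivity, with "complex"-cell max pooling, which builds tolerance to position and scale. The snippet below sketches two such stages; the random filters are stand-ins for the model's hand-designed or learned templates.

```python
import torch
import torch.nn.functional as F

image = torch.randn(1, 1, 64, 64)                   # grayscale input (stand-in)
s1_filters = torch.randn(8, 1, 7, 7)                # oriented-template stand-ins

s1 = F.relu(F.conv2d(image, s1_filters, padding=3))      # S1: selectivity via template matching
c1 = F.max_pool2d(s1, kernel_size=4, stride=2)           # C1: tolerance via local max pooling
s2_filters = torch.randn(16, 8, 5, 5)
s2 = F.relu(F.conv2d(c1, s2_filters, padding=2))         # S2: composite-feature templates
c2 = F.max_pool2d(s2, kernel_size=s2.shape[-1])          # C2: global max -> position-tolerant code
print(c2.flatten(1).shape)                                # one feature vector per image
```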

Going deeper with convolutions

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC14).
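
The building block of this architecture is the Inception module: parallel 1x1, 3x3, and 5x5 convolutions plus a pooled branch, concatenated along the channel axis so each stage can mix filter sizes. The channel counts below are illustrative, not those of the published network.

```python
import torch
import torch.nn as nn

class InceptionModule(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.b1 = nn.Conv2d(in_ch, 16, kernel_size=1)
        self.b3 = nn.Sequential(nn.Conv2d(in_ch, 8, 1), nn.ReLU(), nn.Conv2d(8, 16, 3, padding=1))
        self.b5 = nn.Sequential(nn.Conv2d(in_ch, 4, 1), nn.ReLU(), nn.Conv2d(4, 8, 5, padding=2))
        self.pool = nn.Sequential(nn.MaxPool2d(3, stride=1, padding=1), nn.Conv2d(in_ch, 8, 1))

    def forward(self, x):
        # concatenate the parallel branches along the channel dimension
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.pool(x)], dim=1)

out = InceptionModule(32)(torch.randn(1, 32, 28, 28))
print(out.shape)   # torch.Size([1, 48, 28, 28])
```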

Deep Supervised, but Not Unsupervised, Models May Explain IT Cortical Representation

TLDR
The results suggest that explaining IT requires computational features trained through supervised learning to emphasize the behaviorally important categorical divisions prominently reflected in IT.

Context-dependent computation by recurrent dynamics in prefrontal cortex

TLDR
This work studies prefrontal cortex activity in macaque monkeys trained to flexibly select and integrate noisy sensory inputs towards a choice, and finds that the observed complexity and functional roles of single neurons are readily understood in the framework of a dynamical process unfolding at the level of the population.
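
The task structure behind this study can be sketched as a recurrent network that receives two noisy evidence streams plus a context cue indicating which stream is relevant, and reports a choice at the end of the trial. The untrained network below only illustrates the input/output setup; the paper analyzes networks trained on this task alongside recorded prefrontal populations, and all sizes here are arbitrary.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=4, hidden_size=100, nonlinearity='tanh', batch_first=True)
readout = nn.Linear(100, 1)                        # scalar choice signal

T, batch = 50, 16
motion = 0.1 + 0.5 * torch.randn(batch, T, 1)      # noisy evidence stream A
color = -0.1 + 0.5 * torch.randn(batch, T, 1)      # noisy evidence stream B
context = torch.cat([torch.ones(batch, T, 1), torch.zeros(batch, T, 1)], dim=2)  # "attend to A" cue

inputs = torch.cat([motion, color, context], dim=2)
hidden_states, _ = rnn(inputs)                     # population dynamics unfold over the trial
choice = readout(hidden_states[:, -1])             # decision read out at the end of the trial
print(choice.shape)                                # torch.Size([16, 1])
```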

Metamers of the ventral stream

TLDR
A population model for mid-ventral processing is developed, in which nonlinear combinations of V1 responses are averaged in receptive fields that grow with eccentricity, providing a quantitative framework for assessing the capabilities and limitations of everyday vision.
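
The pooling idea in this model can be sketched in one dimension: local "energy" responses (squared filter outputs standing in for nonlinear combinations of V1 responses) are averaged within windows whose size grows with distance from the fovea, so images matched on these averages become metamers in the periphery. The filter, the window-scaling factor, and the 1-D geometry are simplifications for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.standard_normal(512)                        # 1-D stand-in for an image strip through the fovea
energy = np.convolve(signal, np.array([1.0, -1.0]), mode='same') ** 2   # crude local-contrast energy

fovea, scaling = 256, 0.25                               # window radius = scaling * eccentricity
pooled = np.empty_like(energy)
for x in range(len(energy)):
    radius = max(1, int(scaling * abs(x - fovea)))       # pooling regions grow with eccentricity
    lo, hi = max(0, x - radius), min(len(energy), x + radius + 1)
    pooled[x] = energy[lo:hi].mean()                     # coarse summary in the periphery, fine near the fovea

print(pooled[:5], pooled[250:255])
```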
...