• Corpus ID: 474635

Pixels to Voxels: Modeling Visual Representation in the Human Brain

@article{Agrawal2014PixelsTV,
  title={Pixels to Voxels: Modeling Visual Representation in the Human Brain},
  author={Pulkit Agrawal and Dustin Stansbury and Jitendra Malik and Jack L. Gallant},
  journal={ArXiv},
  year={2014},
  volume={abs/1407.5104}
}
The human brain is adept at solving difficult high-level visual processing problems such as image interpretation and object recognition in natural scenes. Over the past few years neuroscientists have made remarkable progress in understanding how the human brain represents categories of objects and actions in natural scenes. However, all current models of high-level human vision operate on hand annotated images in which the objects and actions have been assigned semantic tags by a human operator… 
Basic Level Categorization Facilitates Visual Object Recognition
TLDR
This work proposes a network optimization strategy inspired by both of the developmental trajectory of children's visual object recognition capabilities, and Bar (2003), who hypothesized that basic level information is carried in the fast magnocellular pathway through the prefrontal cortex (PFC) and then projected back to inferior temporal cortex (IT), where subordinate level categorization is achieved.
Using features from deep neural networks to model human categorization of natural images
TLDR
It is shown that representations derived from CNNs can be used to extend these models to natural images, and that a group of cognitive models based on these representations capture human behavior over a novel crowdsourced database of >500,000 human classification decisions.
Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence
TLDR
It was shown that the DNN captured the stages of human visual processing in both time and space from early visual areas towards the dorsal and ventral streams and provided an algorithmically informed view on the spatio-temporal dynamics of visual object recognition in the human visual brain.
Deep Neural Networks predict Hierarchical Spatio-temporal Cortical Dynamics of Human Visual Object Recognition
TLDR
It was shown that the DNN captured the stages of human visual processing in both time and space from early visual areas towards the dorsal and ventral streams and provided an algorithmically informed view on the spatio-temporal dynamics of visual object recognition in the human visual brain.
Do Deep Neural Networks Model Nonlinear Compositionality in the Neural Representation of Human-Object Interactions?
TLDR
The results suggest that a typical DNN representation contains encoding of compositional information for human-object interactions which goes beyond a linear combination of the encodings for the two components, thus suggesting that DNNs may be able to model this important property of biological vision.
Basic Level Categorizatiandon is Important: Modelling the Facilitation in Visual Object Recognition
TLDR
This work proposes a network optimization strategy inspired by Bar (2003), with the hypothesis that performing basic-level object categorization task first can facilitate the subordinate-level categorizationtask.
Different Goal-driven CNNs Affect Performance of Visual Encoding Models based on Deep Learning
TLDR
Classification task in computer vision can better fit the visual mechanism of human compared to visual segmentation task, and the VGG-based encoding model can achieve a higher performance in most voxels of ROIs.
Shape Features Improve the Encoding Performance of High-level Visual Cortex
TLDR
The comparative analysis of different visual areas indicates that the visual encoding model constructed by CNN that learns shape features can improve the encoding performance.
Computational mechanisms underlying cortical responses to the affordance properties of visual scenes
TLDR
A set of techniques for using CNNs to gain insights into the computational mechanisms underlying cortical responses, which map the sensory functions of the OPA onto a fully quantitative model that provides insights into its visual computations.
From convolutional neural networks to models of higher‐level cognition (and back again)
TLDR
This paper asks whether CNNs might also provide a basis for modeling higher‐level cognition, focusing on the core phenomena of similarity and categorization, and a toolkit for the integration of cognitively motivated constraints back into CNN training paradigms in computer vision and machine learning.
...
...

References

SHOWING 1-10 OF 28 REFERENCES
Bayesian Reconstruction of Natural Images from Human Brain Activity
Predicting neuronal responses during natural vision
TLDR
A general framework for fitting and validating nonlinear models of visual neurons using natural visual stimuli and compares two models for neurons in primary visual cortex: a nonlinear Fourier power model, which accounts for spatial phase invariant tuning, and a traditional linear model.
Functional delineation of the human occipito-temporal areas related to face and scene processing. A PET study.
TLDR
The results suggest that the right temporal pole is activated during the discrimination of familiar faces and scenes from unfamiliar ones, and is probably involved in the recognition of familiar objects.
The Fusiform Face Area: A Module in Human Extrastriate Cortex Specialized for Face Perception
TLDR
The data allow us to reject alternative accounts of the function of the fusiform face area (area “FF”) that appeal to visual attention, subordinate-level classification, or general processing of any animate or human forms, demonstrating that this region is selectively involved in the perception of faces.
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
TLDR
This paper proposes a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%.
A cortical representation of the local visual environment
TLDR
Evidence is presented that a particular area within human parahippocampal cortex is involved in a critical component of navigation: perceiving the local visual environment, and it is proposed that the PPA represents places by encoding the geometry of the local environment.
Location and spatial profile of category‐specific regions in human extrastriate cortex
TLDR
The anatomical consistency of each of the functionally defined regions across subjects and the spatial sharpness of their activation profiles within subjects highlight the fact that these regions constitute replicable and distinctive landmarks in the functional organization of the human brain.
The Pascal Visual Object Classes (VOC) Challenge
TLDR
The state-of-the-art in evaluated methods for both classification and detection are reviewed, whether the methods are statistically different, what they are learning from the images, and what the methods find easy or confuse.
...
...