Letter perception emerges from unsupervised deep learning and recycling of natural image features

@article{Testolin2017LetterPE,
  title={Letter perception emerges from unsupervised deep learning and recycling of natural image features},
  author={Alberto Testolin and Ivilin Peev Stoianov and Marco Zorzi},
  journal={Nature Human Behaviour},
  year={2017},
  volume={1},
  pages={657-664}
}
The use of written symbols is a major achievement of human cultural evolution. However, how abstract letter representations might be learned from vision is still an unsolved problem [1,2]. Here, we present a large-scale computational model of letter recognition based on deep neural networks [3,4], which develops a hierarchy of increasingly more complex internal representations in a completely unsupervised way by fitting a probabilistic, generative model to the visual input [5,6]. In line with the…
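
The model described in the abstract is a deep belief network whose layers are trained bottom-up by unsupervised generative learning. Below is a minimal sketch of that greedy layer-wise scheme in plain NumPy, using CD-1 updates and made-up layer sizes; it illustrates the technique only and makes no claim to match the paper's actual architecture, data, or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=10, lr=0.05, batch=100):
    """Train one RBM with one-step contrastive divergence (CD-1)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_v = np.zeros(n_visible)
    b_h = np.zeros(n_hidden)
    for _ in range(epochs):
        for i in range(0, len(data), batch):
            v0 = data[i:i + batch]
            # Positive phase: hidden activations given the data.
            p_h0 = sigmoid(v0 @ W + b_h)
            h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
            # Negative phase: one Gibbs step back down and up again.
            p_v1 = sigmoid(h0 @ W.T + b_v)
            p_h1 = sigmoid(p_v1 @ W + b_h)
            # Approximate gradient of the data log-likelihood.
            W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / len(v0)
            b_v += lr * (v0 - p_v1).mean(axis=0)
            b_h += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b_v, b_h

def train_dbn(data, layer_sizes):
    """Greedy layer-wise stacking: each RBM models the hidden
    activations of the layer below, all without labels."""
    layers, x = [], data
    for n_hidden in layer_sizes:
        W, b_v, b_h = train_rbm(x, n_hidden)
        layers.append((W, b_v, b_h))
        x = sigmoid(x @ W + b_h)  # propagate up to train the next layer
    return layers

# Toy usage with random binary "images" standing in for letter bitmaps.
images = (rng.random((1000, 784)) < 0.2).astype(float)
dbn = train_dbn(images, layer_sizes=[400, 100])
```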

Learning representation hierarchies by sharing visual features: a computational investigation of Persian character recognition with unsupervised deep learning

TLDR
A computational model of Persian character recognition based on deep belief networks is presented, where increasingly more complex visual features emerge in a completely unsupervised manner by fitting a hierarchical generative model to the sensory data.

General object-based features account for letter perception

TLDR
Behavioral-computational evidence is provided that the perception of letters depends on general visual features rather than a specialized feature space, and that several approaches to altering object-based features with letter specialization did not improve the match to human behavior.

Emergence of a compositional neural code for written words: Recycling of a convolutional neural network for reading

TLDR
A deep convolutional neural network of the ventral visual pathway is trained first to categorize pictures and then to recognize written words invariantly across case, font, and size, and it is shown that the model can account for many properties of the VWFA, particularly when a subset of units possesses a biased connectivity to word output units.

Translucency perception emerges in deep generative representations for natural image synthesis

Material perception is essential in planning interactions with the environment. The visual system relies on diagnostic image features to achieve material perception efficiently. However, discovering…

Unsupervised learning predicts human perception and misperception of gloss

TLDR
Linearly decoding specular reflectance from the model’s internal code predicts human gloss perception better than ground truth, supervised networks or control models, and it predicts, on an image-by-image basis, illusions of gloss perception caused by interactions between material, shape and lighting.
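
The linear-decoding analysis summarized here can be illustrated with a simple regularized linear readout from a model's internal code to the physical attribute of interest. The sketch below uses scikit-learn on synthetic placeholder data; it does not reproduce the study's actual model, renderings, or comparison with human judgments.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data: latent codes from some unsupervised model (images x units)
# and the ground-truth specular reflectance of each rendered image.
latents = rng.standard_normal((500, 256))
reflectance = latents @ rng.standard_normal(256) * 0.1 + rng.normal(0, 0.05, 500)

z_train, z_test, y_train, y_test = train_test_split(
    latents, reflectance, test_size=0.2, random_state=0)

# Linear readout: if the attribute is linearly decodable from the code,
# a simple regularized regression suffices.
decoder = Ridge(alpha=1.0).fit(z_train, y_train)
print("held-out R^2:", decoder.score(z_test, y_test))
```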

Understanding Character Recognition using Visual Explanations Derived from the Human Visual System and Deep Networks

TLDR
This work uses eye tracking to assay the spatial distribution of information hotspots for humans via fixation maps, uses an activation-mapping technique to obtain analogous distributions for deep networks via visualization maps, and finds a notable degree of congruence between the two.

An emergentist perspective on the origin of number sense

TLDR
It is shown that deep neural networks endowed with basic visuospatial processing exhibit a remarkable performance in numerosity discrimination before any experience-dependent learning, whereas unsupervised sensory experience with visual sets leads to subsequent improvement of number acuity and reduces the influence of continuous visual cues.

A developmental approach for training deep belief networks

TLDR
iDBN, an iterative learning algorithm for DBNs that jointly updates the connection weights across all layers of the model, is introduced, paving the way for using iDBN to model neurocognitive development.

Deep learning systems as complex networks

TLDR
This article proposes to study deep belief networks using techniques commonly employed in the study of complex networks, in order to gain some insights into the structural and functional properties of the computational graph resulting from the learning process.

Learning Numerosity Representations with Transformers: Number Generation Tasks and Out-of-Distribution Generalization

TLDR
It is shown that attention-based architectures operating at the pixel level can learn to produce well-formed images containing approximately a target number of items, even when that numerosity was not present in the training distribution.

References

Showing 1-10 of 79 references

Deep generative learning of location-invariant visual word recognition

TLDR
The results reveal that the efficient coding of written words—which was the model's learning objective—is largely based on letter-level information.

Emergence of simple-cell receptive field properties by learning a sparse code for natural images

TLDR
It is shown that a learning algorithm that attempts to find sparse linear codes for natural scenes will develop a complete family of localized, oriented, bandpass receptive fields, similar to those found in the primary visual cortex.
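
The Olshausen and Field result summarized above comes from minimizing a reconstruction-plus-sparsity objective over natural image patches. A minimal sketch of one such procedure follows, alternating ISTA inference of sparse codes with gradient steps on the dictionary; the parameters and random "patches" are illustrative, not the original paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_threshold(x, lam):
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def sparse_coding(patches, n_atoms=64, lam=0.1, n_outer=50, n_ista=30, lr=0.1):
    """Learn a dictionary D so that each patch ~ D @ a with sparse codes a."""
    dim = patches.shape[1]
    D = rng.standard_normal((dim, n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_outer):
        # Inference: ISTA steps find sparse codes for the current dictionary.
        A = np.zeros((len(patches), n_atoms))
        step = 1.0 / np.linalg.norm(D.T @ D, 2)
        for _ in range(n_ista):
            resid = A @ D.T - patches
            A = soft_threshold(A - step * resid @ D, step * lam)
        # Learning: gradient step on the dictionary, then renormalize atoms.
        D += lr * (patches - A @ D.T).T @ A / len(patches)
        D /= np.linalg.norm(D, axis=0)
    return D

# Toy usage with random 8x8 "patches"; with real natural-image patches the
# learned atoms become localized, oriented, bandpass filters.
patches = rng.standard_normal((2000, 64))
dictionary = sparse_coding(patches)
```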

Deep Unsupervised Learning on a Desktop PC: A Primer for Cognitive Scientists

TLDR
It is shown how simulations of deep unsupervised learning can be easily performed on a desktop PC by exploiting the processors of low-cost graphics cards without any specific programming effort, thanks to the use of high-level programming routines (available in MATLAB or Python).

Modeling language and cognition with deep unsupervised learning: a tutorial overview

TLDR
It is argued that the focus on deep architectures and generative (rather than discriminative) learning represents a crucial step forward for the connectionist modeling enterprise, because it offers a more plausible model of cortical learning as well as a way to bridge the gap between emergentist connectionist models and structured Bayesian models of cognition.

Learning Orthographic Structure With Sequential Generative Neural Networks

TLDR
This work investigates a sequential version of the restricted Boltzmann machine (RBM), a stochastic recurrent neural network that extracts high-order structure from sensory data through unsupervised generative learning and can encode contextual information in the form of internal, distributed representations.

Deep Learning of Representations for Unsupervised and Transfer Learning

Yoshua Bengio. ICML Unsupervised and Transfer Learning, 2012.
TLDR
This work explains why unsupervised pre-training of representations can be useful and how it can be exploited in the transfer learning scenario, where predictions must be made on examples that are not drawn from the same distribution as the training data.

Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects.

TLDR
Results suggest that rather than being exclusively feedforward phenomena, nonclassical surround effects in the visual cortex may also result from cortico-cortical feedback as a consequence of the visual system using an efficient hierarchical strategy for encoding natural images.
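
The predictive-coding account summarized above treats feedback connections as carrying top-down predictions and feedforward signals as carrying the residual prediction errors. A minimal single-level sketch under a linear generative model is shown below; the sizes and learning rates are illustrative and do not reproduce Rao and Ballard's full hierarchy.

```python
import numpy as np

rng = np.random.default_rng(0)

def infer_causes(image, U, r, k_r=0.05, n_iter=50):
    """Iteratively adjust hidden causes r to minimize image - U @ r."""
    for _ in range(n_iter):
        error = image - U @ r          # bottom-up residual (feedforward signal)
        r = r + k_r * (U.T @ error)    # top-down estimate updated by the error
    return r

# Illustrative sizes: a 256-pixel patch explained by 32 hidden causes.
U = 0.1 * rng.standard_normal((256, 32))   # generative (feedback) weights
image = rng.standard_normal(256)
r = infer_causes(image, U, np.zeros(32))

# Learning slowly adapts the generative weights to reduce the remaining
# residual, so apparent surround effects can arise from feedback rather
# than purely feedforward processing.
k_U = 0.01
U += k_U * np.outer(image - U @ r, r)
```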

Learning multiple layers of representation

...