Corpus ID: 218630265

A Deeper Look at the Unsupervised Learning of Disentangled Representations in β-VAE from the Perspective of Core Object Recognition

@article{Sikka2020ADL,
  title={A Deeper Look at the Unsupervised Learning of Disentangled Representations in $\beta$-VAE from the Perspective of Core Object Recognition},
  author={Harshvardhan Digvijay Sikka},
  journal={ArXiv},
  year={2020},
  volume={abs/2005.07114}
}
The ability to recognize objects despite differences in appearance, known as Core Object Recognition, forms a critical part of human perception. While it is understood that the brain accomplishes Core Object Recognition through feedforward, hierarchical computations along the visual stream, the underlying algorithms that allow invariant representations to form downstream are still not well understood (DiCarlo et al., 2012). Various computational perceptual models have been…
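The analysis centers on the β-VAE objective, which reweights the KL term of the standard VAE evidence lower bound. As a minimal illustrative sketch (not the paper's code), assuming a PyTorch-style Gaussian encoder and a Bernoulli decoder over pixel data, the loss can be written as follows; the function and argument names here are hypothetical:

    # Minimal sketch of the beta-VAE objective (Higgins et al., 2017), assuming a
    # PyTorch-style Gaussian encoder and Bernoulli decoder; names are illustrative.
    import torch
    import torch.nn.functional as F

    def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
        # Per-example reconstruction error for pixel data in [0, 1].
        recon = F.binary_cross_entropy(x_recon, x, reduction="sum") / x.size(0)
        # Analytic KL( N(mu, sigma^2) || N(0, I) ), averaged over the batch.
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x.size(0)
        # beta > 1 strengthens the pressure toward the factorised unit-Gaussian
        # prior, the mechanism credited with encouraging disentangled factors.
        return recon + beta * kl

With beta = 1 this reduces to the standard VAE evidence lower bound; the works referenced below study how increasing beta trades reconstruction quality against disentanglement.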

References

SHOWING 1-10 OF 60 REFERENCES
Deep Neural Networks Rival the Representation of Primate IT Cortex for Core Visual Object Recognition
TLDR: These evaluations show that, unlike previous bio-inspired models, the latest DNNs rival the representational performance of IT cortex on this visual object recognition task; the authors also propose an extension of “kernel analysis” that measures generalization accuracy as a function of representational complexity.
Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations
TLDR: This paper theoretically shows that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data, and trains more than 12000 models covering most prominent methods and evaluation metrics on seven different data sets.
Comparing deep neural networks against humans: object recognition when the signal gets weaker
TLDR: The human visual system is found to be more robust to image manipulations like contrast reduction, additive noise, or novel eidolon-distortions than deep neural networks, indicating that there may still be marked differences in the way humans and current DNNs perform visual object recognition.
beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework
Learning an interpretable factorised representation of the independent data generative factors of the world without supervision is an important precursor for the development of artificial intelligence…
Learning Disentangled Representations with Semi-Supervised Deep Generative Models
TLDR: This work proposes to learn disentangled representations that encode distinct aspects of the data into separate variables using model architectures that generalise from standard VAEs, employing a general graphical model structure in the encoder and decoder.
Performance-optimized hierarchical models predict neural responses in higher visual cortex
TLDR: This work uses computational techniques to identify a high-performing neural network model that matches human performance on challenging object categorization tasks, and shows that performance optimization, applied in a biologically appropriate model class, can be used to build quantitative predictive models of neural processing.
Invariant Visual Object and Face Recognition: Neural and Computational Bases, and a Model, VisNet
  • E. Rolls
  • Front. Comput. Neurosci.
  • 2012
TLDR: A feature hierarchy model is described in which invariant representations can be built by self-organizing learning based on the temporal and spatial statistics of the visual input produced by objects as they transform in the world.
Towards a Definition of Disentangled Representations
TLDR: It is suggested that those transformations that change only some properties of the underlying world state, while leaving all other properties invariant, are what give exploitable structure to any kind of data.
A Closer Look at Disentangling in β-VAE
TLDR: This work examines a generalization of the Variational Autoencoder, β-VAE, for learning disentangled representations using variational inference, and shows that this incompatibility leads to non-monotonic inference performance in β-VAE with a finite optimal β.
Comparing state-of-the-art visual features on invariant object recognition tasks
TLDR: It is reported that most of these representations perform poorly on invariant recognition, but that one representation shows significant performance gains over two baseline representations, and it is shown how this approach can more deeply illuminate the strengths and weaknesses of different visual representations and thus guide progress on invariant object recognition.