Separating Style and Content with Bilinear Models

@article{Tenenbaum2000SeparatingSA,
  title={Separating Style and Content with Bilinear Models},
  author={J. Tenenbaum and W. Freeman},
  journal={Neural Computation},
  year={2000},
  volume={12},
  pages={1247-1283}
}
Perceptual systems routinely separate content from style, classifying familiar words spoken in an unfamiliar accent, identifying a font or handwriting style across letters, or recognizing a familiar face or object seen under unfamiliar viewing conditions. [...] Key Method: We present a general framework for learning to solve two-factor tasks using bilinear models, which provide sufficiently expressive representations of factor interactions but can nonetheless be fit to data using efficient algorithms based on…
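As a rough illustration of the framework (not the authors' code), the sketch below fits the asymmetric form of a bilinear model, y_sc ≈ A_s b_c, with a single SVD of a style-stacked observation matrix. The array layout, the function name, and the rank parameter J are assumptions chosen here for the example.

```python
# Minimal sketch, assuming a complete style-by-content grid of K-dimensional
# observations stored in an array `obs` of shape (S, C, K).
import numpy as np

def fit_asymmetric_bilinear(obs, J):
    """Fit y_sc ~ A_s @ b_c. Returns style maps A (S, K, J) and content vectors B (J, C)."""
    S, C, K = obs.shape
    # Stack each style's K-dimensional observations vertically, giving an
    # (S*K) x C matrix whose columns index content classes.
    Y = obs.transpose(0, 2, 1).reshape(S * K, C)
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    A = (U[:, :J] * s[:J]).reshape(S, K, J)   # style-specific linear maps
    B = Vt[:J, :]                             # columns are content vectors
    return A, B

# Usage: reconstruct the observation for style s and content c
# A, B = fit_asymmetric_bilinear(obs, J=10)
# y_hat = A[s] @ B[:, c]
```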
Citations

Learning to Extract Parameterized Features by Predicting Transformed Images
TLDR
The transforming autoencoder is introduced: it is trained on pairs of images related by a transformation, with direct access to the underlying transformation it is meant to capture, so that its internal representations are forced to take on a desired meaning without their values being explicitly specified.
A New Bilinear Approach for Incremental Visual Learning and Recognition
  • H. Nakouri, M. Limam
  • Mathematics, Computer Science
  • Int. J. Pattern Recognit. Artif. Intell.
  • 2013
TLDR
This paper proposes a new incremental robust face recognition method based on separating the identity factor and the style factor using a symmetric bilinear approach, and achieves better performance in terms of recognition rate than other existing methods.
Learned Factorization Models to Explain Variability in Natural Image Sequences
TLDR
The contribution of this work is a demonstration of an adaptive mechanism that can automatically learn transformations in a structured model, enabling sources of variability to be factored out by inverting the learned model.
Visual Representations for Fine-grained Categorization
  • Ning Zhang
  • Computer Science
  • 2015
TLDR
This work proposes pose-normalized representations, which align training exemplars, either piecewise by part or globally for the whole object, effectively factoring out differences in pose and in camera viewing angle, and introduces the part-based RCNN method as an extension of the state-of-the-art RCNN object detection method for fine-grained categorization.
Visual object recognition using generative models of images
TLDR
The main conclusion is that generative models are not only useful for recognition, but can even outperform purely discriminative models on difficult recognition tasks.
Probabilistic bilinear models for appearance-based vision
TLDR
This work describes how learning the distributions using particle filters allows us to efficiently compute a probabilistic "novelty" term, and combines this approach with a new EM-based method for learning basis vectors that describe content-style mixing.
What Are the Invariant Occlusive Components of Image Patches? A Probabilistic Generative Approach
TLDR
The results show that probabilistic models that capture occlusions and invariances can be trained efficiently on image patches, and that the resulting encoding represents an alternative model for the neural encoding of images in the primary visual cortex.
Tied Factor Analysis for Face Recognition across Large Pose Differences
TLDR
A generative model is described that creates a one-to-many mapping from an idealized "identity" space to the observed data space, establishing a probabilistic distance metric that allows a full posterior over possible matches to be computed.
Separating style and content on a nonlinear manifold
Bilinear and multi-linear models have been successful in decomposing static image ensembles into perceptually orthogonal sources of variation, e.g., separation of style and content. If we consider…
Weakly-supervised Compositional Feature Aggregation for Few-shot Recognition
TLDR
The simple yet powerful Compositional Feature Aggregation module is presented as a weakly-supervised regularization for deep networks that can be conveniently plugged into existing models for end-to-end optimization while keeping the model size and computation cost nearly the same.

References

SHOWING 1-10 OF 82 REFERENCES
Learning bilinear models for two-factor problems in vision
  • W. Freeman, J. Tenenbaum
  • Computer Science
  • Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition
  • 1997
TLDR
This paper shows how bilinear models can be used to learn the style-content structure of a pattern analysis or synthesis problem, which can then be generalized to solve related tasks using different styles and/or content.
Separating Style and Content
TLDR
Two-factor problems are modeled with bilinear models which explicitly represent the two-factor structure, allowing three general tasks to be solved: extrapolation of a new style to unobserved content; classification of content observed in a new style; and translation of new content observed in a new style.
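For illustration only, here is a minimal least-squares sketch of the extrapolation/translation idea, reusing the hypothetical names from the sketch under the abstract above; the papers themselves describe more elaborate procedures (e.g., an EM algorithm for classifying content seen in a new style). Given a few content classes observed in a new style, estimate that style's matrix and then synthesize the remaining content classes.

```python
# Minimal sketch, assuming B (J, C) comes from fit_asymmetric_bilinear above.
import numpy as np

def adapt_new_style(B, known_idx, known_obs):
    """known_idx: content classes observed in the new style;
    known_obs: (len(known_idx), K) observations of those classes."""
    Bk = B[:, known_idx]                                  # (J, n_known)
    # Least squares for A_new (K x J) minimising ||A_new @ Bk - known_obs.T||_F
    A_new = np.linalg.lstsq(Bk.T, known_obs, rcond=None)[0].T
    return A_new

# Usage: render an unseen content class c in the new style
# A_new = adapt_new_style(B, [0, 3], obs_new)
# y_c = A_new @ B[:, c]
```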
A Simple Common Contexts Explanation for the Development of Abstract Letter Identities
TLDR
A self-organizing artificial neural network is presented that illustrates this idea and produces ALIs when presented with the most frequent words from a beginning reading corpus, as well as with artificial input.
Parametric Hidden Markov Models for Gesture Recognition
TLDR
The approach extends the standard hidden Markov model method of gesture recognition by including a global parametric variation in the output probabilities of the HMM states, and forms an expectation-maximization (EM) method for training the parametric HMM.
Connectionist generalization for production: An example from GridFont
TLDR
A connectionist network is designed for generalization of production, that is, to generate letterforms in a new font given just a few exemplars from that font.
Eigenfaces for Recognition
TLDR
A near-real-time computer system is described that can locate and track a subject's head, and then recognize the person by comparing characteristics of the face to those of known individuals, and that is easy to implement using a neural network architecture.
A low-dimensional representation of human faces for arbitrary lighting conditions
  • Peter W. Hallinan
  • Computer Science
  • 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition
  • 1994
TLDR
A low-dimensional model for human faces is proposed that can both synthesize a face image given lighting conditions and estimate lighting conditions given a face image.
RECOVERING INTRINSIC SCENE CHARACTERISTICS FROM IMAGES
We suggest that an appropriate role of early visual processing is to describe a scene in terms of intrinsic (veridical) characteristics -- such as range, orientation, reflectance, and incident illumination…
An Information-Maximization Approach to Blind Separation and Blind Deconvolution
TLDR
It is suggested that information maximization provides a unifying framework for problems in "blind" signal processing, and dependencies of information transfer on time delays are derived.
Statistical Approach to Shape from Shading: Reconstruction of Three-Dimensional Face Surfaces from Single Two-Dimensional Images
TLDR
It is suggested that the brain, through evolution or prior experience, has discovered that objects can be classified into lower-dimensional object classes according to their shape, and that extraction of shape from shading is then equivalent to the much simpler problem of parameter estimation in a low-dimensional space.