Learning recognition and segmentation of 3-D objects from 2-D images

  • Juyang Weng, Narendra Ahuja, Thomas S. Huang
  • 1993 (4th) International Conference on Computer Vision
A framework called Cresceptron is introduced for automatic algorithm design through learning of concepts and rules, thus deviating from the traditional mode in which humans specify the rules constituting a vision algorithm. [...] The Cresceptron uses a hierarchical structure to grow networks automatically, adaptively, and incrementally through learning. The Cresceptron makes it possible to generalize training exemplars to other perceptually equivalent items. Experiments with a variety of real-world…
Learning Recognition and Segmentation Using the Cresceptron
This paper presents a framework called Cresceptron, which recognizes and segments image patterns that are similar to those learned. It uses a stochastic distortion model and view-based interpolation, which allows viewpoints that are moderately different from those used in learning.
An Automated Perceptual Learning Algorithm for Determining Structure-Based Visual Prototypes of Objects from Internet-Scale Data
This dissertation investigated the open problem of constructing part-based object representation models from very large scale image databases in an unsupervised manner and defined a network model from a full Bayesian setting that is able to find visual templates of the same part with dramatically different visual appearances.
Learning object recognition models from images
  • Arthur R. Pope, D. Lowe
  • Mathematics, Computer Science
    1993 (4th) International Conference on Computer Vision
  • 1993
The authors show how to learn a model from a series of training images depicting a class of objects. The resulting model represents a probability distribution over the variation in object appearance, so it can recognize objects as similar in general appearance while distinguishing them by their detailed features.
SHOSLIF: a framework for object recognition from images
  • J. Weng
  • Computer Science
    Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94)
  • 1994
A new framework called self-organizing hierarchical optimal subspace learning and inference framework (SHOSLIF) is introduced for recognizing and segmenting real-world objects from images. It
Detection of 3D objects in cluttered scenes using hierarchical eigenspace
A novel method to detect three-dimensional objects in arbitrary poses and sizes from a complex image and to simultaneously measure their poses and size using appearance matching is proposed.
A neural-network appearance-based 3-D object recognition using independent component analysis
This paper presents results on appearance-based three-dimensional (3-D) object recognition (3DOR) using a neural-network architecture based on independent component analysis (ICA). The results suggest that ICA does not necessarily give better results than PCA and that the application of ICA is highly data dependent.
Cresceptron and Shoslif: toward Comprehensive Visual Learning
Comprehensive visual learning concerns a unified theory and methodology for computer vision systems to comprehensively learn the visual world with only minimal hand-crafted rules about the world. This
Hierarchical Discriminant Analysis for Image Retrieval
  • D. Swets, J. Weng
  • Computer Science
    IEEE Trans. Pattern Anal. Mach. Intell.
  • 1999
The self-organizing hierarchical optimal subspace learning and inference framework (SHOSLIF) system uses the theories of optimal linear projection for optimal feature derivation and a hierarchical structure to achieve logarithmic retrieval complexity.
A hierarchical active binocular robot vision architecture for scene exploration and object appearance learning
This thesis presents an investigation of a computational model of hierarchical visual behaviours within an active binocular robot vision architecture. The robot vision system is able to localise
Model-based learning of segmentations
  • A. Hoogs, R. Bajcsy
  • Computer Science
    Proceedings of 13th International Conference on Pattern Recognition
  • 1996
Results indicate that the inclusion of the segmentation information significantly improves pose adjustment accuracy over using purely geometric information for model appearance.


A network that learns to recognize three-dimensional objects
A scheme is developed, based on the theory of approximation of multivariate functions, that learns from a small set of perspective views a function mapping any viewpoint to a standard view, and a network equivalent to this scheme will 'recognize' the object on which it was trained from any viewpoint.
Perceiving shape from shading.
The prevalence of countershading in a variety of species, including many fishes, suggests that shading may be a crucial source of information about three-dimensional shape.
Neocognitron: A neural network model for a mechanism of visual pattern recognition
A recognition with a large-scale network is simulated on a PDP-11/34 minicomputer and is shown to have a great capability for visual pattern recognition and can be trained to recognize handwritten Arabic numerals even with considerable deformations in shape.
Size in the visual processing of faces and words.
We studied the influence of variations of stimulus size upon recognition of words and faces. Size played an important role in the recognition of faces but was irrelevant to the recognition of words,
Some results on translation invariance in the human visual system.
The result suggests that the visual system does not apply a global transposition transformation to the retinal image to compensate for translations, and it is proposed that the visual system decomposes the image into simple features which themselves are more-or-less translation invariant.
An introduction to computing with neural nets
This paper provides an introduction to the field of artificial neural nets by reviewing six important neural net models that can be used for pattern classification and exploring how some existing classification and clustering algorithms can be performed using simple neuron-like components.
Learning and memory: a biological view
Parametrization for data approximation, L. Alt; a vector spline approximation with application to meteorology, L. Amodei and M.N. Benbourhim; kernel estimation in change-point hazard rate models, A.
ALVINN: An Autonomous Land Vehicle in a Neural Network
ALVINN (Autonomous Land Vehicle In a Neural Network) is a 3-layer back-propagation network designed for the task of road following that can effectively follow real roads under certain field conditions.
Size in the visual processing of faces and words.
Analysis of variations of stimulus size revealed that although irrelevant to recognition, size of words was nevertheless encoded, with some consequences similar to those for recognition of faces.
Eye, brain, and vision
This work examines the mechanisms by which we perceive colour, depth and movement, and the function of the fibres connecting the two halves of the brain. The author describes how the visual circuits