Learn More
Detecting abnormalities in video is a challenging problem since the class of all irregular objects and behaviors is infinite and thus no (or by far not enough) abnormal training samples are available. Consequently, a standard setting is to find abnormalities without actually knowing what they are because we have not been shown abnormal examples during(More)
Object detection in cluttered, natural scenes has a high complexity since many local observations compete for object hypotheses. Voting methods provide an efficient solution to this problem. When Hough voting is extended to location and scale, votes naturally become lines through scale space due to the local scale-location-ambiguity. In contrast to this,(More)
Real-world scene understanding requires recognizing object categories in novel visual scenes. This paper describes a composition system that automatically learns structured, hierarchical object representations in an unsupervised manner without requiring manual segmentation or manual object localization. A central concept for learning object models in the(More)
The complexity of real world image categorization and scene analysis requires compositional strategies for object representation. This contribution establishes a compositional hierarchy by first performing a perceptual bottom-up grouping of edge pixels to generate salient contour curves. A subsequent recursive top-down grouping yields a hierarchy of(More)
The arrival of appearance-based image features has dramatically influenced the field of visual object recognition. Previous work has shown, however, that contour curvature and junctions are important for shape representation and detection. We investigate a local representation of contours for object detection that complements appearance-based information,(More)
We present a method for finding correlated components in audio and video signals. The new technique is applied to the task of identifying sources in video and separating them in audio. The concept of canonical correlation analysis is reformulated such that it incorporates nonnegativity and sparsity constraints on the coefficients of projection directions.(More)
The compositional nature of visual objects significantly limits their representation complexity and renders learning of structured object models tractable. Adopting this modeling strategy we both (i) automatically decompose objects into a hierarchy of relevant compositions and we (ii) learn such a compositional representation for each category without(More)