Subspace segmentation is the task of segmenting data lying on multiple linear subspaces. Its applications in computer vision include motion segmentation in video, structure-from-motion, and image clustering. In this work, we describe a novel approach for subspace segmentation that uses probabilistic inference via a message-passing algorithm. We cast the…
In this paper, we present a large database of over 50,000 user-labeled videos collected from YouTube. We develop a compact representation called "tiny videos" that achieves high video compression rates while retaining the overall visual appearance of the video as it varies over time. We show that frame sampling using affinity propagation, an exemplar-based…
A new approach to sound localization, known as enhanced sound localization, is introduced, offering two major benefits over state-of-the-art algorithms. First, higher localization accuracy can be achieved compared to existing methods. Second, an estimate of the source orientation is obtained jointly, as a consequence of the proposed sound localization…
This paper proposes a new technique for face detection and lip feature extraction. A real-time field-programmable gate array (FPGA) implementation of the two proposed techniques is also presented. Face detection is based on a naive Bayes classifier that classifies an edge-extracted representation of an image. Using edge representation significantly reduces…
A variational inference algorithm for robust speech separation, capable of recovering the underlying speech sources even in the case of more sources than microphone observations, is presented. The algorithm is based upon a generative probabilistic model that fuses time-delay of arrival (TDOA) information with prior information about the speakers and…
This paper presents a general method for the integration of distributed microphone arrays for localization of a sound source. The recently proposed sound localization technique, known as SRP-PHAT, is shown to be a special case of the more general microphone array integration mechanism presented here. The proposed technique utilizes spatial likelihood…
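For readers unfamiliar with SRP-PHAT: it accumulates phase-transform-weighted cross-correlations (GCC-PHAT) over microphone pairs and searches for the source position that maximizes the total. The sketch below illustrates only that pairwise building block, estimating the delay between two signals. It is not the integration method of the paper above, whose details are not given here. The function names `dft` and `gcc_phat_delay` and the synthetic test signal are assumptions made for illustration, and the naive transform is meant for short demo signals only.

```python
import cmath
import math
import random


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short demo signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def gcc_phat_delay(sig_a, sig_b):
    """Estimate the delay (in samples) of sig_b relative to sig_a via GCC-PHAT.

    Assumes both signals have the same length and that the delay is circular,
    which holds for the synthetic example below but not for real recordings.
    """
    n = len(sig_a)
    spec_a = dft(sig_a)
    spec_b = dft(sig_b)
    # Cross-power spectrum, normalized so that only phase information remains.
    whitened = []
    for a, b in zip(spec_a, spec_b):
        c = b * a.conjugate()
        mag = abs(c)
        whitened.append(c / mag if mag > 1e-12 else 0j)
    # Back to the lag domain; the peak marks the most likely delay.
    corr = [v.real for v in dft(whitened, inverse=True)]
    peak = max(range(n), key=lambda i: corr[i])
    # Convert the circular lag index into a signed lag.
    return peak if peak <= n // 2 else peak - n


# Synthetic example (hypothetical data): white noise and a copy shifted by 5 samples.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(64)]
delayed = [reference[(t - 5) % 64] for t in range(64)]
estimated = gcc_phat_delay(reference, delayed)
print(estimated)
```

With a known microphone geometry, an SRP-PHAT style search would evaluate such pairwise correlations at the lags implied by each candidate source position and sum them across pairs. That step is omitted here because it depends on array details not stated in the abstract.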
A dual-microphone speech-signal enhancement algorithm, utilizing phase-error based filters that depend only on the phase of the signals, is proposed. This algorithm involves obtaining time-varying, or alternatively, time-frequency (TF), phase-error filters based on prior knowledge regarding the time difference of arrival (TDOA) of the speech source of…