Learn More
This paper addresses the problem of voice activity detection (VAD) in noisy environments. The VAD method proposed in this paper integrates multiple speech features and a signal decision scheme, namely the speech periodic to aperiodic component ratio and a switching Kalman filter. The integration is carried out by using the weighted sum of likelihoods(More)
This paper presents a realtime system for analyzing group meetings that uses a novel omnidirectional camera-microphone system. The goal is to automatically discover the visual focus of attention (VFOA), i.e. "who is looking at whom", in addition to speaker diarization, i.e. "who is speaking and when". First, a novel tabletop sensing device for round-table(More)
This paper describes systems for the enhancement and recognition of distant speech recorded in reverberant rooms. Our speech enhancement (SE) system handles reverberation with blind deconvolution using linear filtering estimated by exploiting the temporal correlation of observed reverberant speech signals. Additional noise reduction is then performed using(More)
In this paper, we introduce a system for recognizing speech in the presence of multiple rapidly time-varying noise sources. The main components of the proposed approach are a model-based speech enhancement pre-processor and an adaptation technique to optimize the integration between the pre-processor and the recognizer. The speech enhancement pre-processor(More)
This paper addresses a speech recognition problem in non-stationary noise environments: the estimation of noise sequences. To solve this problem, we present a particle filter-based sequential noise estimation method for the front-end processing of speech recognition. In the proposed method, the particle filter is defined by a dynamical system based on(More)
This paper addresses the main speech recognition problem in non-stationary noise environments: the estimation of noise sequences. To solve this problem, we present a particle filter-based sequential noise estimation method for front-end processing of speech recognition in noise. In the proposed method, a noise sequence is estimated through a sequential(More)
In this paper, we propose a noise robust speech recognition method by combination of temporal domain singular value de-composition(SVD) based speech enhancement and Gaussian mixture model(GMM) based speech estimation. The bottleneck of GMM based approach is a noise estimation problem. For this noise estimation problem, we incorporated the adaptive noise(More)