This paper addresses the problem of voice activity detection (VAD) in noisy environments. The proposed VAD method integrates multiple speech features and a signal decision scheme, namely the speech periodic-to-aperiodic component ratio and a switching Kalman filter. The integration is carried out by using the weighted sum of likelihoods ...
This paper presents a real-time system for analyzing group meetings that uses a novel omnidirectional camera-microphone system. The goal is to automatically discover the visual focus of attention (VFOA), i.e., "who is looking at whom", in addition to speaker diarization, i.e., "who is speaking and when". First, a novel tabletop sensing device for round-table ...
In this paper, we introduce a system for recognizing speech in the presence of multiple rapidly time-varying noise sources. The main components of the proposed approach are a model-based speech enhancement pre-processor and an adaptation technique to optimize the integration between the pre-processor and the recognizer. The speech enhancement pre-processor ...
This paper addresses a speech recognition problem in non-stationary noise environments: the estimation of noise sequences. To solve this problem, we present a particle filter-based sequential noise estimation method for the front-end processing of speech recognition. In the proposed method, the particle filter is defined by a dynamical system based on ...
In this paper, we propose a noise-robust speech recognition method that combines temporal-domain singular value decomposition (SVD) based speech enhancement with Gaussian mixture model (GMM) based speech estimation. The bottleneck of the GMM-based approach is noise estimation. To address this noise estimation problem, we incorporated the adaptive noise ...
Voice activity detection (VAD) systems have been the object of continuous research during the last three decades. While single microphone systems cannot take advantage of certain spatial properties of speech signals, microphone array systems consisting of many elements based on beamforming techniques can be difficult to implement in reality due to cost and ...
This paper addresses the problem of voice activity detection in noisy environments. The proposed technique is based on a statistical model approach and estimates the statistical models sequentially without prior knowledge of the noise. The crucial factor in the statistical model-based approach is noise ...