Ken'ichi Kumatani

In this paper, we consider an acoustic beamforming application where two speakers are simultaneously active. We construct one subband-domain beamformer in generalized sidelobe canceller (GSC) configuration for each source. In contrast to normal practice, we then jointly optimize the active weight vectors of both GSCs to obtain two output signals with …
Distant speech recognition (DSR) holds the promise of the most natural human computer interface because it enables man-machine interactions through speech, without the necessity of donning intrusive body- or head-mounted microphones. Recognizing distant speech robustly, however, remains a challenge. This contribution provides a tutorial overview of DSR …
This paper presents new superdirective beamforming algorithms based on the maximum negentropy (MN) criterion for distant automatic speech recognition. The MN beamformer is configured in the generalized sidelobe canceler structure, and uses the weights derived from a delay-and-sum beamformer as the quiescent weight vector. While satisfying the distortionless …
In this paper, we address a beamforming application based on the capture of far-field speech data from a single speaker in a real meeting room. After the position of the speaker is estimated by a speaker tracking system, we construct a subband-domain beamformer in generalized sidelobe canceller (GSC) configuration. In contrast to conventional practice, we …
Recently, demand for audio-visual speech recognition (AVSR) has increased as a way to make speech recognition systems robust to acoustic noise. There are two main research issues in AVSR: integration modeling that accounts for asynchronicity between modalities, and adaptive information weighting according …
This paper describes the 2006 lecture recognition system developed at the Interactive Systems Laboratories (ISL) for the individual head-microphone (IHM), single distant microphone (SDM), and multiple distant microphones (MDM) conditions. It was evaluated in the RT-06S rich transcription meeting evaluation sponsored by the US National Institute of Standards and …
This paper presents new filter bank design methods for subband adaptive beamforming. In this work, we design analysis and synthesis prototypes for modulated filter banks so as to minimize each aliasing term individually. We then drive the total response error to null by constraining these prototypes to be Nyquist(M) filters. Thereafter, those modulated …
Distant speech recognition (DSR) holds out the promise of providing a natural human computer interface in that it enables verbal interactions with computers without the necessity of donning intrusive body- or head-mounted devices. Recognizing distant speech robustly, however, remains a challenge. This paper provides an overview of DSR systems based on …
This paper presents an adaptive beamforming application based on the capture of far-field speech data from a single real speaker in a real meeting room. After the position of a speaker is estimated by a speaker tracking system, we construct a subband-domain beamformer in generalized sidelobe canceller (GSC) configuration. In contrast to conventional …
Distant speech recognition (DSR) holds out the promise of the most natural human computer interface because it enables man-machine interactions through speech, without the necessity of donning intrusive body- or head-mounted microphones. With the advent of the Microsoft Kinect, the application of non-uniform linear arrays to the DSR problem has become …