Learn More
SUMMARY This paper describes a hands-free speech recognition technique based on acoustic model adaptation to reverberant speech. In hands-free speech recognition, the recognition accuracy is degraded by reverberation , since each segment of speech is affected by the reflection energy of the preceding segment. To compensate for the reflection signal we(More)
— We investigated a robust speech feature extraction method using kernel PCA (Principal Component Analysis) for distorted speech recognition. Kernel PCA has been suggested for various image processing tasks requiring an image model, such as denoising, where a noise-free image is constructed from a noisy input image [1]. Much research for robust speech(More)
This paper presents a voice conversion (VC) technique for noisy environments, where parallel exemplars are introduced to encode the source speech signal and synthesize the target speech signal. The parallel exemplars (dictionary) consist of the source exemplars and target exemplars, having the same texts uttered by the source and target speakers. The input(More)
This paper presents a voice conversion technique using Deep Belief Nets (DBNs) to build high-order eigen spaces of the source/target speakers, where it is easier to convert the source speech to the target speech than in the traditional cepstrum space. DBNs have a deep architecture that automatically discovers abstractions to maximally express the original(More)
—This paper presents a hands-free speech recognition method based on HMM composition and separation for speech contaminated not only by additive noise but also by an acoustic transfer function. The method realizes an improved user interface such that a user is not encumbered by microphone equipment in noisy and reverberant environments. The use of HMM(More)
Voice activity detection (VAD) plays an important role in speech processing including speech recognition, speech enhancement, and speech coding in noisy environments. We developed an evaluation framework for VAD in such environments, called Corpus and Environment for Noisy Speech Recognition 1 Concatenated (CENSREC-1-C). This framework consists of noisy(More)
This paper presents a robust speech recognition method based on the HMM composition for the noisy room acous-tics distorted speech. The method realizes an improved user interface such as the user is not encumbered by microphone equipments. The proposed HMM composition is obtained by naturally extending the HMM composition method of an additive noise to that(More)
For a hands-free speech interface, it is important to detect commands in spontaneous utterances. To discriminate commands from human-human conversations by acoustic features, it is efficient to consider the head and the tail of an utterance. The different characteristics of system requests and spontaneous utterances appear on these parts of an utterance.(More)