Phoneme recognition using time-delay neural networks
The authors present a time-delay neural network (TDNN) approach to phoneme recognition which is characterized by two important properties: (1) using a three-layer arrangement of simple computing
Julius - an open source real-time large vocabulary recognition engine
EUROSPEECH2001: the 7th European Conference on Speech Communication and Technology, September 3-7, 2001, Aalborg, Denmark.
Voice conversion through vector quantization
TLDR
The authors propose a new voice conversion technique through vector quantization and spectrum mapping which makes it possible to precisely control voice individuality.
JNAS: Japanese speech corpus for large vocabulary continuous speech recognition research
TLDR
JNAS (Japanese Newspaper Article Sentences) is the first public Japanese speech corpus for large vocabulary continuous speech recognition (LVCSR) technology, designed to be comparable to the corpora used in the American and European LVCSR projects.
Blind Source Separation Combining Independent Component Analysis and Beamforming
TLDR
The proposed method achieves higher word recognition rates than the conventional ICA-based BSS method under all reverberant conditions.
Statistical Voice Conversion Techniques for Body-Conducted Unvoiced Speech Enhancement
TLDR
Voice conversion methods from non-audible murmur (NAM) to normal speech and to a whispered voice (NAM-to-Whisper) are proposed, where the acoustic features of body-conducted unvoiced speech are converted into those of natural voices in a probabilistic manner using Gaussian mixture models (GMMs).
Lip movement synthesis from speech based on Hidden Markov Models
TLDR
In subjective evaluation experiments, although differences in the audio-visual intelligibility between the synthesized lip parameters and the original ones are insignificant, the acceptability test to evaluate naturalness reflects the results of the objective evaluation.
One-to-Many and Many-to-One Voice Conversion Based on Eigenvoices
TLDR
This paper applies eigenvoice conversion (EVC) to both VC frameworks, i.e., one-to-many VC and many-to-one VC, and results of various experimental evaluations demonstrate the effectiveness of the proposed VC frameworks.
Evaluation of cross-language voice conversion based on GMM and STRAIGHT
EUROSPEECH2001: the 7th European Conference on Speech Communication and Technology, September 3-7, 2001, Aalborg, Denmark.
ATR Japanese speech database as a tool of speech recognition and synthesis
TLDR
A large-scale Japanese speech database is described; it has been used to develop algorithms in speech recognition and synthesis studies and to find acoustic, phonetic and linguistic evidence that will serve as basic data for speech technologies.