Phoneme recognition using time-delay neural networks
TLDR
The authors present a time-delay neural network (TDNN) approach to phoneme recognition that is characterized by two important properties: (1) using a three-layer arrangement of simple computing units, a hierarchy can be constructed that allows for the formation of arbitrary nonlinear decision surfaces, which the TDNN learns automatically using error backpropagation; and (2) the time-delay arrangement enables the network to discover acoustic-phonetic features and the temporal relationships between them independently of position in time.
  • 2,532
  • 157
Julius - an open source real-time large vocabulary recognition engine
TLDR
EUROSPEECH2001: the 7th European Conference on Speech Communication and Technology, September 3-7, 2001, Aalborg, Denmark.
  • 605
  • 52
Voice conversion through vector quantization
TLDR
The authors propose a new voice conversion technique based on vector quantization and spectrum mapping that allows precise control of voice individuality.
  • 551
  • 35
Blind Source Separation Combining Independent Component Analysis and Beamforming
TLDR
We describe a new method of blind source separation (BSS) on a microphone array combining subband independent component analysis (ICA) and beamforming.
  • 196
  • 17
JNAS: Japanese speech corpus for large vocabulary continuous speech recognition research
TLDR
We present the first public Japanese speech corpus for large vocabulary continuous speech recognition, which we have titled JNAS (Japanese Newspaper Article Sentences).
  • 238
  • 16
Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum
TLDR
In the voice conversion algorithm based on the Gaussian mixture model (GMM) applied to STRAIGHT, the quality of the converted speech is degraded because the converted spectrum is excessively smoothed.
  • 153
  • 13
Statistical Voice Conversion Techniques for Body-Conducted Unvoiced Speech Enhancement
TLDR
We present statistical voice conversion (VC) methods for enhancing body-conducted unvoiced speech detected with a nonaudible murmur (NAM) microphone.
  • 136
  • 9
Lip movement synthesis from speech based on Hidden Markov Models
TLDR
This paper proposes a novel method for synthesizing lip movement from speech input, based on Hidden Markov Models (HMMs).
  • 125
  • 9
One-to-Many and Many-to-One Voice Conversion Based on Eigenvoices
TLDR
This paper describes two flexible frameworks of voice conversion (VC), i.e., one-to-many VC and many-to-one VC.
  • 93
  • 8
Evaluation of cross-language voice conversion based on GMM and STRAIGHT
TLDR
EUROSPEECH2001: the 7th European Conference on Speech Communication and Technology, September 3-7, 2001, Aalborg, Denmark.
  • 64
  • 8