Voice conversion through vector quantization
The authors propose a new voice conversion technique through vector quantization and spectrum mapping which makes it possible to precisely control voice individuality.
Learning to Generate Pseudo-Code from Source Code Using Statistical Machine Translation (T)
SMT, which was originally designed to translate between two natural languages, allows us to automatically learn the relationship between source code/pseudo-code pairs, making it possible to create a pseudo-code generator with less human effort.
Acoustical Sound Database in Real Environments for Sound Scene Understanding and Hands-Free Speech Recognition
LREC2000: the 2nd International Conference on Language Resources and Evaluation, May 31 - June 2, 2000, Athens, Greece.
Guiding Neural Machine Translation with Retrieved Translation Pieces
This paper proposes a simple, fast, and effective method for recalling previously seen translation examples and incorporating them into the NMT decoding process. The method compares favorably to an alternative retrieval-based method with respect to accuracy, speed, and simplicity of implementation.
Listening while speaking: Speech chain by deep learning
This work develops the first deep learning model that integrates human speech perception and production behaviors, and shows that the proposed approach significantly improves performance over separate systems trained only with labeled data.
Optimizing Segmentation Strategies for Simultaneous Speech Translation
A method based on greedy search and dynamic programming searches for the optimal segmentation strategy for simultaneous speech translation, finding a segmentation that directly maximizes the performance of the machine translation system.
Lip movement synthesis from speech based on Hidden Markov Models
In subjective evaluation experiments, although differences in the audio-visual intelligibility between the synthesized lip parameters and the original ones are insignificant, the acceptability test to evaluate naturalness reflects the results of the objective evaluation.
Cepstrum derived from differentiated power spectrum for robust speech recognition
The proposed feature set, embedded with a nonlinear liftering transformation, is quite effective for robust speech recognition and can be decomposed as the superposition of the standard cepstrum and its nonlinearly liftered counterpart.
Incorporating Discrete Translation Lexicons into Neural Machine Translation
This paper describes a method that calculates the lexicon probability of the next word in the translation candidate, using the attention vector of the NMT model to select which source word lexical probabilities the model should focus on.
CENSREC-1-C: An evaluation framework for voice activity detection under noisy environments
Voice activity detection (VAD) plays an important role in speech processing including speech recognition, speech enhancement, and speech coding under noisy environments. We have developed an…