Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory
  • T. Toda, A. Black, K. Tokuda
  • Computer Science, Mathematics
    IEEE Transactions on Audio, Speech, and Language…
  • 1 November 2007
TLDR
Experimental results indicate that the performance of VC can be dramatically improved by the proposed method in view of both speech quality and conversion accuracy for speaker individuality.
Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model
TLDR
Experimental results demonstrate that the MLE-based mapping with dynamic features can significantly improve the mapping performance compared with the MMSE-based mapping in both the articulatory-to-acoustic mapping and the inversion mapping.
A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis
TLDR
A generation algorithm is proposed that considers not only the HMM likelihood maximized in the conventional algorithm but also a likelihood for the global variance of the generated trajectory; the latter works as a penalty for over-smoothing.
Speech Synthesis Based on Hidden Markov Models
This paper gives a general overview of hidden Markov model (HMM)-based speech synthesis, which has recently been demonstrated to be very effective in synthesizing speech. The main advantage of this
Learning to Generate Pseudo-Code from Source Code Using Statistical Machine Translation (T)
TLDR
SMT, which was originally designed to translate between two natural languages, allows us to automatically learn the relationship between source code/pseudo-code pairs, making it possible to create a pseudo-code generator with less human effort.
Details of the Nitech HMM-Based Speech Synthesis System for the Blizzard Challenge 2005
TLDR
The technical details, building processes, and performance of the basic HMM-based speech synthesis system, and new features integrated into Nitech-HTS 2005 such as STRAIGHT-based vocoding, HSMM-based acoustic modeling, and a speech parameter generation algorithm considering GV are described.
Speaker-Dependent WaveNet Vocoder
TLDR
A speaker-dependent WaveNet vocoder is proposed, a method of synthesizing speech waveforms with WaveNet by using acoustic features from an existing vocoder as auxiliary features of WaveNet.
Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum
TLDR
Results of the evaluation experiments clarify that the converted speech quality is better than that of the GMM-based algorithm, and the conversion-accuracy on speaker individuality is the same as that of this proposed method with the properly-weighted residual spectrum.
The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods
TLDR
A brief summary of the state-of-the-art techniques for VC is presented, followed by a detailed explanation of the challenge tasks and the results that were obtained.
An investigation of multi-speaker training for wavenet vocoder
TLDR
The experimental results demonstrate that 1) the multi-speaker WaveNet vocoder still outperforms STRAIGHT in generating known speakers' voices but is comparable to STRAIGHT in generating unknown speakers' voices, and 2) multi-speaker training is effective for developing a WaveNet vocoder capable of speech modification.