Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory
Experimental results indicate that the performance of VC can be dramatically improved by the proposed method in view of both speech quality and conversion accuracy for speaker individuality.
Statistical Parametric Speech Synthesis
Unit selection in a concatenative speech synthesis system using a large speech database
It is proposed that the units in a synthesis database can be considered as a state transition network in which the state occupancy cost is the distance between a database unit and a target, and the transition cost is an estimate of the quality of concatenation of two consecutive units.
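The search implied by this summary can be sketched as a Viterbi pass over the candidate units. This is only a minimal illustration under stated assumptions, not the paper's implementation: the function name `select_units`, the cost callables `target_cost` and `join_cost`, and the toy numeric "units" are all hypothetical, and every target is assumed to have at least one candidate.

```python
from typing import Callable, List, Sequence, TypeVar

Unit = TypeVar("Unit")
Target = TypeVar("Target")


def select_units(
    targets: Sequence[Target],
    candidates: Sequence[Sequence[Unit]],
    target_cost: Callable[[Target, Unit], float],
    join_cost: Callable[[Unit, Unit], float],
) -> List[Unit]:
    """Return the candidate sequence with the lowest total cost.

    Hypothetical helper: the state occupancy cost is target_cost(target, unit)
    and the transition cost is join_cost(previous_unit, unit).
    """
    if not targets:
        return []

    # best[i][j]: lowest cumulative cost of any path ending at candidates[i][j]
    best = [[target_cost(targets[0], u) for u in candidates[0]]]
    # back[i][j]: index of the predecessor in candidates[i - 1] on that path
    back = [[-1] * len(candidates[0])]

    for i in range(1, len(targets)):
        row_cost, row_back = [], []
        for unit in candidates[i]:
            # Pick the cheapest predecessor, including the concatenation cost.
            k, cost = min(
                (
                    (k, best[i - 1][k] + join_cost(prev, unit))
                    for k, prev in enumerate(candidates[i - 1])
                ),
                key=lambda pair: pair[1],
            )
            row_cost.append(cost + target_cost(targets[i], unit))
            row_back.append(k)
        best.append(row_cost)
        back.append(row_back)

    # Trace back from the cheapest final state.
    j = min(range(len(best[-1])), key=best[-1].__getitem__)
    path: List[Unit] = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = back[i][j]
    path.reverse()
    return path


# Toy usage: units and targets are plain numbers, costs are absolute differences.
chosen = select_units(
    targets=[1.0, 2.0, 3.0],
    candidates=[[0.0, 1.0], [2.0, 5.0], [3.0, 9.0]],
    target_cost=lambda t, u: abs(t - u),
    join_cost=lambda a, b: abs(a - b),
)
```

In a real system the two cost functions would compare acoustic and prosodic features rather than single numbers; the dynamic-programming structure stays the same.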
Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation
- Wang Ling, Chris Dyer, Tiago Luís
- Computer Science
- Conference on Empirical Methods in Natural…
- 9 August 2015
A model for constructing vector representations of words by composing characters using bidirectional LSTMs that requires only a single vector per character type and a fixed set of parameters for the compositional model, which yields state-of-the-art results in language modeling and part-of-speech tagging.
The HMM-based speech synthesis system (HTS) version 2.0
This paper describes HTS version 2.0 in detail, as well as future release plans, which include a number of new features useful to both speech synthesis researchers and developers.
Style Transfer Through Back-Translation
- Shrimai Prabhumoye, Yulia Tsvetkov, R. Salakhutdinov, A. Black
- Computer Science
- Annual Meeting of the Association for…
- 24 April 2018
A latent representation of the input sentence is learned which is grounded in a language translation model in order to better preserve the meaning of the sentence while reducing stylistic properties, and adversarial generation techniques are used to make the output match the desired style.
Two/Too Simple Adaptations of Word2Vec for Syntax Problems
- Wang Ling, Chris Dyer, A. Black, I. Trancoso
- Computer Science
- North American Chapter of the Association for…
We present two simple modifications to the models in the popular Word2Vec tool, in order to generate embeddings more suited to tasks involving syntax. The main issue with the original models is the…
The CMU Arctic speech databases
The CMU Arctic databases, designed for speech synthesis research, consist of approximately 1200 phonetically balanced English utterances and are distributed as free software, without restriction on commercial or non-commercial use.
A Dataset for Document Grounded Conversations
- Kangyan Zhou, Shrimai Prabhumoye, A. Black
- Computer Science, History
- Conference on Empirical Methods in Natural…
- 19 September 2018
This paper describes two neural architectures that provide benchmark performance on the task of generating the next response and finds that the information from the document helps in generating more engaging and fluent responses.