Damien Lolive

Learn More
In a voice transformation context, prosody transformation using parallel corpora is quite unrealistic as such corpora are difficult and also expensive to build. Based on this observation, we propose an approach for transforming prosody using non-parallel corpora thanks to the MLLR adaptation strategy. This methodology is applied to the joint transformation(More)
Building speech corpora is a first and crucial step for every text-to-speech synthesis system. Nowadays, the use of statistical models implies the use of huge sized corpora that need to be recorded, transcribed, annotated and segmented to be usable. The variety of corpora necessary for recent applications (content, style, etc.) makes the use of existing(More)
This article describes a new approach to estimate F 0 curves using a B-Spline model characterized by a knot sequence and associated control points. The free parameters of the model are the number of knots and their location. The free-knot placement , which is a NP-hard problem, is done using a global MLE within a simulated-annealing strategy. The optimal(More)
This article describes a new approach to estimate F 0 curves using a B-Spline model characterized by a knot sequence and associated control points. The free parameters of the model are the number of knots and their location. The free-knot placement, which is a NP-hard problem, is done using a global MLE within a simulated-annealing strategy. The optimal(More)
This article describes a new unsupervised methodology to learn F0 classes using HMM on a syllable basis. A F0 class is represented by a HMM with three emitting states. The unsupervised clustering algorithm relies on an iterative gaussian splitting and EM retraining process. First, a single class is learnt on a training corpus (8000 syllables) and it is then(More)
This article describes a new unsupervised methodology to learn F0 classes using HMM models on a syllable basis. A F0 class is represented by a HMM with three emitting states. The clustering algorithm relies on an iterative gaussian splitting and EM retraining process. First, a single class is learnt on a training corpus (8000 syllables) and it is then(More)