Benjamin Picart

This paper presents the large audiovisual laughter database recorded as part of the AVLaughterCycle project held during the eNTERFACE’09 Workshop in Genova. Twenty-four subjects participated. The freely available database includes audio signal and video recordings as well as facial motion tracking, thanks to markers placed on the subjects’ faces. Annotations of the …
In this report, we present a Max/MSP external for real-time speech synthesis. Statistical parametric speech synthesis, based on Hidden Markov Models, has been demonstrated to be very effective in synthesizing high-quality, natural and expressive speech. This technique is also able to provide high flexibility as a speech production model and a small database …
In this paper, we present a modified version of HTS, called performative HTS or pHTS. The objective of pHTS is to enhance the control ability and reactivity of HTS. pHTS reduces the phonetic context used for training the models and generates the speech parameters within a 2-label window. Speech waveforms are generated on-the-fly and the models can be …
Hypo- and hyperarticulation refer to the production of speech with, respectively, a reduction and an increase in articulatory effort compared to the neutral style. Produced consciously or not, these variations in articulatory effort depend upon the surrounding environment, the communication context and the motivation of the speaker with regard to the …
This paper focuses on the analysis and synthesis of hypo- and hyperarticulated speech in the framework of HMM-based speech synthesis. First of all, a new French database matching our needs was created, which contains three identical sets, pronounced with three different degrees of articulation: neutral, hypo- and hyperarticulated speech. On that basis, …
Class posterior distributions have recently been used quite successfully in Automatic Speech Recognition (ASR), either for frame or phone level classification or as acoustic features, which can be further exploited (usually after some “ad hoc” transformations) in different classifiers (e.g., in Gaussian Mixture based HMMs). In the present …
This paper focuses on the implementation of a continuous control of the degree of articulation (hypo/hyperarticulation) in the framework of HMM-based speech synthesis. The adaptation of a neutral speech synthesizer to generate hypo- and hyperarticulated speech using a limited amount of speech data is first studied. This is done using inter-speaker voice …
The AVLaughterCycle project aims at developing an audiovisual laughing machine, capable of recording the laughter of a user and responding to it with machine-generated laughter linked to the input laughter. During the project, an audiovisual laughter database was recorded, including facial point tracking, thanks to the Smart Sensor Integration software …
This paper proposes a new prosody annotation protocol specific to live sports commentaries. Two levels of annotation are defined with HMM-based speech synthesis in view. Local labels are assigned to all syllables and refer to accentual phenomena. Global labels classify sequences of words into five distinct subgenres, defined in terms of valence and arousal. …
Class posterior distributions can be used for classification or as intermediate features, which can be further exploited in different classifiers (e.g., Gaussian Mixture Models, GMM) towards improving speech recognition performance. In this paper we examine the possibility of using a kNN classifier to perform local phonetic classification of class posterior …