A new voice conversion method is proposed that improves the quality of the converted output at higher sampling rates. The Speaker Transformation Algorithm Using Segmental Codebooks (STASC) is modified to process source and target speech spectra in different subbands. The new method ensures better conversion at sampling rates above 16 kHz. Discrete Wavelet…
In this paper we present two virtual characters in an interactive poker game that uses RFID-tagged poker cards for the interaction. To support the game creation process, we have combined, in a unique way, models, methods, and technology that are currently being investigated in the ECA research field. A powerful and easy-to-use multimodal dialog authoring tool is used…
The present paper reports on the DFKI entry to the Blizzard Challenge 2008. The main difference from last year's system is a new join model inspired by last year’s iFlytek paper; the effect seems small but measurable, in that it leads to the selection of longer chunks of consecutive units. In interpreting the results of the listening…
We report on a multilingual comparison study of the effects of prosodic changes on emotional speech. The study was conducted in France, Germany, Greece and Turkey. Semantically identical sentences expressing emotionally relevant content were translated into the target languages and manipulated systematically with respect to pitch range, duration model,…
Synthesizing desired emotions using concatenative algorithms relies on the collection of large databases. This paper focuses on the development and assessment of a simple algorithm that interpolates the intended vocal effort in existing databases in order to create new databases with intermediate levels of vocal effort. Three diphone databases in German with soft,…
This study proposes two new methods for detailed modeling and transformation of the vocal tract spectrum and the pitch contour. The first method (selective pre-emphasis) relies on band-pass filtering to perform vocal tract transformation. The second method (segmental pitch contour model) focuses on a more detailed modeling of pitch contours. Both methods…
Perception of speaker identity is an important characteristic of the human auditory system. This paper describes a subjective test that investigates the relevance of four acoustic features in this process: vocal tract, pitch, duration, and energy. PSOLA-based methods provide the framework for the transplantation of these acoustic features between…
Generating expressive synthetic voices requires carefully designed databases that contain a sufficient amount of expressive speech material. This paper investigates voice conversion and modification techniques to reduce database collection and processing efforts while maintaining acceptable quality and naturalness. In a factorial design, we study the relative…