Robert A. J. Clark

Learn More
This paper describes version 2 of the Festival speech synthesis system. Festival 2 provides a development environment for concatenative speech synthesis, and now includes a general purpose unit selection speech synthesis engine. We discuss various aspects of unit selection speech synthesis, focusing on the research issues that relate to voice design and the(More)
We present the implementation and evaluation of an open-domain unit selection speech synthesis engine designed to be flexible enough to encourage further unit selection research and allow rapid voice development by users with minimal speech synthesis knowledge and experience. We address the issues of automatically processing speech data into a usable voice(More)
OBJECTIVE This study evaluated the effectiveness of Helping Babies Breathe (HBB) newborn care and resuscitation training for birth attendants in reducing stillbirth (SB), and predischarge and neonatal mortality (NMR). India contributes to a large proportion of the worlds annual 3.1 million neonatal deaths and 2.6 million SBs. METHODS This prospective(More)
Current unit selection speech synthesis voices cannot produce emphasis or interrogative contours because of a lack of the necessary prosodic variation in the recorded speech database. A method of recording script design is proposed which addresses this shortcoming. Appropriate components were added to the target cost function of the Festival Multisyn(More)
We describe the results of large scale perception experiments showing improvements in synthesising two distinct kinds of prominence: standard pitch-accent and strong emphatic accents. Previously prominence assignment has been mainly evaluated by computing accuracy on a prominence-labelled test set. By contrast we integrated an automatic pitch-accent(More)
The goal of this project was to build a unit selection voice that could portray emotions with varying intensities. A suitable definition of an emotion was developed along with a descriptive framework that supported the work carried out. A single speaker was recorded portraying happy and angry speaking styles. Additionally a neutral database was also(More)
Unit selection speech synthesis has reached high levels of naturalness and intelligibility for neutral read aloud speech. However, synthetic speech generated using neutral read aloud data lacks all the attitude, intention and spontaneity associated with everyday conversations. Unit selection is heavily data dependent and thus in order to simulate human(More)
This paper addresses the problem of automatically labeling contrast in spontaneous spoken speech, where contrast here is meant as a relation that ties two words that explicitly contrast with each other. Detection of contrast is certainly relevant in the analysis of discourse and information structure and also, because of the prosodic correlates of contrast,(More)