Colin W. Wightman

Numerous studies have indicated that prosodic phrase boundaries may be marked by a variety of acoustic phenomena including segmental lengthening. It has not been established, however, whether this lengthening is restricted to the immediate vicinity of the boundary, or if it extends over some larger region. In this study, segmental lengthening in the ...
Prosody is an important factor in the quality of text-to-speech (TTS) synthesis. Typically, acoustic parameters such as f0 and duration are the only prosody-related variables used to determine unit selection. Our study explored adding the explicit use of linguistically and perceptually motivated prosodic categories in unit selection-based TTS.
We address the role of prosody as a potential information source for the assignment of syntactic structure. We consider the perceptual role of prosody in marking syntactic breaks of various kinds for human listeners, the automatic extraction of prosodic information, and its correlation with perceptual data. Introduction. Prosodic information can ...
In the decade that has passed since the introduction of the ToBI system for the transcription of prosody, speech technology has moved out of the laboratory and into commercial applications on several fronts. However, virtually none of the commercial products have made large-scale use of prosody. Nevertheless, researchers in both recognition and synthesis ...
We describe the modification of a grammar to take advantage of prosodic information automatically extracted from speech. The work includes (1) the development of an integer "break index" representation of prosodic phrase boundary information, (2) the automatic detection of prosodic phrase breaks using a hidden Markov model on relative duration of phonetic ...
The AT&T text-to-speech (TTS) synthesis system has been used as a framework for experimenting with a perceptually guided, data-driven approach to speech synthesis, with primary focus on data-driven elements in the "back end". Statistical training techniques applied to a large corpus are used to make decisions about predicted speech events and selected speech ...