In this paper, we describe how a research spoken dialog system was made available to the general public. The Let's Go Public spoken dialog system provides bus schedule information to the Pittsburgh population during off-peak times. This paper describes the changes necessary to make the system usable for the general public and presents analysis of the calls ...
This paper describes our work with Let's Go, a telephone-based bus schedule information system that has been in use by the Pittsburgh population since March 2005. Results from several studies show that while task success correlates strongly with speech recognition accuracy, other aspects of dialogue such as turn-taking, the set of error recovery strategies ...
With the recent improvements in speech technology, it is now possible to build spoken dialog systems that basically work. However, such systems are designed and tailored for the general population. When users come from less general sections of the population, such as the elderly and non-native speakers of English, the accuracy of dialog systems degrades.
Although the quality of synthetic speech has increased dramatically in the past several years, many people still have difficulty understanding speech produced by even the highest quality synthesizers. We describe an approach to improve understandability of synthetic speech using speech in noise. Natural speech in noise is a change in the style of speech ...
Spoken dialog systems typically use a limited number of non-understanding recovery strategies and simple heuristic policies to engage them (e.g. first ask the user to repeat, then give help, then transfer to an operator). We propose a supervised, online method for learning a non-understanding recovery policy over a large set of recovery strategies. The ...
This paper describes CMU SIN, a new database of speech in noise that can be used for unit selection speech synthesis. We describe a process that can be used to elicit speech in noise and how to use that as part of building a synthetic voice that speaks in noise. Details of the database we constructed, as well as some preliminary analysis and future goals ...
In CMU's Blizzard Challenge 2005 entry we investigated twelve ideas for improving Festival-based unit selection voices. We tracked progress by adopting a 3-tiered strategy in which candidate ideas must pass through three stages of listening tests to warrant inclusion in the final build. This allowed us to evaluate ideas consistently without us having large ...
Speech synthesizers have traditionally been built on carefully read speech that is recorded in a studio environment. Such voices are suboptimal for use in noisy conditions, which are inevitable in a majority of deployed speech systems. In this work, we attempt to modify the output of the speech synthesizers to make it more appropriate for noisy environments.