In this paper, we describe how a research spoken dialog system was made available to the general public. The Let's Go Public spoken dialog system provides bus schedule information to the Pittsburgh population during off-peak times. This paper describes the changes necessary to make the system usable for the general public and presents analysis of the calls…
This paper describes our work with Let's Go, a telephone-based bus schedule information system that has been in use by the Pittsburgh population since March 2005. Results from several studies show that while task success correlates strongly with speech recognition accuracy, other aspects of dialogue such as turn-taking, the set of error recovery strategies…
With the recent improvements in speech technology, it is now possible to build spoken dialog systems that basically work. However, such systems are designed and tailored for the general population. When users come from less general sections of the population, such as the elderly and non-native speakers of English, the accuracy of dialog systems degrades.
This paper describes work designed to improve understandability of spoken output, specifically for the elderly, by using a speaking style employed by people to improve their understandability when speaking in poor channel conditions. We describe an experiment that shows the understandability gains that are possible using naturally-produced examples of this…
Although the quality of synthetic speech has increased dramatically in the past several years, many people still have difficulty understanding speech produced by even the highest quality synthesizers. We describe an approach to improve understandability of synthetic speech using speech in noise. Natural speech in noise is a change in the style of speech…
Spoken dialog systems typically use a limited number of non-understanding recovery strategies and simple heuristic policies to engage them (e.g., first ask the user to repeat, then give help, then transfer to an operator). We propose a supervised, online method for learning a non-understanding recovery policy over a large set of recovery strategies. The…
This paper describes CMU SIN, a new database of speech in noise that can be used for unit selection speech synthesis. We describe a process that can be used to elicit speech in noise and how to use that as part of building a synthetic voice that speaks in noise. Details of the database we constructed, as well as some preliminary analysis and future goals…
This paper describes CMU's entry for the Blizzard Challenge 2007. Our eventual system consisted of a hybrid statistical parameter generation system whose output was used to do acoustic unit selection. After testing a number of varied systems, this system proved the best in our internal tests. This paper also explains some of the limitations we see in our…
This report describes speech in noise, a speaking style employed by people to improve understandability when speaking in noisy conditions. Evidence of understandability improvements for natural speech is shown, including recent experimental data. As even the highest quality speech synthesizers can be difficult to understand, the viability of using this…