Céu Viana

Classroom lectures may be very challenging for automatic speech recognizers, because the vocabulary may be very specific and the speaking style very spontaneous. Our first experiments using a recognizer trained for Broadcast News resulted in word error rates near 60%, clearly confirming the need for adaptation to the specific topic of the lectures, on one …
Navigation queries are typical examples of contexts in which a recognizer may have to deal with non-native names. In order to build a pronunciation lexicon with these names, special GtoP rules may be derived. The paper addresses this problem in the context of navigation queries in French including German names and vice versa. The special GtoP rules were …
This paper reports on recent work in the context of the activities of the PoSTPort project aimed at porting a Broadcast News recognition system originally developed for European Portuguese to other varieties. Concretely, in this paper we have focused on porting to Brazilian Portuguese. The impact of some of the main sources of variability has been assessed, …
This paper describes the corpus of university lectures that has been recorded in European Portuguese, and some of the recognition experiments we have done with it. The highly specific topic domain and the spontaneous speech nature of the lectures are two of the most challenging problems. Lexical and language model adaptation proved difficult given the …
This paper describes the early stages of porting REAP, a tutoring system for vocabulary learning, to European Portuguese. Students learn from authentic materials, on topics of their preference. A large number of linguistic resources and filtering tools have already been integrated into the ported version. We modified the current system to also target oral …
This paper describes a new generic text-to-speech synthesis system, developed in the scope of the Tecnovoz Project. Although it was primarily targeted at speech synthesis in European Portuguese, its modular architecture and flexible components allow its use for different languages. We also provide a survey on the development of the language resources …
In this paper we describe a set of experiments aiming at building and evaluating a new phrasing module for European Portuguese Text-to-Speech Synthesis, using Classification and Regression Tree (CART) techniques on hand-labeled texts. Using the assessment criteria of matching boundary predictions against a reference example of phrased sentences, the best …
This paper describes the software architecture of the Portuguese text-to-speech system DIXI 1. The system has three major modules. The first one contains the text normalizer and searches each word in the lexicon. The second one is a multi-level rule-based module for lexical stress assignment, orthographic to phonetic transcription, metrically based prosodic …
This paper describes a language/accent verification system for Portuguese that explores different types of properties: acoustic, phonotactic and prosodic. The two-stage system is designed to be used as a pre-processing module for the Portuguese Automatic Speech Recognition (ASR) system developed at INESC-ID. As the ASR system is applied every day to …