César González Ferreras

A review of the literature on prosody reveals a lack of corpora for prosodic studies in Catalan and Spanish. In this paper, we present a corpus intended to fill this gap. The corpus comprises two distinct datasets, a news subcorpus and a dialogue subcorpus, the latter containing either conversational or task-oriented speech. More than 25 h were recorded by twenty…
This contribution addresses the ToBI accent recognition problem with the goal of multiclass identification, as opposed to the more conservative Accent vs. No Accent approach. A neural network and a decision tree are used for automatic recognition of the ToBI accents in the Boston Radio Corpus. Multiclass classification results show the difficulty of the problem and the…
This paper presents a system that automatically labels Tones and Break Indices (ToBI) events. The detection (binary classification) of prosodic events has received significantly more attention from researchers than their classification because of the intrinsic difficulty of classification. We focus on the classification problem, identifying eight types of…
In this work, we discuss the construction process of the voice portal counterpart of a departmental web site. VoiceXML has been used as the dialogue modelling language. A prototypical system has been built using our own VoiceXML interpreter, which easily integrates different implementation platforms. A general discussion of VoiceXML advantages and…
This paper presents an experimental study on how corpus-based automatic prosodic information labeling can be transferred from a source language to a different target language. Tone accent identification models trained for Spanish, using the ESMA corpus, are used to automatically assign tonal accent ToBI labels on the (English) Boston Radio news corpus, and…
In this work we present SAMPLE, a new pronunciation database of Spanish as L2, and first results on the automatic assessment of non-native prosody. Listen-and-repeat and reading tasks are carried out by native and foreign speakers of Spanish. The corpus has been designed to support comparative studies and evaluation of automatic pronunciation error assessment…
In this paper, we present the application of a novel automatic prosodic labeling methodology for speeding up the manual labeling of the Glissando corpus (Spanish read news items). The methodology is based on the use of soft classification techniques. The output of the automatic system consists of a set of label candidates per word. The number of predicted…