Florian Schiel

While the first open comparative challenges in the field of paralinguistics targeted more ‘conventional’ phenomena such as emotion, age, and gender, there still exists a multiplicity of not yet covered, but highly relevant speaker states and traits. The INTERSPEECH 2011 Speaker State Challenge thus addresses two new sub-challenges to overcome the usually(More)
In this paper, two different models of pronunciation are presented: the first model is based on a rule set compiled by an expert, while the second is statistically based, exploiting a survey about pronunciation variants occurring in training data. Both models generate pronunciation variants from the canonic forms of words. The two models are evaluated by(More)
In this paper we present a hybrid statistical and rule-based segmentation system which takes into account phonetic variation of German. Input to the system is the orthographic representation and the speech signal of an utterance to be segmented. The output is the transcription (SAM-PA) with the highest overall likelihood and the corresponding segmentation(More)
Institute of Phonetics and Speech Communication, University of Munich, 80799 Munich, +49 89 2180 5751, [beringer, ukartal, kalo, schiel, tuerk]@phonetik.uni-muenchen.de. Abstract: This paper describes a general framework for evaluating and comparing the performance of multimodal dialogue systems: PROMISE (Procedure for Multimodal Interactive System Evaluation).(More)
This paper is concerned with the definition and description of the phenomenon Off-Talk in human-machine interaction. This phenomenon is considered to cause problems due to non-relevant information that is conveyed within these utterances. Besides the definition of Off-Talk, our work aims to provide an analysis of transcribed audio data that is part of the(More)
The fact that an increasing number of functions in the automobile are and will be controlled by the driver's speech raises the question whether this speech input may be used to detect a possible alcoholic intoxication of the driver. For that matter a large part of the new Alcohol Language Corpus (ALC) edited by the Bavarian Archive of Speech Signals (BAS)(More)
The Translanguage English Database is a corpus of recordings made of oral presentations at Eurospeech93 in Berlin. The corpus name derives from the high percentage of presentations given in English by non-native speakers of English. 224 oral presentations at the conference were successfully recorded, providing a total of about 75 hours of speech material.(More)
In this paper we describe a method to model pronunciation for ASR in the German VERBMOBIL task. Our findings suggest that a simple model, i.e. pronunciation variants modelled by SAM-PA units and weighted with a-posteriori probabilities, can be used successfully for ASR, if there is a sufficient amount of reliably transcribed speech data available. Manual(More)
Most spoken language resources are produced and disseminated together with symbolic information relating to the speech signal. These are, for instance, orthographic transcripts, labelling and segmentation on the phonologic, phonetic, prosodic, phrasal level. Most of the known formats for these symbolic data are defined in a 'closed form' that is not flexible(More)