This paper presents our study of phonetic segmentation methods based on hidden Markov models, evaluated against a Hebrew speech corpus. We investigated methods for fully automatic phonetic segmentation that use only the corpus to be segmented and automatically generated phonetic transcriptions. A new method for phonetic boundary correction …
This paper presents our initial results with a new approach to vocal tract normalization (VTN). In experiments on continuous automatic speech recognition (ASR), the VTN procedure is generally carried out in both the training and the test phase. In the training phase it is used to obtain speaker-independent acoustic models of phones. In the test phase it is used …
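For orientation, VTN is commonly realized as a speaker-specific warping of the frequency axis. The sketch below shows a generic piecewise-linear warp, not the new approach this abstract announces (the abstract is truncated before describing it). The function name `warp_frequency`, the breakpoint ratio of 0.85, and the 8 kHz upper limit are all illustrative assumptions.

```python
def warp_frequency(f, alpha, f_max=8000.0, break_ratio=0.85):
    """Piecewise-linear frequency warp (generic VTN sketch, assumed parameters).

    Below the breakpoint the frequency is scaled by the warp factor alpha.
    Above it, a second linear segment maps f_max onto itself so the warped
    axis still covers the full band.
    """
    f_break = break_ratio * f_max
    if alpha <= 0 or alpha * f_break > f_max:
        raise ValueError("alpha is outside the range this warp supports")
    if f <= f_break:
        return alpha * f
    # Linear segment from (f_break, alpha * f_break) to (f_max, f_max).
    slope = (f_max - alpha * f_break) / (f_max - f_break)
    return alpha * f_break + slope * (f - f_break)
```

In a typical setup, a warp factor is chosen per speaker (for example by a grid search over values near 1.0) and applied to the filter-bank centre frequencies before feature extraction.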
This paper presents a novel method for energy normalization. The objective of the method is to remove unwanted energy variations caused by different microphone gains, differing loudness levels across speakers, and changes in a single speaker's loudness over time. The solution presented here is based on principles used in automatic gain …
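The idea of borrowing from automatic gain control can be illustrated with a minimal sketch: track a running level estimate of the per-frame log energy and subtract it. This is an assumed, simplified scheme, not the method of the paper; the name `normalize_energy` and the `attack` and `release` coefficients are hypothetical.

```python
def normalize_energy(log_energies, attack=0.5, release=0.01):
    """Subtract a running level estimate from per-frame log energies.

    The level rises quickly when a frame is louder than the estimate
    (attack) and decays slowly when it is quieter (release), as in a
    simple automatic gain control loop. Coefficients are illustrative.
    """
    if not log_energies:
        return []
    level = log_energies[0]
    normalized = []
    for energy in log_energies:
        coeff = attack if energy > level else release
        level += coeff * (energy - level)
        normalized.append(energy - level)
    return normalized
```

Because the level estimate follows any constant offset in the log domain, a fixed microphone gain applied to the whole recording leaves the normalized values unchanged, which is the kind of variation the abstract aims to remove.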
This paper reports on a spoken natural-language dialogue system that manages the interaction between the user and the ABB IRB 140 industrial robot. The dialogue system is multimodal and uses three communication modalities: (i) spoken language (automatic speech recognition and text-to-speech synthesis), (ii) visual recognition of the figures …
This paper proposes a method of creating language models for highly inflective, non-agglutinative languages. Three types of language models were considered: a common n-gram model, an n-gram model of lemmas, and a class n-gram model. The last two types were designed specifically for the Serbian language to reflect its grammatical structure. All the language …
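To make the class n-gram idea concrete, here is a minimal bigram sketch using the standard factorization P(w_i | w_{i-1}) ≈ P(c_i | c_{i-1}) · P(w_i | c_i), with unsmoothed maximum-likelihood counts. The word-to-class mapping, the function name `train_class_bigram`, and the toy data are assumptions for illustration; the abstract does not say how the paper defines its classes.

```python
from collections import Counter

def train_class_bigram(sentences, word_to_class):
    """Return prob(word, prev_word) for an unsmoothed class bigram model."""
    word_counts = Counter()      # count(w)
    class_counts = Counter()     # count(c)
    history_counts = Counter()   # count(c) in the history position
    class_bigrams = Counter()    # count(c_prev, c)
    for sentence in sentences:
        classes = [word_to_class[w] for w in sentence]
        word_counts.update(sentence)
        class_counts.update(classes)
        for prev, cur in zip(classes, classes[1:]):
            class_bigrams[(prev, cur)] += 1
            history_counts[prev] += 1

    def prob(word, prev_word):
        c, c_prev = word_to_class[word], word_to_class[prev_word]
        if history_counts[c_prev] == 0 or class_counts[c] == 0:
            return 0.0
        transition = class_bigrams[(c_prev, c)] / history_counts[c_prev]
        emission = word_counts[word] / class_counts[c]
        return transition * emission

    return prob

# Toy usage with hypothetical classes N (noun) and V (verb).
corpus = [["pas", "laje"], ["ptica", "peva"]]
classes = {"pas": "N", "ptica": "N", "laje": "V", "peva": "V"}
prob = train_class_bigram(corpus, classes)
print(prob("peva", "pas"))  # 0.5, although "pas peva" never occurs in the corpus
```

Sharing statistics across all words of a class is what lets such a model assign probability to word pairs that never co-occur in training, which matters when inflection multiplies the number of distinct word forms.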
This paper describes a decoder for large-vocabulary continuous speech recognition developed at the Faculty of Technical Sciences, University of Novi Sad. The decoder is an open-source solution written in C++. Its structure is modular, allowing relatively simple modification and expansion of the code. It …