When using weighted finite-state transducers (WFSTs) in speech recognition, on-the-fly composition approaches have been proposed as a way to reduce memory consumption and increase flexibility during decoding. We have recently implemented several fast on-the-fly techniques, namely avoiding dead-end states, dynamic pushing, and state sharing, in our …
A new method for prosodic word boundary detection in continuous speech was developed based on the statistical modeling of moraic transitions of fundamental frequency (F0) contours, formerly proposed by the authors. In the developed method, F0 contours of prosodic words were modeled separately according to the accent types. An input utterance was matched …
Text corpus size is an important issue when building a language model (LM), particularly for languages where little data is available. This paper introduces an LM adaptation technique that improves an LM built from a small amount of task-dependent text with the help of a machine-translated text corpus. Icelandic speech recognition …
In audiovisual speech recognition, multi-stream HMMs are widely used, so how to determine stream weight factors automatically and properly from a small data set is an important research issue. This paper proposes a new stream-weight optimization method based on an output likelihood normalization criterion. In this method, the stream …
In this paper we present a new method for synthesizing multiple languages with the same voice, using HMM-based speech synthesis. Our approach, which we call HMM-based polyglot synthesis, consists of mixing speech data from several speakers in different languages to create a speaker- and language-independent (SI) acoustic model. We then adapt the resulting …