Sebastian Ochs

In this paper, we analyze whether dictionaries from the World Wide Web that contain phonetic notations may support the rapid creation of pronunciation dictionaries within the speech recognition and speech synthesis system building process. As a representative dictionary, we selected Wiktionary [1] since it is available in multiple languages and, in addition…
We report on our efforts toward an LVCSR system for the African language Hausa. We describe the Hausa text and speech database recently collected as a part of our GlobalPhone corpus [1]. The data was complemented by a large collection of text data crawled from various Hausa websites. We achieve significant improvement by automatically substituting…
In this paper, we evaluate grapheme-to-phoneme (g2p) models across languages and of different quality. We created g2p models for Indo-European languages with word-pronunciation pairs from the GlobalPhone project and from Wiktionary [1]. Then we checked their quality in terms of consistency and complexity as well as their impact on Czech, English, French,…
In this paper, we present our latest investigations on pronunciation modeling and its impact on ASR. We propose completely automatic methods to detect, remove, and substitute inconsistent or flawed entries in pronunciation dictionaries. The experiments were conducted on different tasks, namely (1) word-pronunciation pairs from the Czech, English, French,…