Corpus ID: 2332789

Dynamic pronunciation models for automatic speech recognition

@inproceedings{FoslerLussier1999DynamicPM,
  title={Dynamic pronunciation models for automatic speech recognition},
  author={John Eric Fosler-Lussier and N. Morgan},
  year={1999}
}
As of this writing, the automatic recognition of spontaneous speech by computer is fraught with errors; many systems transcribe one out of every three to five words incorrectly, whereas humans can transcribe spontaneous speech with one error in twenty words or better. This high error rate is due in part to the poor modeling of pronunciations within spontaneous speech. This dissertation examines how pronunciations vary in this speaking style, and how speaking rate and word predictability can be…
Feature-based pronunciation modeling for automatic speech recognition
TLDR
A class of feature-based pronunciation models represented as dynamic Bayesian networks (DBNs) is proposed, which allows the factorization of the state space of feature combinations into feature-specific factors, as well as providing standard algorithms for inference and parameter learning.
Automatic determination of sub-word units for automatic speech recognition
TLDR
This thesis presents a method for the automatic derivation of a sub-word unit inventory, whose main components are an ergodic hidden Markov model whose complexity is controlled using the Bayesian Information Criterion and an automatic generation of probabilistic dictionaries using joint multigrams.
Pronunciation variation modeling for Dutch automatic speech recognition
TLDR
This work describes how the performance of a Dutch continuous speech recognizer was improved by modeling pronunciation variation, which consists of adding pronunciation variants to the lexicon, retraining phone models, and using language models to which the pronunciation variants have been added.
STATIC DICTIONARY FOR PRONUNCIATION MODELING
TLDR
This work aims to improve speech recognition accuracy for the Telugu language using a pronunciation model; new pronunciations for words are considered in order to improve recognition accuracy.
Techniques for modelling Phonological Processes in Automatic Speech Recognition
TLDR
This dissertation focuses on techniques that aim to improve the robustness of statistical speech transcription systems to conversational speaking styles and proposes a new FHMM, the Parameter-Tied FHMM, which makes fewer a priori assumptions about the data to be modelled.
MODELING PRONUNCIATION VARIATION IN CONVERSATIONAL SPEECH USING PROSODY
A significant source of variation in spontaneous speech is due to intra-speaker pronunciation changes. Previous work in automatic speech recognition has identified several factors that affect…
A Tutorial on Pronunciation Modeling for Large Vocabulary Speech Recognition
TLDR
This tutorial is intended to ground the reader in the basic linguistic concepts in phonetics and phonology that guide both of these techniques and to outline several pronunciation modeling strategies that have been employed through the years.
Modeling Pronunciation Variation in Conversational Speech using Syntax and Discourse
A significant source of variation in spontaneous speech is due to intra-speaker pronunciation changes. Previous work in automatic speech recognition has identified several factors that affect…
Hybrid statistical pronunciation models designed to be trained by a medium-size corpus
Generating pronunciation variants of words is an important subject in speech research and is used extensively in automatic speech recognition and segmentation systems. Decision trees are well known…
Subword Modeling for Automatic Speech Recognition: Past, Present, and Emerging Approaches
TLDR
Different subword models may be preferable in different settings, such as high-variability conversational speech, high-noise conditions, low-resource settings, or multilingual speech recognition.

References

SHOWING 1-10 OF 198 REFERENCES
A new approach to speaker adaptation by modelling pronunciation in automatic speech recognition
TLDR
An algorithm is proposed which observes the typical confusions of phonetic units of the unknown speaker and continuously adapts the a posteriori probabilities; it is presented as a way to enhance speech recognition by automatically adapting the pronunciation models in the lexicon to the unknown speaker.
Automatic modeling of pronunciation variations
  • E. Eide
  • Computer Science
  • EUROSPEECH
  • 1999
TLDR
An automatic method is reported for discovering an appropriate model for each context-dependent phoneme, which allows for such phenomena as reduced pronunciations and substituted phonemes where warranted by observation on training data.
Multiple-pronunciation lexical modeling in a speaker independent speech understanding system
TLDR
An algorithm is presented for the construction of models that attempt to capture the variation that occurs in the pronunciations of words in spontaneous speech, which improves the performance of both the speech recognition and the speech understanding components of the BeRP system.
Dictionary learning for spontaneous speech recognition
  • T. Sloboda, A. Waibel
  • Computer Science
  • Proceeding of Fourth International Conference on Spoken Language Processing. ICSLP '96
  • 1996
TLDR
This work proposed a data-driven approach to add new pronunciations to a given phonetic dictionary in a way that they model the given occurrences of words in the database and shows how this algorithm can be extended to produce alternative pronunciations for word tuples and frequently misrecognized words.
Pronunciation modeling for large vocabulary conversational speech recognition
TLDR
The hand-labelled corpus scheme is adopted to improve pronunciations for frequent multi and single words occurring in the training data, while using the rule-based techniques to learn pronunciation variants and their weights for the infrequent words.
Pronunciation modeling by sharing gaussian densities across phonetic models
TLDR
The incorporation of pronunciation models into acoustic model training in addition to recognition is described, showing a 1.7% improvement in recognition accuracy on the Switchboard corpus.
INCORPORATING CONTEXTUAL PHONETICS INTO AUTOMATIC SPEECH RECOGNITION
This work outlines the problems encountered in modeling pronunciation for automatic speech recognition (ASR) of spontaneous (American) English speech. We detail some of the phonetic phenomena within…
Speaking mode dependent pronunciation modeling in large vocabulary conversational speech recognition
TLDR
A framework for speaking mode dependent pronunciation modeling is presented and the framework is successfully applied to increase the performance of the state-of-the-art Janus Recognition Toolkit Switchboard recognizer.
Speaking in shorthand - A syllable-centric perspective for understanding pronunciation variation
TLDR
Systematic analysis of pronunciation variation in a corpus of spontaneous English discourse (Switchboard) demonstrates that the variation observed is more systematic at the level of the syllable than at the phonetic-segment level, and syllabic onsets are realized in canonical form far more frequently than either coda or nuclear constituents.
Modeling and efficient decoding of large vocabulary conversational speech
TLDR
A finite state machine single-prefix-tree, one-pass, time-synchronous decoder is presented that decodes highly spontaneous speech within this new representational framework.