Corpus ID: 14290584

Kullback-Leibler Divergence-Based ASR Training Data Selection

@inproceedings{Gouva2011KullbackLeiblerDA,
  title={Kullback-Leibler Divergence-Based ASR Training Data Selection},
  author={E. Gouv{\^e}a and Marelie Hattingh Davel},
  booktitle={INTERSPEECH},
  year={2011}
}
European Media Laboratory GmbH, Heidelberg, Germany; Multilingual Speech Technologies, North-West University, Vanderbijlpark, South Africa
Automatic speech recognition for resource-scarce environments
Thesis (PhD (Computer and Electronic Engineering))--North-West University, Potchefstroom Campus, 2013.
Efficient data selection for ASR
TLDR
It is shown that for limited data sets, independent of language and bandwidth, the most effective strategy for data selection is frequency-matched selection and that the widely-used maximum entropy methods generally produced the least promising results.
Methods for addressing data diversity in automatic speech recognition
TLDR
Using data selection, data augmentation and latent-domain model adaptation methods, the mismatch between training and testing conditions of diverse ASR systems is reduced, resulting in more robust speech recognition systems.
Data-selective transfer learning for multi-domain speech recognition
TLDR
A novel technique to overcome negative transfer through efficient selection of speech data for acoustic model training, with data chosen on relevance for a specific target, evaluated on a wide-domain data set.
Rapid Development of TTS Corpora for Four South African Languages
TLDR
The approach followed investigated the possibility of using low-cost methods including informal recording environments and untrained volunteer speakers and code-switched data to develop text-to-speech corpora for four South African languages.
A submodular optimization approach to sentence set selection
  • Yusuke Shinohara
  • Computer Science
  • 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2014
TLDR
A new method for selecting a sentence set with a desired phoneme distribution is presented, and it is shown that a greedy algorithm is near-optimal for this problem, according to the submodular optimization theory.

References

The RWTH Aachen University open source speech recognition system
TLDR
The toolkit includes state-of-the-art speech recognition technology for acoustic model training and decoding; a finite state automata library and an efficient tree search decoder are notable components.
Woefzela - An Open-Source Platform for ASR Data Collection in the Developing World
This project was made possible through the support of the South African National Centre for Human Language Technology, an initiative of the South African Department of Arts and Culture.
An Iterative Relative Entropy Minimization-Based Data Selection Approach for n-Gram Model Adaptation
TLDR
It is shown that the proposed subset selection scheme leads to performance improvements over state-of-the-art speech recognition systems in terms of both speech recognition word error rate (WER) and language model perplexity (PPL).
Methods for optimal text selection
TLDR
This work addresses how one can take advantage of control over the content of the speech database, by discussing a number of variants of “greedy” text selection methods and showing their application in a variety of examples.
The Design for the Wall Street Journal-based CSR Corpus
TLDR
This paper presents the motivating goals, acoustic data design, text processing steps, lexicons, and testing paradigms incorporated into the multi-faceted WSJ CSR Corpus, a corpus containing significant quantities of both speech data and text data.
Data selection for speech recognition
TLDR
In contrast to the common belief that "there is no data like more data", it is found possible to select a highly informative subset of data that produces recognition performance comparable to a system that makes use of a much larger amount of data.
Building transcribed speech corpora quickly and cheaply for many languages
TLDR
A system for quickly and cheaply building transcribed speech corpora containing utterances from many speakers in a variety of acoustic conditions, used to collect over 3000 hours of transcribed audio in 17 languages around the world.
Approaches to automatic lexicon learning with limited training examples
TLDR
A combination of lexicon learning techniques is used to explore whether a lexicon can be learned when only a small lexicon is available for bootstrapping, and it is found that for a phonetic language such as Spanish, this can be done better than is possible from generic rules or hand-crafted pronunciations.
Pattern Recognition and Machine Learning
TLDR
Probability Distributions, Linear Models for Regression, Linear Models for Classification, Neural Networks, Graphical Models, Mixture Models and EM, Sampling Methods, Continuous Latent Variables, and Sequential Data are studied.
Pattern Recognition and Machine Learning (Information Science and Statistics)