Machine Learning Paradigms for Speech Recognition: An Overview

@article{Deng2013MachineLP,
  title={Machine Learning Paradigms for Speech Recognition: An Overview},
  author={Li Deng and Xiao Li},
  journal={IEEE Transactions on Audio, Speech, and Language Processing},
  year={2013},
  volume={21},
  pages={1060--1089}
}
  • L. Deng, Xiao Li
  • Published 1 May 2013
  • Computer Science
  • IEEE Transactions on Audio, Speech, and Language Processing
Automatic Speech Recognition (ASR) has historically been a driving force behind many machine learning (ML) techniques, including the ubiquitously used hidden Markov model, discriminative learning, structured sequence learning, Bayesian learning, and adaptive learning. […] These learning paradigms are motivated and discussed in the context of ASR technology and applications. We finally present and analyze recent developments of deep learning and learning with sparse representations, focusing on…

Speech recognition in a dialog system: from conventional to deep processing
The aim of this paper is to illustrate an overview of the automatic speech recognition (ASR) module in a spoken dialog system and how it has evolved from the conventional GMM-HMM (Gaussian mixture
AUTOMATIC LANGUAGE RECOGNITION USING DEEP NEURAL NETWORKS – TRABAJO FIN DE MÁSTER –
TLDR
This Master Thesis is intended to provide a new approach that, combining both deep learning and automatic language recognition fields, improves the SLR task by getting a better representation of voice signals for classification purposes so that it can be identified which language has been used in that voice signal.
Speech Recognition Using Convolutional Neural Networks
TLDR
An ASR-based airport enquiry system has been developed natively for the Telugu language, and a convolutional neural network has been used for training and testing of the database.
Unsupervised adaptation of ASR systems: An application of dynamic programming in machine learning
TLDR
An adaptation framework using the DPW algorithm is proposed to enable automatic speech recognition (ASR) systems to learn from unlabeled data, making them inexpensive and fast and improving the performance of existing systems.
A Review on Automatic Speech Recognition Architecture and Approaches
TLDR
A detailed study on automatic speech recognition is carried out and presented in this paper that covers the architecture, speech parameterization, methodologies, characteristics, issues, databases, tools and applications.
Advanced Data Exploitation in Speech Analysis: An overview
TLDR
This work states that there is a greater need for high-quality, diverse, and very large amounts of data in terms of ASA system accuracy and robustness, enabling the extraction of feature representations or the learning of model parameters immune to confounding factors.
Analysing acoustic model changes for active learning in automatic speech recognition
TLDR
An alternative view on transcript label quality is examined, in which Gaussian Supervector Distance (GSD) is used as a criterion for data selection, and which finds that GSD provides hints for predicting data transcription quality.
Advances in Artificial intelligence Using Speech Recognition
TLDR
This research intends to present the illustration of recent technological advancements, which are associated with artificial intelligence, to help in understanding all of the statistical models of speech recognition.
From Caesar Cipher to Unsupervised Learning: A New Method for Classifier Parameter Estimation
TLDR
This article introduces the concept of Caesar Cipher and its decryption, which motivated the construction of the novel loss function for unsupervised learning, and disseminates a class of promising new methods to facilitate understanding the methods for machine learning researchers.
Transfer learning for speech and language processing
  • Dong Wang, T. Zheng
  • Computer Science
    2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)
  • 2015
TLDR
This review paper summarizes some recent prominent research towards transfer learning, particularly for speech and language processing, and highlights the potential of this very interesting research field.

References

Active learning: theory and applications to automatic speech recognition
TLDR
This paper describes how to estimate the confidence score for each utterance through an on-line algorithm using the lattice output of a speech recognizer and shows that the amount of labeled data needed for a given word accuracy can be reduced by more than 60% with respect to random sampling.
Structured Discriminative Models For Speech Recognition: An Overview
TLDR
A variety of approaches for applying structured discriminative models to ASR are discussed, both from the current literature and possible future approaches, focusing on structured models themselves, the descriptive features of observations commonly used within the models, and various options for optimizing the parameters of the model.
Template-Based Continuous Speech Recognition
TLDR
The template matching system reaches a performance somewhat worse than the best published HMM results for the Resource Management benchmark, but thanks to complementarity of errors between the HMM and DTW systems, the combination of both leads to a decrease in word error rate.
Automatic Speech Recognition Based on Non-Uniform Error Criteria
TLDR
This paper proposes an extended framework for the speech recognition problem with non-uniform classification/recognition error cost which can be controlled by the system designer, and addresses the issue of system model optimization when the cost of a recognition error is class dependent.
Connectionist Speech Recognition: A Hybrid Approach
From the Publisher: Connectionist Speech Recognition: A Hybrid Approach describes the theory and implementation of a method to incorporate neural network approaches into state-of-the-art continuous
Discriminative Learning for Speech Recognition: Theory and Practice
TLDR
Details on major algorithmic implementation issues with practical significance are provided to enable the practitioners to directly reduce the theory in the earlier part of the book into engineering practice.
Robust continuous speech recognition using parallel model combination
TLDR
After training on clean speech data, the performance of the recognizer was found to be severely degraded when noise was added to the speech signal at between 10 and 18 dB, but using PMC the performance was restored to a level comparable with that obtained when training directly in the noise corrupted environment.
Structured SVMs for Automatic Speech Recognition
TLDR
A Viterbi-like scheme for obtaining the “optimal” segmentation of the utterance, and a modified training algorithm is proposed that allows general Gaussian priors to be incorporated into the large margin criterion.
Unsupervised and active learning in automatic speech recognition for call classification
TLDR
A novel approach that aims at reducing the amount of manually transcribed in-domain data required for building automatic speech recognition (ASR) models in spoken language dialog systems based on mining relevant text from various conversational systems and Web sites is presented.
Unsupervised training of a speech recognizer: recent experiments
TLDR
This work describes experiments which are aimed at training a speech recognizer with only a minimal amount of transcriptions and a large portion of untranscribed data, and shows that this performance cannot be improved by improving the measure of confidence.