Structured Support Vector Machines for Speech Recognition

Abstract

Discriminative training criteria and discriminative models are two effective improvements for HMM-based speech recognition. This thesis proposes a structured support vector machine (SSVM) framework suitable for medium to large vocabulary continuous speech recognition. An important aspect of structured SVMs is the form of the features. Several features previously proposed in the field are summarized in this framework. Since some of these features can be extracted based on generative models, this provides an elegant way of combining generative and discriminative models. To apply structured SVMs to continuous speech recognition, a number of issues need to be addressed. First, the features require a segmentation to be specified. To incorporate the optimal segmentation into the training process, the training algorithm is modified to make use of the concave-convex optimisation procedure. A Viterbi-style algorithm is described for inferring the optimal segmentation based on the discriminative parameters. Second, structured SVMs can be viewed as large margin log-linear models using a zero-mean Gaussian prior on the discriminative parameters. However, this form of prior is not appropriate for all features. An extended training algorithm is proposed that allows general Gaussian priors to be incorporated into the large margin criterion. Third, to speed up the training process, strategies of parameter tying, 1-slack optimisation, caching competing hypotheses, lattice-constrained search and parallelization are also described. Finally, to avoid explicit computation in the high-dimensional feature space and to achieve nonlinear decision boundaries, kernel-based training and decoding algorithms are also proposed. The performance of structured SVMs is evaluated on small and medium to large vocabulary speech recognition tasks: AURORA 2 and 4.
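As a rough illustration of the large margin criterion with a general Gaussian prior, one common way to write such an objective is sketched below. The notation is an assumption made for this sketch and is not taken from the abstract: \alpha is the discriminative parameter vector, \mu and \Sigma are the prior mean and covariance, \phi(O, w) is the joint feature vector for observations O and word sequence w, \mathcal{L} is a loss between a hypothesis and the reference w_i, and C is a regularisation constant. The segmentation is omitted for brevity.

```latex
% Sketch only: symbols are assumed notation, not the thesis's own.
\min_{\alpha}\;
  \frac{1}{2}(\alpha - \mu)^{\top} \Sigma^{-1} (\alpha - \mu)
  + \frac{C}{n} \sum_{i=1}^{n}
    \Big[
      \max_{w \neq w_i}
        \big( \mathcal{L}(w, w_i) + \alpha^{\top} \phi(O_i, w) \big)
      - \alpha^{\top} \phi(O_i, w_i)
    \Big]_{+}
```

Under this notation, setting \mu = 0 and \Sigma = I reduces the first term to \frac{1}{2}\lVert\alpha\rVert^{2}, which corresponds to the zero-mean Gaussian prior view of the standard structured SVM.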

Cite this paper

@inproceedings{Zhang2014StructuredSV, title={Structured Support Vector Machines for Speech Recognition}, author={Shi-Xiong Zhang}, year={2014} }