Learning improved linear transforms for speech recognition


This paper explores a novel large-margin approach to learning a linear transform for dimensionality reduction in speech recognition. The method assumes a trained Gaussian mixture model for each class to be discriminated and trains a dimensionality-reducing linear transform with respect to this fixed model. Training uses stochastic gradient descent to optimize a hinge loss on the difference between the distance to the nearest in-class Gaussian and the distance to the nearest out-of-class Gaussian. Results show that the learnt transform improves frame-level state classification and reduces word error rate compared to Linear Discriminant Analysis (LDA) on a large-vocabulary speech recognition task, even after discriminative training.
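
As a rough illustration of the training objective described in the abstract, the following is a minimal sketch of one stochastic gradient step on such a hinge loss. It is not the authors' implementation. Several details are assumptions made for the example: each Gaussian is represented only by its mean in the original feature space, distance is squared Euclidean distance after projection (the paper may use a covariance-weighted distance), and the function name `hinge_sgd_step`, the `margin` value, and the learning rate `lr` are hypothetical.

```python
import numpy as np


def hinge_sgd_step(A, x, label, means, mean_labels, margin=1.0, lr=0.01):
    """One SGD step on a hinge loss for a linear transform A of shape (d, D).

    Assumptions for this sketch (not taken from the paper):
    - each Gaussian is summarized by its mean (rows of `means`, shape (K, D));
    - distance is squared Euclidean distance measured after projecting by A;
    - `mean_labels` (shape (K,)) gives the class of each Gaussian, and both
      the in-class and out-of-class sets are non-empty.
    """
    diffs = x - means                    # (K, D): frame minus each mean
    proj = diffs @ A.T                   # (K, d): differences after projection
    dists = np.sum(proj ** 2, axis=1)    # (K,): squared projected distances

    in_mask = mean_labels == label
    # Index of the nearest in-class and nearest out-of-class Gaussian.
    in_idx = np.flatnonzero(in_mask)[np.argmin(dists[in_mask])]
    out_idx = np.flatnonzero(~in_mask)[np.argmin(dists[~in_mask])]

    # Hinge on the difference between the two nearest distances.
    loss = max(0.0, margin + dists[in_idx] - dists[out_idx])

    if loss > 0.0:
        # Gradient of ||A v||^2 with respect to A is 2 (A v) v^T.
        grad = 2.0 * (
            np.outer(proj[in_idx], diffs[in_idx])
            - np.outer(proj[out_idx], diffs[out_idx])
        )
        A = A - lr * grad                # the Gaussians themselves stay fixed

    return A, loss
```

In this sketch the Gaussian means are never updated, mirroring the abstract's statement that the transform is trained with respect to a fixed model; a training loop would simply call the function once per labelled frame.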

DOI: 10.1109/ICASSP.2012.6288289

Cite this paper

@article{Senior2012LearningIL, title={Learning improved linear transforms for speech recognition}, author={Andrew W. Senior and Youngmin Cho and Jason Weston}, journal={2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, year={2012}, pages={1957-1960} }