Multiclass Support Vector Machines for Articulatory Feature Classification

Abstract

This ongoing research project investigates articulatory feature (AF) classification using multiclass support vector machines (SVMs). SVMs are being constructed for each AF in the multi-valued feature set (Table 1), using speech data and annotation from the IFA Dutch “Open-Source” (van Son et al. 2001) and TIMIT English (Garofolo et al. 1993) corpora. The primary objective of this research is to assess the AF classification performance of different multiclass generalizations of the SVM, including one-versus-rest, one-versus-one, Decision Directed Acyclic Graph (DDAG), and direct methods for multiclass learning. Given the successful application of SVMs to numerous classification problems (Bennett and Campbell 2000), it is hoped that multiclass SVMs will outperform existing state-of-the-art AF classifiers.

Introduction

One of the most basic challenges for speech recognition and other spoken language systems is to accurately map data from the acoustic domain into the linguistic domain. Much speech processing research has approached this task by taking advantage of the correlation between phones, the basic units of speech sound, and their acoustic manifestation (intuitively, there is a range of sounds that humans would consider to be an “e”). The mapping of acoustic data to phones has been largely successful and is used in many speech systems today.

Despite this success, there are drawbacks to using phones as the point of entry from the acoustic to the linguistic domain. Notably, the granularity of the phonetic-segmental model, in which speech is represented as a series of phones, makes it difficult to account for various subphone phenomena that degrade performance on spontaneous speech.

Researchers have pursued an alternative approach to the acoustic-linguistic mapping through the use of articulatory modeling. This approach more directly exploits the intimate relation between articulation and acoustics: the state of one’s speech articulators (e.g., vocal folds, tongue) uniquely determines the parameters of the acoustic speech signal. Unfortunately, while the mapping from articulators to acoustics is straightforward, the inverse problem of recovering the state of the articulators from an acoustic speech representation, known as acoustic-to-articulatory inversion, poses a formidable challenge (Toutios and Margaritis 2003). Nevertheless, re-
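To make the four multiclass strategies named in the abstract concrete, the sketch below contrasts them on synthetic data. It is purely illustrative: the paper reports no implementation details, so scikit-learn, the RBF kernel, the synthetic data, and every hyperparameter here are assumptions, and ddag_predict is a hypothetical helper that walks a decision DAG over the one-versus-one pairwise machines.

```python
# Illustrative comparison of the multiclass SVM strategies under study.
# scikit-learn and all settings below are assumptions for this sketch,
# not the authors' actual setup.
from itertools import combinations

import numpy as np
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(n_samples=300, n_features=20, n_informative=10,
                           n_classes=4, n_clusters_per_class=1,
                           random_state=0)

# One-versus-rest: one binary SVM separates each class from all others.
ovr = OneVsRestClassifier(SVC(kernel="rbf", C=1.0)).fit(X, y)

# One-versus-one: one binary SVM per pair of classes; majority vote decides.
ovo = OneVsOneClassifier(SVC(kernel="rbf", C=1.0)).fit(X, y)

# Direct multiclass learning: a single joint optimization over all classes
# (the Crammer-Singer formulation; linear kernel only in scikit-learn).
direct = LinearSVC(multi_class="crammer_singer", C=1.0,
                   max_iter=10000).fit(X, y)

# DDAG: reuses the one-versus-one pairwise machines, but classifies by
# walking a decision DAG that eliminates one candidate class per node.
classes = sorted(np.unique(y))
pairwise = {}
for a, b in combinations(classes, 2):
    mask = (y == a) | (y == b)
    pairwise[(a, b)] = SVC(kernel="rbf", C=1.0).fit(X[mask], y[mask])

def ddag_predict(x):
    """Hypothetical DDAG evaluation: K-1 pairwise tests per example."""
    candidates = list(classes)  # stays sorted, so candidates[0] < candidates[-1]
    while len(candidates) > 1:
        a, b = candidates[0], candidates[-1]
        winner = pairwise[(a, b)].predict(x.reshape(1, -1))[0]
        candidates.remove(b if winner == a else a)  # drop the losing class
    return candidates[0]

print("OvR:", ovr.predict(X[:1])[0], " OvO:", ovo.predict(X[:1])[0],
      " direct:", direct.predict(X[:1])[0], " DDAG:", ddag_predict(X[0]))
```

One design point worth noting: the DDAG trains the same K(K-1)/2 pairwise machines as one-versus-one, but at test time it evaluates only K-1 of them, eliminating one candidate class per node rather than tallying a full vote.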

