Out-of-Vocabulary Word Modeling and Rejection for Spanish Keyword Spotting Systems

Abstract

This paper combines out-of-vocabulary (OOV) word modeling with rejection techniques so that utterances containing a keyword are accepted and utterances containing only non-keywords are rejected. The goal of this research is to develop a robust, task-independent Spanish keyword spotter and a method for optimizing confidence thresholds for a particular context. To model OOV words, we employed both word and sub-word units as fillers, combined with n-gram language models. We also introduce a methodology for optimizing confidence thresholds to control the tradeoffs between acceptance, confirmation, and rejection of utterances. Our experiments are based on a Mexican Spanish auto-attendant system using the SpeechWorks recognizer release 6.5 Second Edition, in which we achieved an 8.9% reduction in error compared to the baseline system. Most of the error reduction is attributed to better keyword detection in utterances that contain both keywords and OOV words.
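The three-way tradeoff between acceptance, confirmation, and rejection can be illustrated with a minimal sketch. This is not the procedure used in the paper, which the abstract does not spell out. It assumes a recognizer confidence score in [0, 1], two hypothetical thresholds (`reject_below`, `accept_above`), an invented cost table, and a plain grid search; the sample scores are made up for illustration only.

```python
from itertools import product

# Hypothetical costs per (utterance_has_keyword, action) pair.
# These values are assumptions for illustration, not taken from the paper.
COSTS = {
    (True, "accept"): 0.0,    # keyword utterance handled directly
    (True, "confirm"): 0.3,   # extra dialogue turn
    (True, "reject"): 1.0,    # false rejection
    (False, "accept"): 1.0,   # false acceptance
    (False, "confirm"): 0.3,  # extra dialogue turn
    (False, "reject"): 0.0,   # correct rejection
}


def decide(confidence, reject_below, accept_above):
    """Map a confidence score to accept, confirm, or reject."""
    if confidence >= accept_above:
        return "accept"
    if confidence >= reject_below:
        return "confirm"
    return "reject"


def total_cost(samples, reject_below, accept_above):
    """Sum the cost over (confidence, has_keyword) samples."""
    return sum(
        COSTS[(has_keyword, decide(conf, reject_below, accept_above))]
        for conf, has_keyword in samples
    )


def optimize_thresholds(samples, grid):
    """Grid-search the threshold pair with the lowest total cost.

    Returns (cost, reject_below, accept_above); ties keep the first pair found.
    """
    best = None
    for low, high in product(grid, repeat=2):
        if low > high:
            continue  # the reject threshold must not exceed the accept threshold
        cost = total_cost(samples, low, high)
        if best is None or cost < best[0]:
            best = (cost, low, high)
    return best


# Made-up development data: (confidence score, utterance contains a keyword).
samples = [(0.9, True), (0.8, True), (0.5, True),
           (0.4, False), (0.2, False), (0.1, False)]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
best_cost, best_low, best_high = optimize_thresholds(samples, grid)
```

In this toy setting the two classes are perfectly separable, so the search collapses the confirmation band; with overlapping score distributions the band would typically widen, and the cost table determines how much confirmation is tolerated for a particular context.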

DOI: 10.1007/3-540-46016-0_17

Cite this paper

@inproceedings{Cuayhuitl2002OutofVocabularyWM,
  title={Out-of-Vocabulary Word Modeling and Rejection for Spanish Keyword Spotting Systems},
  author={Heriberto Cuay{\'a}huitl and Ben Serridge},
  booktitle={MICAI},
  year={2002}
}