ALBAYZIN 2016 spoken term detection evaluation: an international open competitive evaluation in Spanish
This paper presents a combination of out-of-vocabulary (OOV) word modeling and rejection techniques in an attempt to accept utterances embedding a keyword and reject utterances with nonkeywords. The goal of this research is to develop a robust, task-independent Spanish keyword spotter and to develop a method for optimizing confidence thresholds for a particular context. To model OOV words, we employed both word and sub-word units as fillers, combined with n-gram language models. We also introduce a methodology for optimizing confidence thresholds to control the tradeoffs between acceptance, confirmation, and rejection of utterances. Our experiments are based on a Mexican Spanish auto-attendant system using the SpeechWorks recognizer release 6.5 Second Edition, in which we achieved a reduction in error of 8.9% as compared to the baseline system. Most of the error reduction is attributed to better keyword detection in utterances that contain both keywords and OOV words.