Optimising Figure of Merit for phonetic spoken term detection

Abstract

This paper introduces a novel technique to directly optimise the Figure of Merit (FOM) for phonetic spoken term detection (STD). The FOM is a popular measure of STD accuracy, making it an ideal candidate for use as an objective function. A simple linear model is introduced to transform the phone log-posterior probabilities output by a phone classifier into enhanced log-posterior features that are more suitable for the STD task. Direct optimisation of the FOM is then performed by training the parameters of this model using a nonlinear gradient descent algorithm. A substantial relative FOM improvement of 11% is achieved on held-out evaluation data, demonstrating the generalisability of the approach.
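To make the linear model concrete, the following is a minimal sketch of applying a linear map to one frame of phone log-posteriors. The abstract does not give the model's exact form, so the square weight matrix, the absence of a bias term, the identity initialisation, and the names `identity_matrix`, `enhance`, `frame` and `W` are all assumptions made for illustration, not the authors' implementation.

```python
import math


def identity_matrix(n):
    """Return an n x n identity matrix as a list of rows (assumed starting point)."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def enhance(log_posteriors, weights):
    """Apply a linear map to one frame of phone log-posteriors.

    Each output feature is a weighted sum of all input log-posteriors.
    """
    return [sum(w * x for w, x in zip(row, log_posteriors)) for row in weights]


# Hypothetical frame: posteriors over three phones, converted to log-posteriors.
frame = [math.log(p) for p in (0.7, 0.2, 0.1)]

# With identity weights the enhanced features equal the input features.
W = identity_matrix(3)
enhanced = enhance(frame, W)
```

In this sketch, training would adjust the entries of `W` by gradient descent so that an FOM-based objective improves; that objective and its gradient are not specified in the abstract and are deliberately left out here.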

DOI: 10.1109/ICASSP.2010.5494969

Cite this paper

@article{Wallace2010OptimisingFO, title={Optimising Figure of Merit for phonetic spoken term detection}, author={Roy Wallace and Robbie Vogt and Brendan Baker and Sridha Sridharan}, journal={2010 IEEE International Conference on Acoustics, Speech and Signal Processing}, year={2010}, pages={5298-5301} }