Christopher M. White

This paper examines a query-by-example approach to spoken term detection in audio files. The approach is designed for low-resource situations in which limited or no in-domain training material is available and accurate word-based speech recognition capability is unavailable. Instead of using word or phone strings as search terms, the user presents the...
In this paper we examine an alternative interface for phonetic search, namely query-by-example, that avoids OOV issues associated with both standard word-based and phonetic search methods. We develop three methods that compare query lattices derived from example audio against a standard n-gram-based phonetic index, and we analyze factors affecting the...
This paper details the development of a Hybrid Evolutionary Algorithm for solving the Traveling Salesman Problem (TSP). The strategy of the algorithm is to complement and extend the successful results of a genetic algorithm (GA) using a distance preserving crossover (DPX) by incorporating memory in the form of ant pheromone during the city selection...
The spoken term detection (STD) task aims to return relevant segments from a spoken archive that contain the query terms, whether or not they are in the system vocabulary. This paper focuses on pronunciation modeling for out-of-vocabulary (OOV) terms, which frequently occur in STD queries. The STD system described in this paper indexes word-level and sub-word...
This paper addresses the detection of OOV segments in the output of a large vocabulary continuous speech recognition (LVCSR) system. First, standard confidence measures from frame-based word- and phone-posteriors are investigated. Substantial improvement is obtained when posteriors from two systems - strongly constrained (LVCSR) and weakly constrained...
Automatic speech recognition (ASR) systems continue to make errors during search when handling various phenomena including noise, pronunciation variation, and out-of-vocabulary (OOV) words. Predicting the probability that a word is incorrect can prevent the error from propagating and perhaps allow the system to recover. This paper addresses the problem of...
Data traversing packet networks experience varying delays, resulting in inter-arrival jitter. This can result in degraded performance in real-time multimedia communications applications if the jitter delays are large or unaccounted for in the receiver application. This paper examines modeling and simulation of network jitter delay for real-time multimedia...
Indexing and retrieval of speech content in various forms such as broadcast news, customer care data and on-line media has gained a lot of interest for a wide range of applications, from customer analytics to on-line media search. For most retrieval applications, the speech content is typically first converted to a lexical or phonetic representation using...
USING WORD GRAPHS. M. P. Harper, M. T. Johnson, L. H. Jamieson, S. A. Hockema, and C. M. White. Purdue University, School of Electrical and Computer Engineering, West Lafayette, IN 47907. ABSTRACT: In this paper, we describe a prototype spoken language system that loosely integrates a speech recognition...