In this paper we describe the data collection for the TBALL project (Technology Based Assessment of Language and Literacy) and report the results of our efforts. We focus on aspects of our corpus that distinguish it from currently available corpora. The speakers are children (grades K-4), largely non-native speakers of English, and from diverse …
Arguably the most important part of automatically assessing a new reader's literacy is in verifying his pronunciation of read-aloud target words. But the pronunciation evaluation task is especially difficult in children, non-native speakers, and pre-literates. Traditional likelihood ratio thresholding methods do not generalize easily, and even expert human …
Most speech processing algorithms analyze speech signals frame by frame with a fixed frame rate. Fixed-rate analysis is inconsistent with human speech perception and effectively assigns the same importance or 'weight' to all equi-duration frames. In Zhu et al. (2000), we proposed a variable frame rate (VFR) analysis technique that is based on a Euclidean …
When learning to speak English, non-native speakers may pronounce some English phonemes differently from native speakers. These pronunciation variations can degrade an automatic speech recognition system's performance on accented English. This paper is a first attempt to find common pronunciation variations in Spanish-accented English as spoken by young …
This study examines methods for recognizing different classes of phones from accented speech based on voice onset time (VOT). These methods are tested on data from the TBALL corpus of Los Angeles-area elementary school children [1]. The methods proposed and tested are: 1) to train models based on standard English VOT contrasts and then extract the VOT …
Financial derivatives commonly contain premature termination clauses, which are embedded rights held by the holder or writer. Well known examples of these stopping rights include the early exercise right in American options, the callable right in callable securities and the prepayment right in mortgage loans. In this paper, we show how to model the …
In this paper, we analyze the temporal modulation characteristics of speech and noise from a speech/non-speech discrimination point of view. Although previous psychoacoustic studies [3][10] have shown that low temporal modulation components are important for speech intelligibility, there is no reported analysis on modulation components from the point of …
With the wide application of hidden Markov models (HMMs) in speech recognition, a statistical acoustic confusability metric is of increasing importance to many components of a speech recognition system. Although distance metrics between HMMs have been studied in the past, they did not account for speaking rate and durational variations. …