In recent years, many alternative models have been proposed to address some of the shortcomings of the hidden Markov model, currently the most popular approach to speech recognition. In particular, a variety of models that could be broadly classified as segment models have been described for representing a variable-length sequence of observation vectors in …
In addition to ordinary words and names, real text contains non-standard "words" (NSWs), including numbers, abbreviations, dates, currency amounts and acronyms. Typically, one cannot find NSWs in a dictionary, nor can one find their pronunciation by an application of ordinary "letter-to-sound" rules. Non-standard words also have a greater propensity …
Reading proficiency is a fundamental component of language competency. However, finding topical texts at an appropriate reading level for foreign and second language learners is a challenge for teachers. This task can be addressed with natural language processing technology to assess reading level. Existing measures of reading level are not well suited to …
Prosodic phrase structure provides important information for the understanding and naturalness of synthetic speech, and a good model of prosodic phrases has applications in both speech synthesis and speech understanding. This work describes a statistical model of an embedded hierarchy of prosodic phrase structure, motivated by results in linguistic theory.
Tone plays a crucial role in Mandarin speech in distinguishing ambiguous words. Most state-of-the-art Mandarin automatic speech recognition systems adopt embedded tone modeling, where tonal acoustic units are used and F0 features are appended to the spectral feature vector. In this paper, we combine the embedded approach (using improved F0 smoothing) with …
Effective human and automatic processing of speech requires recovery of more than just the words. It also involves recovering phenomena such as sentence boundaries, filler words, and disfluencies, referred to as structural metadata. We describe a metadata detection system that combines information from different types of textual knowledge sources with …