Tone plays a crucial role in Mandarin speech in distinguishing otherwise ambiguous words. Most state-of-the-art Mandarin automatic speech recognition systems adopt embedded tone modeling, where tonal acoustic units are used and F0 features are appended to the spectral feature vector. In this paper, we combine the embedded approach (using improved F0 smoothing) with …
This paper describes the development of CU Corpora, a series of large-scale speech corpora for Cantonese. Cantonese is the most commonly spoken Chinese dialect in Southern China and Hong Kong. CU Corpora are the first of their kind and are intended to serve as an important infrastructure for the advancement of speech recognition and synthesis technologies for …
This letter describes a speaker verification system that uses complementary acoustic features derived from the vocal source excitation and the vocal tract system. A new feature set, named the wavelet octave coefficients of residues (WOCOR), is proposed to capture the spectro-temporal source excitation characteristics embedded in the linear predictive …
In Chinese languages, tones carry important information at various linguistic levels. This research is based on the belief that tone information, if acquired accurately and utilized effectively, contributes to the automatic speech recognition of Chinese. In particular, we focus on the Cantonese dialect, which is spoken by tens of millions of people in …
The framework of posteriorgram-based template matching has been shown to be successful for query-by-example spoken term detection (STD). This framework employs a tokenizer to convert query examples and test utterances into frame-level posteriorgrams, and applies dynamic time warping to match the query posteriorgrams with test posteriorgrams to locate …
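The matching step described in this abstract can be illustrated with a minimal sketch of subsequence dynamic time warping over posteriorgrams. This is not the authors' implementation: the function names `frame_distance` and `subsequence_dtw`, the negative-log inner-product distance, the three-way step pattern, and the length normalization are all assumptions chosen for illustration, and the tokenizer that would produce the posteriorgrams is not shown.

```python
import numpy as np


def frame_distance(query, test, eps=1e-10):
    """Pairwise frame distances between two posteriorgrams.

    query: (n_query_frames, n_classes), test: (n_test_frames, n_classes).
    Uses -log of the inner product, an assumed (but common) choice.
    """
    return -np.log(np.clip(query @ test.T, eps, None))


def subsequence_dtw(query, test):
    """Find the test segment that best matches the whole query.

    Returns (length-normalized cost, start frame, end frame).
    The path may start and end at any test frame.
    """
    dist = frame_distance(query, test)
    n, m = dist.shape
    acc = np.full((n, m), np.inf)        # accumulated cost
    start = np.zeros((n, m), dtype=int)  # start frame of the best path
    acc[0, :] = dist[0, :]               # free start anywhere in the test
    start[0, :] = np.arange(m)
    for i in range(1, n):
        for j in range(m):
            # Allowed predecessors: vertical, horizontal, diagonal.
            candidates = [(acc[i - 1, j], start[i - 1, j])]
            if j > 0:
                candidates.append((acc[i, j - 1], start[i, j - 1]))
                candidates.append((acc[i - 1, j - 1], start[i - 1, j - 1]))
            best_cost, best_start = min(candidates, key=lambda c: c[0])
            acc[i, j] = dist[i, j] + best_cost
            start[i, j] = best_start
    end = int(np.argmin(acc[-1, :]))     # free end anywhere in the test
    return float(acc[-1, end] / n), int(start[-1, end]), end
```

In a detection setting, the normalized cost would typically be compared against a threshold to decide whether the returned segment counts as an occurrence of the query; that threshold is task-dependent and is not specified in the source.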
In this paper, we present an effective method to detect the language boundary (LB) in code-switching utterances. The utterances are mainly produced in Cantonese, a commonly used Chinese dialect, whilst occasionally English words are inserted between Cantonese words. Bi-phone probabilities are calculated to measure the confidence that the recognized phones …
Cantonese is a major Chinese dialect with a complicated tone system. This research focuses on quantitative modeling of Cantonese tones. It uses Stem-ML, a language-independent framework for quantitative intonation modeling and generation. A set of F0 prediction models is built and trained on acoustic data. The prediction error is about 11 Hz or 1 …
Lhermitte-Duclos disease (LDD) is a rare benign lesion of uncertain pathogenesis characterized by distortion of the normal cerebellar laminar cytoarchitecture. We report a 22-year-old man admitted for injury sustained in a traffic accident with the incidental finding of a cerebellar mass. Magnetic resonance imaging (MRI) revealed a mass lesion within the …
Recently, the posteriorgram-based template matching framework has been successfully applied to query-by-example spoken term detection tasks for low-resource languages. This framework employs a tokenizer to derive posteriorgrams, and applies dynamic time warping (DTW) to the posteriorgrams to locate the possible occurrences of a query term. Based on this …