Man-Hung Siu

Tone has a crucial role in Mandarin speech in distinguishing ambiguous words. Most state-of-the-art Mandarin automatic speech recognition systems adopt embedded tone modeling, where tonal acoustic units are used and F0 features are appended to the spectral feature vector. In this paper, we combine the embedded approach (using improved F0 smoothing) with…
Confidence measures enable us to assess the output of a speech recognition system. The confidence measure provides us with an estimate of the probability that a word in the recognizer output is either correct or incorrect. In this paper we discuss ways in which to quantify the performance of confidence measures in terms of their discrimination power and…
Lack of data is a problem in training language models for conversational speech recognition, particularly for languages other than English. Experiments in English have successfully used web-based text collection targeted for a conversational style to augment small sets of transcribed speech; here we look at extending these techniques to Mandarin. In…
This paper describes our approach to the estimation of confidence in the words generated by a speech recognition system. We describe the models and the features employed for confidence estimation. In addition we discuss the characteristics of an information-theoretic metric for assessing the performance of the confidence measure. We provide a simple…
This article describes a methodology for collecting text from the Web to match a target sublanguage both in style (register) and topic. Unlike other work that estimates n-gram statistics from page counts, the approach here is to select and filter documents, which provides more control over the type of material contributing to the n-gram counts. The data can…
The introduction of Aurora 4 tasks provides a standard database and methodology for comparing the effectiveness of different robust algorithms on LVCSR. One important issue on Aurora 4 tasks is the computation time involved in evaluating different test conditions. In this paper we show that by employing HTK as the recognition frontend and backend on Aurora…
Current state-of-the-art statistical speech recognition systems use hidden Markov models (HMMs) for modeling the speech signal. However, it is well known that HMMs do not exploit the time-dependence in the speech process, since they are limited by the assumption of conditional independence of observations given the state sequence. Alternative techniques,…
In our previous publication [1], we presented a new approach to HMM training, viz., training without supervision. We used an HMM trained without supervision for transcribing audio into self-organized units (SOUs) for the purpose of topic classification. In this paper we report improvements made to the system, including the use of context-dependent acoustic…