The task of keyword spotting is to detect a set of keywords in continuous input speech. The main goal of this work is to develop an improved Mandarin keyword spotting (KWS) system for conversational telephone speech (CTS). In this paper, we propose an efficient KWS system based on an online garbage model, which is integrated with a word-level minimum…
Since it is the most natural way for people to search for a specific melody in a large music database, query by humming/singing is attracting more and more researchers' attention in the field of content-based music information retrieval. In this task, note-based and frame-based similarity measures are two commonly used methods. However, in previous works,…
Automatic evaluation of GOR (Goodness Of pRosody) is an advanced and challenging task in CALL (Computer Aided Language Learning) systems. Going beyond traditional prosodic features, we develop a method based on multiple knowledge sources that does not require the reading text to be known in advance. After speech recognition, apart from most state-of-the-art features in…
Mispronunciation detection is an important component of computer assisted language learning (CALL) systems. In this work, we introduce an efficient detection method based on GLDS-SVM, which has been used successfully in language and speaker identification systems, and combine it with traditional methods. The main ideas include: extended MFCC features with normalized…
Lightly supervised acoustic model training has been recognized as an effective way to improve acoustic model training for broadcast news recognition. In this paper, a new approach is introduced both to fully utilize the untranscribed data by using closed captions as transcripts and to select more informative data for acoustic model training. We will show…
Keyword spotting (KWS) is an essential technique for speech information retrieval. Offline keyword queries over large volumes of spontaneous speech data require fast and accurate KWS methods. In this paper, a novel vocabulary-independent KWS method based on a phone-state matrix is proposed, which has the merits of both hidden Markov model (HMM) based and…
Recently a new language model, the random forest language model (RFLM), has been proposed and has shown encouraging results in speech recognition tasks. In this paper we applied the RFLM to language identification tasks. We proposed a shared backoff smoothing method to deal with the data sparseness problem. Experiments were conducted on a subset of NIST 2003 language…
In the task of mispronunciation detection, cross-speaker degradation and other confounding nuisance factors are challenging problems that demand a prompt solution. In this paper, we attempt to remove the non-pronunciation variations in the GLDS-SVM expansion space by using a nuisance attribute projection strategy, in order to increase the separating…