Wai-Kim Leung

We have developed a distributed text-to-audiovisual-speech synthesizer (TTAVS) to support interactivity in computer-aided pronunciation training (CAPT) on a mobile platform. The TTAVS generates audiovisual corrective feedback based on mispronunciations detected in the second-language learner's speech. Our approach encodes key visemes in SVG …
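The abstract does not show the encoding itself; below is a minimal sketch, assuming each key viseme is stored as an SVG path (`d` attribute) for the lip contour and swapped in at phone boundaries. All names, shapes, and timings here are illustrative, not the paper's actual data.

```typescript
// Hypothetical viseme store: each key viseme as an SVG path describing the
// lip contour. Shapes below are illustrative placeholders only.
const visemes: Record<string, string> = {
  rest: "M 20 50 Q 50 55 80 50 Q 50 60 20 50 Z", // closed, relaxed lips
  aa:   "M 20 45 Q 50 20 80 45 Q 50 80 20 45 Z", // open jaw, as in "father"
  uw:   "M 35 45 Q 50 35 65 45 Q 50 60 35 45 Z", // rounded lips, as in "boot"
};

// Map a phone label from the synthesizer to its key viseme shape.
function visemeFor(phone: string): string {
  const phoneToViseme: Record<string, string> = { aa: "aa", uw: "uw", sil: "rest" };
  return visemes[phoneToViseme[phone] ?? "rest"];
}

// Swap the mouth path at each phone boundary; the timing would come from
// the text-to-speech output.
function animateMouth(mouth: SVGPathElement, phones: { phone: string; startMs: number }[]) {
  for (const p of phones) {
    setTimeout(() => mouth.setAttribute("d", visemeFor(p.phone)), p.startMs);
  }
}
```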
This paper presents our group's ongoing research in computer-aided pronunciation training (CAPT) for Chinese learners of English. Our goal is to develop automatic speech recognition (ASR) technologies that support productive training for learners. We focus on modeling possible errors due to negative transfer from the L1 (i.e., Chinese) of Chinese …
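As a rough illustration of error modeling via negative transfer, the sketch below expands a canonical pronunciation into the variant paths an extended recognition network might allow. The substitution rules shown (e.g., /th/ → /s/) are textbook examples of Chinese-learner confusions, not the rules derived in the paper.

```typescript
// Illustrative L1-transfer substitution rules, keyed by canonical phone.
const transferRules: Record<string, string[]> = {
  th: ["s", "f"],  // "think" -> "sink" / "fink"
  r:  ["l"],       // "rice" -> "lice"
  v:  ["w"],       // "very" -> "wery"
};

// Expand a canonical pronunciation into the set of paths an extended
// recognition network would allow, so the recognizer can detect which
// variant the learner actually produced.
function expandPronunciations(canonical: string[]): string[][] {
  let paths: string[][] = [[]];
  for (const phone of canonical) {
    const options = [phone, ...(transferRules[phone] ?? [])];
    paths = paths.flatMap(p => options.map(o => [...p, o]));
  }
  return paths;
}

// expandPronunciations(["th", "ih", "ng", "k"]) yields the canonical path
// plus mispronunciation variants such as ["s", "ih", "ng", "k"].
```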
This paper presents a two-dimensional (2D) visual-speech synthesizer to support language learning. A visual-speech synthesizer animates the human articulators in synchronization with speech signals, e.g., the output of a text-to-speech synthesizer. A visual-speech animation can offer language learners a concrete illustration of how to move and where to …
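One common way to drive such an animation (an assumption here, not necessarily the paper's method) is to interpolate articulator parameters between viseme keyframes aligned to the speech signal's timeline:

```typescript
// Each keyframe holds a vector of articulator parameters (e.g., lip opening,
// lip rounding, jaw height); the parameter names are illustrative.
interface Keyframe { timeMs: number; params: number[] }

// Linearly interpolate articulator parameters between the two keyframes that
// bracket the current playback time of the speech signal.
function articulatorsAt(timeMs: number, frames: Keyframe[]): number[] {
  if (timeMs <= frames[0].timeMs) return frames[0].params;
  for (let i = 1; i < frames.length; i++) {
    const a = frames[i - 1], b = frames[i];
    if (timeMs <= b.timeMs) {
      const t = (timeMs - a.timeMs) / (b.timeMs - a.timeMs);
      return a.params.map((v, k) => v + t * (b.params[k] - v));
    }
  }
  return frames[frames.length - 1].params;
}
```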
In second language learning, unawareness of the differences between correct and incorrect pronunciations is one of the largest obstacles to mispronunciation correction. To make the feedback more perceptually discriminable, this paper presents a novel method for corrective feedback generation in language learning, namely, exaggerated feedback.
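The snippet below sketches one plausible exaggeration strategy, stretching the duration of the contrasting phone segment so the difference becomes easier to hear and see; the paper's actual exaggeration method may differ, and all names here are hypothetical.

```typescript
// A phone segment in the synthesized utterance.
interface Segment { phone: string; durationMs: number }

// Stretch the duration of the phone that differs between the correct and
// incorrect pronunciations, making the contrast more salient.
function exaggerate(segments: Segment[], targetPhone: string, factor = 1.8): Segment[] {
  return segments.map(s =>
    s.phone === targetPhone ? { ...s, durationMs: s.durationMs * factor } : s
  );
}

// The stretched durations would then drive the audio and visual synthesizers
// so both the speech and the animated articulators dwell on the contrast.
```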
This paper presents our group's latest progress in developing Enunciate, an online computer-aided pronunciation training (CAPT) system for Chinese learners of English. Presently, the system targets segmental pronunciation errors. It consists of an audio-enabled web interface, a speech recognizer for mispronunciation detection and diagnosis, a speech …
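As a hedged illustration of how such components could talk to each other, a diagnosis contract between the web interface and the recognizer might look like the following; all field names are assumptions, not the actual Enunciate API.

```typescript
// Hypothetical per-phone diagnosis returned by the recognizer.
interface PhoneDiagnosis {
  target: string;    // canonical phone, e.g. "th"
  produced: string;  // phone the recognizer detected, e.g. "s"
  startMs: number;
  endMs: number;
}

// Hypothetical per-word result consumed by the web interface to drive
// corrective audiovisual feedback.
interface DiagnosisResult {
  word: string;
  correct: boolean;
  errors: PhoneDiagnosis[]; // empty when the word is pronounced correctly
}
```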
Computer-aided pronunciation training (CAPT) technologies enable the use of automatic speech recognition to detect mispronunciations in second-language (L2) learners' speech. To further facilitate learning, we aim to develop a principle-based method for generating a gradation of mispronunciation severity. This paper presents an approach …
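A simple instance of a principle-based severity measure (an assumption for illustration, not the paper's formula) is to count mismatched distinctive features between the target and produced phones:

```typescript
// Toy distinctive-feature table; real phonological feature sets are larger.
const features: Record<string, Set<string>> = {
  th: new Set(["fricative", "dental", "voiceless"]),
  s:  new Set(["fricative", "alveolar", "voiceless"]),
  t:  new Set(["stop", "alveolar", "voiceless"]),
};

// Count features present in one phone but not the other, so substitutions
// that differ in more features rank as more severe.
function severity(target: string, produced: string): number {
  const a = features[target], b = features[produced];
  let mismatches = 0;
  for (const f of a) if (!b.has(f)) mismatches++;
  for (const f of b) if (!a.has(f)) mismatches++;
  return mismatches; // e.g., severity("th", "s") < severity("th", "t")
}
```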
This paper presents our ongoing research in speech-enabled, multimodal, mobile application development. We have developed a multimodal framework that enables cross-platform development using open-standards-based HTML, CSS, and JavaScript. The framework achieves high extensibility through its plugin-based architecture and provides scalable REST-based …
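A minimal sketch of such a plugin-based core follows; the interface names and the REST endpoint are placeholders, not the framework's real API.

```typescript
// Each modality ships as a plugin that declares the event types it handles.
interface Plugin {
  name: string;
  handles: string[]; // event types, e.g. "audio-in"
  onEvent(type: string, payload: unknown): void;
}

// The core dispatches events to whichever registered plugins apply.
class MultimodalCore {
  private plugins: Plugin[] = [];
  register(p: Plugin) { this.plugins.push(p); }
  dispatch(type: string, payload: unknown) {
    for (const p of this.plugins)
      if (p.handles.includes(type)) p.onEvent(type, payload);
  }
}

// Example: a recorder plugin that posts captured audio to a REST endpoint
// (the URL is a placeholder) and re-dispatches the recognition result.
const core = new MultimodalCore();
core.register({
  name: "recorder",
  handles: ["audio-in"],
  onEvent: async (_type, blob) => {
    const res = await fetch("/api/recognize", { method: "POST", body: blob as Blob });
    core.dispatch("recognition-result", await res.json());
  },
});
```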
We are developing a voice-enabled, mobile, customizable electronic communication book (e-Commu-Book) as an aid for people with speech impairment. When a user touches an image icon on the e-Commu-Book, the caption under the icon is read out by speech synthesis. Caregivers can customize the contents to fit each user's needs. The …
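The core interaction can be sketched with the standard Web Speech API, assuming a browser-based client (the actual implementation may differ):

```typescript
// Speak an icon's caption when the user taps it, using the browser's
// built-in speech synthesis.
function attachSpeakOnTap(icon: HTMLElement, caption: string) {
  icon.addEventListener("click", () => {
    const utterance = new SpeechSynthesisUtterance(caption);
    window.speechSynthesis.speak(utterance);
  });
}
```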
This paper presents our ongoing research in audiometry application development. Many audiometry applications have been developed, but they are all platform-specific. In this paper, we develop a new multimodal framework that enables cross-platform development using open standards such as HTML5, CSS3, and JavaScript. We …
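Assuming a browser host, the tone-presentation core of such an audiometry application can be sketched with the standard Web Audio API; the frequencies and levels below are placeholders:

```typescript
// Play a pure tone at a given frequency and gain for a hearing test step.
function playTone(ctx: AudioContext, freqHz: number, gain: number, durationS = 1) {
  const osc = ctx.createOscillator();
  const amp = ctx.createGain();
  osc.frequency.value = freqHz; // typical test frequencies: 250, 500, 1000 Hz ...
  amp.gain.value = gain;        // attenuated step by step during the test
  osc.connect(amp).connect(ctx.destination);
  osc.start();
  osc.stop(ctx.currentTime + durationS);
}
```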