Learn More
As speech recognition systems are used in ever more applications , it is crucial for the systems to be able to deal with ac-cented speakers. Various techniques, such as acoustic model adaptation and pronunciation adaptation, have been reported to improve the recognition of non-native or accented speech. In this paper, we propose a new approach that combines(More)
We present a pronunciation error detection method for second language learners of English (L2 learners). The method is a combination of confidence scoring and landmark-based Support Vector Machines (SVMs). Landmark-based SVMs were implemented to specialize the method for the specific phonemes with which L2 learners make frequent errors. The method was(More)
This study provides a method that identifies problematic responses which make automated speech scoring difficult. When automated scoring is used in the context of a high stakes language proficiency assessment , for which the scores are used to make consequential decisions, some test takers may have an incentive to try to game the system in order to(More)
In this paper we investigate unsuper-vised name transliteration using comparable corpora, corpora where texts in the two languages deal in some of the same topics — and therefore share references to named entities — but are not translations of each other. We present two distinct methods for transliteration, one approach using an unsupervised phonetic(More)
In this paper we investigate named entity transliteration based on a phonetic scoring method. The phonetic method is computed using phonetic features and carefully designed pseudo features. The proposed method is tested with four languages – Arabic, Chinese, Hindi and Korean – and one source language – English, using comparable corpora. The proposed method(More)
abstract This work reports on the construction of a rated database of spontaneous speech produced by second language (L2) learners of English. Spontaneous speech was collected from 28 L2 speakers representing six language backgrounds and five different proficiency levels. Speech was elicited using formats similar to that of the TOEFL iBT and the SPEAK(More)
This study presents a novel method that measures English language learners' syntactic competence towards improving automated speech scoring systems. In contrast to most previous studies which focus on the length of production units such as the mean length of clauses, we focused on capturing the differences in the distribution of morpho-syntactic features or(More)