• Corpus ID: 28769969

Towards Automatic Mispronunciation Detection in Singing

Chitralekha Gupta, David Grunberg, Preeti Rao, Ye Wang
International Society for Music Information Retrieval Conference
A tool for automatic pronunciation evaluation of singing is desirable for those learning a second language. However, efforts to obtain pronunciation rules for such a tool have been hindered by a lack of data: while many spoken-word datasets exist that can be used in developing the tool, there are relatively few sung-lyrics datasets for such a purpose. In this paper, we demonstrate a proof-of-principle for automatic pronunciation evaluation in singing using a knowledge-based approach with limited… 

Automatic Pronunciation Evaluation of Singing

This work applies a singing-adapted automatic speech recognizer (ASR) in a two-stage approach for evaluating the pronunciation of singing and shows that the automatic evaluation scheme offers quality scores that are close to human judgments.

Computational Methods for Melody and Voice Processing in Music Recordings (Dagstuhl Seminar 19052)

Current challenges in academic and industrial research in view of the recent advances in deep learning and data-driven models are discussed and novel applications of these technologies in music and multimedia retrieval, content creation, musicology, education, and human-computer interaction are explored.

Musical Aptitude and foreign language receptive pronunciation

There is a growing body of literature that recognises the relationship between musical aptitude and language proficiency. Language is usually segmented into its subcategories when addressed in

Using music technology to motivate foreign language learning

Music is a fun and engaging form of entertainment and is often used by teachers to help students learn languages. In this paper, we describe how recent advances in music technology can be used to



Automatic pronunciation error detection in non-native speech: the case of vowel errors in Dutch.

The results of the two studies show that error patterns bear information that can be usefully employed in weighted automatic measures of pronunciation quality, and it appears that combining such a weighted metric with existing measures improves the equal error rate.

Analysis of English Pronunciation of Singing Voices Sung by Japanese Speakers

It was found that the pronunciation scores of singing voices from singers with singing experience were higher than those of their spoken speech, which might mean that singing experience improves skill in singing English.

Automatic scoring of pronunciation quality

Phonetic Rules for Diagnosis of Pronunciation Errors

The paper describes the use of phonetic transcription rules in a component for diagnosis of pronunciation errors produced by students of a foreign language. The module is part of a computer-aided

The Effect of Singing on Improving Syllabic Pronunciation - Vowel Epenthesis in Japanese

The results suggest that at least young adult native speakers of Japanese do have sensitivity to the syllabic structure of English, and that this can be reinforced by specific tasks such as songs that focus on syllable timing.

An Analysis of Pronunciation Errors Made by Indonesian Singers in Malang in Singing English Songs

This study is conducted to find out the pronunciation errors made by Indonesian singers in singing English songs. In collecting the data, the writer used the recorded material of the live performances

Cross-lingual speech recognition under runtime resource constraints

The results show that the AM merging technique performs the best, achieving 60% relative WER reduction over the IPA-based technique.

The NUS sung and spoken lyrics corpus: A quantitative comparison of singing and speech

The NUS Sung and Spoken Lyrics Corpus (NUS-48E corpus) is presented as the first step toward a large, phonetically annotated corpus for singing voice research, and duration analyses of the sung and spoken lyrics are conducted.

TIMIT Acoustic-Phonetic Continuous Speech Corpus

Speech recognition based on phones is very attractive since it is inherently free from vocabulary limitations, but the performance of large-vocabulary ASR systems depends on the quality of the phone recognizer, so research teams continue to develop phone recognizers and enhance their performance as much as possible.

Assessing vowel quality for singing evaluation

  • M. Jha, P. Rao
  • Physics
    2012 National Conference on Communications (NCC)
  • 2012
Acoustic features combining spectrum envelope and pitch are used with classifiers trained on sung vowels for classification of test vowels segmented from the audio of solo singing.