Corpus ID: 235446759

Detection of Consonant Errors in Disordered Speech Based on Consonant-vowel Segment Embedding

Si-Ioi Ng, C. Ng, Jingyu Li, Tan Lee

Speech sound disorder (SSD) refers to a type of developmental disorder in young children who encounter persistent difficulties in producing certain speech sounds at the expected age. Consonant errors are the major indicator of SSD in clinical assessment. Previous studies on automatic assessment of SSD revealed that detection of speech errors concerning short and transitory consonants is less satisfactory. This paper investigates a neural network based approach to detecting consonant errors in…

Automatic Detection of Phonological Errors in Child Speech Using Siamese Recurrent Autoencoder
A study on automatic detection of phonological errors in Cantonese speech of kindergarten children, based on a newly collected large speech corpus, using a Siamese recurrent autoencoder trained to learn the similarity and discrepancy between phone segments in the embedding space.
Child Speech Disorder Detection with Siamese Recurrent Network Using Speech Attribute Features
A novel design of child speech disorder detection system that requires only normal speech for model training, based on a Siamese recurrent network, which is trained to learn the similarity and discrepancy of pronunciations between a pair of phones in the embedding space.
Automated Screening of Speech Development Issues in Children by Identifying Phonological Error Patterns
Results indicate that the proposed system is viable, identifying phonological error patterns with up to 94% accuracy, and directions for further development are outlined in the paper.
Predicting Clinical Evaluations of Children's Speech with Limited Data Using Exemplar Word Template References
Multiple linear regression on the difference scores was shown to be effective at producing predictions that were well-correlated with human clinical evaluations and how consistent the clinicians were at scoring children’s speech production.
Automatic analysis of pronunciations for children with speech sound disorders
It is found that pronunciation models that use explicit knowledge about error pronunciation patterns can lead to more accurate classification of whether a phoneme was correctly pronounced or not, and this paper proposes two new GOP techniques.
CUCHILD: A Large-Scale Cantonese Corpus of Child Speech for Phonology and Articulation Assessment
The design and development of CUCHILD, a large-scale Cantonese corpus of child speech, are described in detail, including selection of words, participant recruitment, the data acquisition process, and data pre-processing.
A comparison of GMM-HMM and DNN-HMM based pronunciation verification techniques for use in the assessment of childhood apraxia of speech
A pronunciation verification method to be used in an automatic assessment therapy tool of child disordered speech that creates a phone-based search lattice that is flexible enough to cover all probable mispronunciations.
Automatic Prediction of Speech Intelligibility Based on X-Vectors in the Context of Head and Neck Cancer
This paper investigates an automatic prediction of speech intelligibility using the x-vector paradigm, in the context of head and neck cancer, and suggests a high correlation rate and the possibility of achieving very high correlation values.
Transfer Learning from Adult to Children for Speech Recognition: Evaluation, Analysis and Recommendations
This work attempts to address the key challenges using transfer learning from adult's models to children's models in a Deep Neural Network (DNN) framework for children's Automatic Speech Recognition (ASR) task evaluating on multiple children's speech corpora with a large vocabulary.
Improving DNN-Based Automatic Recognition of Non-native Children Speech with Adult Speech
The experimental results show that the best recognition performance can be achieved by combining children's training data with adult training data of approximately the same size and initializing the DNN with the weights obtained by pre-training using the full training set of the adult corpus.