Corpus ID: 235790319

Machine Learning for Stuttering Identification: Review, Challenges & Future Directions

@article{Sheikh2021MachineLF,
  title={Machine Learning for Stuttering Identification: Review, Challenges \& Future Directions},
  author={Shakeel Ahmad Sheikh and Md. Sahidullah and Fabrice Hirsch and Slim Ouni},
  journal={ArXiv},
  year={2021},
  volume={abs/2107.04057}
}
Stuttering is a speech disorder during which the flow of speech is interrupted by involuntary pauses and repetition of sounds. Stuttering identification is an interesting interdisciplinary research problem involving pathology, psychology, acoustics, and signal processing, which makes it hard and complicated to detect. Recent developments in machine and deep learning have dramatically revolutionized the speech domain; however, minimal attention has been given to stuttering identification…

References

StutterNet: Stuttering Detection Using Time Delay Neural Network
TLDR
StutterNet is introduced, a novel deep-learning-based stuttering detection method that is capable of detecting and identifying various types of disfluencies and outperforms the state-of-the-art residual neural network based method.
Detecting Multiple Speech Disfluencies Using a Deep Residual Network with Bidirectional Long Short-Term Memory
TLDR
This work proposes a model that relies solely on acoustic features, allowing for identification of several variations of stutter disfluencies without the need for speech recognition, outperforming the state-of-the-art by almost 27%.
A Dynamic, Self Supervised, Large Scale AudioVisual Dataset for Stuttered Speech
TLDR
An end-to-end, real-time, multi-modal model is proposed for detection and classification of stuttered blocks in unbound speech; using acoustic signals together with secondary characteristics exhibited in visual signals is expected to permit an increased accuracy of detection.
Overview of Automatic Stuttering Recognition System
Stuttering is a speech disorder. The flow of speech is disrupted by involuntary repetitions and prolongation of sounds, syllables, words or phrases, and involuntary silent pauses or blocks in…
Sequence labeling to detect stuttering events in read speech
TLDR
Evaluating the ability of two machine learning approaches, namely conditional random fields (CRF) and bi-directional long-short-term memory (BLSTM), to detect stuttering events in transcriptions of stuttering speech finds that adding more data to train the CRF and BLSTM classifiers consistently improves the results.
Automatic detection of prolongations and repetitions using LPCC
TLDR
This work describes how particular stuttering events, namely repetitions and prolongations, can be located in stuttered speech with a feature extraction algorithm, and shows that LPCC features and a classifier can be used to recognize repetitions and prolongations in stuttered speech with the best accuracy.
A Comparative Study of the Techniques for Feature Extraction and Classification in Stuttering
Disability in speech concerns many other communication problems such as hearing and fluency. Stuttering is a neurodevelopmental disorder identified by the existence of dysfluencies during speech…
Automatic dysfluency detection in dysarthric speech using deep belief networks
TLDR
Different types of input features used by deep neural networks (DNNs) to automatically detect repetition stuttering and non-speech dysfluencies within dysarthric speech are investigated.
A Lightly Supervised Approach to Detect Stuttering in Children's Speech
TLDR
This work uses a lightly supervised approach based on task-oriented lattices to recognise the stuttering speech of children performing a standard reading task, and proposes a training regime to address this problem and preserve a full verbatim output of stuttered speech.
Hierarchical ANN system for stuttering identification
TLDR
Various types of MLP networks were examined with respect to their ability to classify utterances correctly into two groups, non-fluent and fluent, and classification correctness exceeded 84-100% depending on the disfluency type.