Nonverbal Sound Detection for Disordered Speech

Colin S. Lea, Zifang Huang, Dhruv Jain, Lauren Tooley, Zeinab Liaghat, Shrinath Thelapurath, Leah Findlater, Jeffrey P. Bigham
Voice assistants have become an essential tool for people with various disabilities because they enable complex phone- or tablet-based interactions without the need for fine-grained motor control, such as with touchscreens. However, these systems are not tuned for the unique characteristics of individuals with speech disorders, including many of those who have a motor-speech disorder, are deaf or hard of hearing, have a severe stutter, or are minimally verbal. We introduce an alternative voice…



Pseudo-Label : The Simple and Efficient Semi-Supervised Learning Method for Deep Neural Networks
A simple and efficient semi-supervised learning method for deep neural networks in which the network is trained in a supervised fashion on labeled and unlabeled data simultaneously, favoring a low-density separation between classes.
Automatic Speech Recognition of Disordered Speech: Personalized Models Outperforming Human Listeners on Short Phrases
This study evaluated the accuracy of personalized automatic speech recognition (ASR) for recognizing disordered speech from a large cohort of individuals with a wide range of underlying etiologies.
A Voice-Activated Switch for Persons with Motor and Speech Impairments: Isolated-Vowel Spotting Using Neural Networks
Severe speech impairments limit the precision and range of producible speech sounds. As a result, generic automatic speech recognition (ASR) and keyword spotting (KWS) systems fail to accurately recognize such speech.
Synthesis of New Words for Improved Dysarthric Speech Recognition on an Expanded Vocabulary
This paper proposes a data augmentation method using voice conversion that allows dysarthric ASR systems to accurately recognize words outside of the training-set vocabulary, and demonstrates that it is possible to synthesize utterances of new words that were never recorded by speakers with dysarthria.
Few-Shot Keyword Spotting in Any Language
A few-shot transfer learning method for keyword spotting in any language, with streaming accuracy investigated for 5-shot models in two contexts: keyword spotting and keyword search.
Sense and Accessibility: Understanding People with Physical Disabilities’ Experiences with Sensing Systems
Findings regarding the many challenges status quo sensing systems present for people with physical disabilities are presented, as well as the ways in which participants responded to these challenges.
learn2learn: A Library for Meta-Learning Research
Meta-learning researchers face two fundamental issues in their empirical work: prototyping and reproducibility. Researchers are prone to making mistakes when prototyping new algorithms and tasks.
Non-Verbal Auditory Input for Controlling Binary, Discrete, and Continuous Input in Automotive User Interfaces
The results reveal that, although clapping hands was initially preferred for making input in an online survey, snapping fingers (for binary and discrete input) and humming (for continuous input) are the preferred NVAI modalities while driving.
Quartznet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions
A new end-to-end neural acoustic model for automatic speech recognition that achieves near state-of-the-art accuracy on LibriSpeech and Wall Street Journal, while having fewer parameters than all competing models.
Vowel-Specific Intelligibility and Acoustic Patterns in Individuals With Dysarthria Secondary to Amyotrophic Lateral Sclerosis.
Vowel-specific intelligibility patterns are different across severity groups; particularly, low intelligibility of /ɪ/ was noted in individuals with severe dysarthria, which is related to both vowel-specific characteristics and group-specific articulatory control dysfunction.