Towards a Speaker Independent Speech-BCI Using Speaker Adaptation

@inproceedings{Dash2019TowardsAS,
  title={Towards a Speaker Independent Speech-BCI Using Speaker Adaptation},
  author={Debadatta Dash and Alan Wisler and Paul Ferrari and Jun Wang},
  booktitle={INTERSPEECH},
  year={2019}
}
Neurodegenerative diseases such as amyotrophic lateral sclerosis (ALS) can cause locked-in syndrome (fully paralyzed but aware). A brain-computer interface (BCI) may be the only option for restoring communication in these patients. Current BCIs typically use visual or attention correlates in neural activity to select letters randomly displayed on a screen, a process that is extremely slow (a few words per minute). Speech-BCIs, which aim to convert brain activity patterns to speech (neural speech decoding), hold…
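
The abstract is truncated, so the paper's exact adaptation procedure is not shown here. As a generic, hedged illustration of what speaker adaptation for neural decoding can look like, the sketch below standardizes each speaker's features to a common scale before a shared decoder is applied; the function name and the z-scoring approach are assumptions for illustration, not details from the paper.

```python
import numpy as np

def adapt_features(X: np.ndarray) -> np.ndarray:
    """Z-score each feature dimension within one speaker's data.

    X has shape (n_trials, n_features). Mapping every speaker onto
    zero mean / unit variance is one simple form of speaker adaptation:
    it removes per-speaker offset and scale so a decoder trained on
    other speakers transfers better. (Illustrative only; the paper's
    actual adaptation method may differ.)
    """
    mu = X.mean(axis=0, keepdims=True)
    sigma = X.std(axis=0, keepdims=True) + 1e-8  # avoid divide-by-zero
    return (X - mu) / sigma

# Hypothetical usage: adapt each held-out speaker's features
# independently, then decode with a classifier trained on the pooled,
# adapted features of the remaining speakers.
```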

Citations

Neural Speech Decoding for Amyotrophic Lateral Sclerosis

TLDR
Investigation of the decoding of imagined and spoken phrases from non-invasive magnetoencephalography signals of ALS subjects, using several spectral features with seven machine learning decoders, indicated that the decoding performance for ALS patients is lower than that of healthy subjects yet significantly higher than chance level.
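
As a rough sketch of the kind of pipeline this summary describes (spectral features fed to standard machine-learning decoders), the following computes band-power features with Welch's method and fits an SVM. The sampling rate, band edges, and choice of SVM are assumptions for illustration, not details from the paper.

```python
import numpy as np
from scipy.signal import welch
from sklearn.svm import SVC

FS = 1000  # assumed MEG sampling rate (Hz)
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
         "beta": (13, 30), "gamma": (30, 100)}  # canonical bands

def band_power_features(trial: np.ndarray) -> np.ndarray:
    """trial: (n_channels, n_samples) MEG trial -> flat vector of
    mean power in each canonical frequency band, per channel."""
    freqs, psd = welch(trial, fs=FS, nperseg=256, axis=-1)
    feats = [psd[:, (freqs >= lo) & (freqs < hi)].mean(axis=-1)
             for lo, hi in BANDS.values()]
    return np.concatenate(feats)

def train_decoder(trials, labels):
    """Fit one of the 'several machine learning decoders' (here an
    SVM, chosen for illustration) on single-trial spectral features."""
    X = np.stack([band_power_features(t) for t in trials])
    return SVC(kernel="rbf").fit(X, labels)
```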

NeuroVAD: Real-Time Voice Activity Detection from Non-Invasive Neuromagnetic Signals

TLDR
This study attempted to automatically detect voice/speech activity directly from neural signals recorded using magnetoencephalography (MEG), and demonstrated the possibility of real-time VAD directly from non-invasive neural signals with about 88% accuracy.
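
A minimal sketch of frame-wise, streaming voice-activity detection from a neural signal follows; the frame length, the RMS-energy feature, and the pre-trained classifier `clf` are assumptions for illustration, not the paper's settings.

```python
import numpy as np

FS = 1000    # assumed MEG sampling rate (Hz)
FRAME = 200  # 200 ms non-overlapping analysis frames (assumed)

def stream_vad(signal: np.ndarray, clf) -> list:
    """Slide a frame over a single-channel neural signal and label each
    frame speech (1) / non-speech (0) with a pre-trained classifier
    `clf` (e.g., fit on energy features of labeled frames).
    Frame-by-frame operation is what makes real-time use possible."""
    decisions = []
    for start in range(0, len(signal) - FRAME + 1, FRAME):
        frame = signal[start:start + FRAME]
        rms = np.sqrt(np.mean(frame ** 2))  # simple energy feature
        decisions.append(int(clf.predict([[rms]])[0]))
    return decisions
```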

Decoding Imagined and Spoken Phrases From Non-invasive Neural (MEG) Signals

TLDR
This study investigated the decoding of five imagined and spoken phrases from single-trial, non-invasive magnetoencephalography (MEG) signals collected from eight adult subjects and found convolutional neural networks applied to the spatial, spectral, and temporal features extracted from the MEG signals to be highly effective.
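
The summary mentions convolutional neural networks over spatial, spectral, and temporal MEG features; a minimal PyTorch sketch of such a classifier is below. Layer sizes, input shape, and the 2-D convolution layout are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class MEGPhraseCNN(nn.Module):
    """Toy CNN over a (1, n_features, n_timesteps) MEG representation,
    classifying one of five phrases. Dimensions are illustrative."""
    def __init__(self, n_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),  # shape-agnostic pooling
        )
        self.classifier = nn.Linear(32 * 4 * 4, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_features, n_timesteps)
        return self.classifier(self.features(x).flatten(1))

# e.g. logits = MEGPhraseCNN()(torch.randn(8, 1, 64, 128))
```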

Decoding Speech Evoked Jaw Motion from Non-invasive Neuromagnetic Oscillations

TLDR
Experimental results indicated that the jaw kinematics can be successfully decoded from non-invasive neural (MEG) signals.

Neural Speech Decoding with Magnetoencephalography

TLDR
Promising results have been shown with MEG for speech decoding, providing a foundation for future wearable MEG-based speech-BCI applications.

Role of Brainwaves in Neural Speech Decoding

Neural speech decoding aims at direct decoding of speech from the brain to restore speech communication in patients with locked-in syndrome (fully paralyzed but aware). Despite the recent progress, …

Silent Speech Interfaces for Speech Restoration: A Review

TLDR
A number of challenges remain to be addressed in future research before SSIs can be promoted to real-world applications; by restoring communication capabilities, future SSIs will improve the lives of persons with severe speech impairments.

Vocal Imagery vs Intention: Viability of Vocal-Based EEG-BCI Paradigms

TLDR
This study clearly shows that vocal imagery (VIm) can be classified with ease and is a viable paradigm to integrate into BCIs; this is useful not only for people with speech problems but for BCI users in general, giving them another dimension of system control.

Exploring Silent Speech Interfaces Based on Frequency-Modulated Continuous-Wave Radar

TLDR
This article explores contactless frequency-modulated continuous-wave radar to assess its potential for SSI development, and establishes promising grounds for further exploring this technology for silent speech recognition.

RaSSpeR: Radar-Based Silent Speech Recognition

TLDR
Continuous-wave radar is explored to assess its potential for SSI, with encouraging results that establish promising grounds for further exploration of this technology for silent speech recognition.

References

Showing 1–10 of 35 references

Towards decoding speech production from single-trial magnetoencephalography (MEG) signals

TLDR
This project investigated the possibility of decoding spoken phrases from non-invasive brain activity (MEG) signals and found that a direct brain-to-text mapping approach may provide a significantly faster communication rate than current BCIs.

Overt Speech Retrieval from Neuromagnetic Signals Using Wavelets and Artificial Neural Networks

TLDR
This study tried to decode overt speech directly from the brain via non-invasive magnetoencephalography (MEG) signals to build the foundation for a faster, direct brain-to-text mapping BCI.

Brain-to-text: decoding spoken phrases from phone representations in the brain

TLDR
It is shown for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic recordings, and this approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones.

Determining the Optimal Number of MEG Trials: A Machine Learning and Speech Decoding Perspective

TLDR
Wavelet-based denoising increased the SNR of the neural signal prior to analysis and facilitated accurate speech decoding using as few as 40 single trials, which may open up the possibility of limiting MEG trials for other task-evoked studies as well.
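
As a hedged illustration of wavelet-based denoising of single trials, the sketch below soft-thresholds detail coefficients with PyWavelets; the wavelet family, decomposition level, and universal-threshold rule are common defaults assumed here, not necessarily the paper's choices.

```python
import numpy as np
import pywt

def wavelet_denoise(trial: np.ndarray, wavelet: str = "db4",
                    level: int = 4) -> np.ndarray:
    """Soft-threshold the detail coefficients of a 1-D neural signal.

    Uses the universal threshold sigma*sqrt(2*log(n)), with the noise
    level estimated robustly from the finest-scale coefficients.
    Averaging denoised single trials then raises the SNR of the
    evoked response.
    """
    coeffs = pywt.wavedec(trial, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745  # robust noise estimate
    thresh = sigma * np.sqrt(2 * np.log(len(trial)))
    denoised = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft")
                              for c in coeffs[1:]]
    return pywt.waverec(denoised, wavelet)[: len(trial)]
```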

Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces

TLDR
It is found that real-time synthesis of vowels and consonants is possible with good intelligibility, opening the way to future speech-BCI applications using such an articulatory-based speech synthesizer.

Speaker-Independent Silent Speech Recognition From Flesh-Point Articulatory Movements Using an LSTM Neural Network

TLDR
This paper adopts a bidirectional long short-term memory recurrent neural network (BLSTM) to model articulatory movements with long-range articulatory history, and proposes Procrustes matching-based articulatory normalization that removes locational, rotational, and scaling differences.
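
As a loose numpy sketch of the Procrustes-style normalization described above (the 2-D flesh-point coordinates, variable names, and SVD rotation convention are all assumptions for illustration):

```python
import numpy as np

def procrustes_normalize(points: np.ndarray) -> np.ndarray:
    """Remove location, scale, and rotation from flesh-point trajectories.

    points: (n_samples, 2) x/y positions of an articulatory sensor.
    Centering removes location, dividing by the Frobenius norm removes
    scale, and rotating onto the principal axes removes orientation,
    making trajectories from different speakers comparable.
    """
    centered = points - points.mean(axis=0)        # remove translation
    scaled = centered / np.linalg.norm(centered)   # remove scale
    # Rotate onto principal axes via SVD (one convention for rotation)
    _, _, vt = np.linalg.svd(scaled, full_matrices=False)
    return scaled @ vt.T
```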

Speech synthesis from ECoG using densely connected 3D convolutional neural networks

TLDR
High-quality speech is reconstructed for the first time from neural recordings made during speech production using deep neural networks; the approach uses a densely connected convolutional neural network topology, which is well suited to the small amount of data available from each participant.
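
To make "densely connected" concrete, here is a minimal 3-D dense block in PyTorch: each layer receives the concatenation of all earlier feature maps. Channel counts, depth, and input layout are illustrative assumptions, not the paper's network.

```python
import torch
import torch.nn as nn

class Dense3DBlock(nn.Module):
    """Minimal densely connected 3-D convolutional block."""
    def __init__(self, in_ch: int = 1, growth: int = 8, n_layers: int = 3):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv3d(ch, growth, kernel_size=3, padding=1),
                nn.ReLU()))
            ch += growth  # each layer sees all previous feature maps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, depth, height, width) ECoG feature volume
        for layer in self.layers:
            x = torch.cat([x, layer(x)], dim=1)  # dense connectivity
        return x
```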

Decoding spoken words using local field potentials recorded from the cortical surface

TLDR
It is found that the pattern of electrodes with the highest accuracy changed for each word, which supports the idea that closely spaced micro-electrodes are capable of capturing neural signals from independent neural processing assemblies.

Biosignal-Based Spoken Communication: A Survey

TLDR
An overview of the various modalities, research approaches, and objectives for biosignal-based spoken communication is given.

Speech synthesis from neural decoding of spoken sentences

TLDR
A neural decoder is presented that explicitly leverages kinematic and sound representations encoded in human cortical activity to synthesize audible speech; the synthesized speech was readily identified and transcribed by listeners, and the decoder could synthesize speech even when a participant silently mimed sentences.
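
The summary describes a decoder that goes through an intermediate kinematic representation on the way to sound. A toy two-stage sketch of that idea follows; the plain LSTM stages and all dimensions are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TwoStageSpeechDecoder(nn.Module):
    """Toy two-stage decoder: cortical activity -> articulatory
    kinematics -> acoustic features (then a vocoder would render
    audible speech). Sizes and layer types are illustrative."""
    def __init__(self, n_electrodes: int = 256,
                 n_kinematic: int = 32, n_acoustic: int = 32):
        super().__init__()
        self.neural_to_kinematic = nn.LSTM(n_electrodes, n_kinematic,
                                           batch_first=True)
        self.kinematic_to_acoustic = nn.LSTM(n_kinematic, n_acoustic,
                                             batch_first=True)

    def forward(self, neural: torch.Tensor) -> torch.Tensor:
        # neural: (batch, time, n_electrodes)
        kinematics, _ = self.neural_to_kinematic(neural)
        acoustics, _ = self.kinematic_to_acoustic(kinematics)
        return acoustics
```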