Machine translation of cortical activity to text with an encoder–decoder framework

@article{Makin2020MachineTO,
  title={Machine translation of cortical activity to text with an encoder–decoder framework},
  author={Joseph G. Makin and David A. Moses and Edward F. Chang},
  journal={Nature Neuroscience},
  year={2020},
  volume={23},
  pages={575--582}
}
A decade after speech was first decoded from human brain signals, accuracy and speed remain far below that of natural speech. Here we show how to decode the electrocorticogram with high accuracy and at natural-speech rates. Taking a cue from recent advances in machine translation, we train a recurrent neural network to encode each sentence-length sequence of neural activity into an abstract representation, and then to decode this representation, word by word, into an English sentence. For each… 
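The abstract describes an encoder–decoder pipeline: an RNN folds a sentence-length sequence of neural activity into one fixed representation, and a decoder emits the sentence word by word. A minimal, untrained sketch of that idea, with made-up sizes, vocabulary, and random weights standing in for the paper's actual trained model:

```python
import numpy as np

# Illustrative sketch only: the feature size, hidden size, vocabulary,
# and random weights are assumptions, not the paper's architecture.
rng = np.random.default_rng(0)

VOCAB = ["<eos>", "the", "cat", "sat"]
N_FEAT, N_HID = 16, 32  # ECoG features per time step, hidden units

W_enc_x = rng.normal(0, 0.1, (N_HID, N_FEAT))
W_enc_h = rng.normal(0, 0.1, (N_HID, N_HID))
W_dec_h = rng.normal(0, 0.1, (N_HID, N_HID))
W_out = rng.normal(0, 0.1, (len(VOCAB), N_HID))

def encode(neural_seq):
    """Fold a (T, N_FEAT) neural sequence into one abstract representation."""
    h = np.zeros(N_HID)
    for x in neural_seq:
        h = np.tanh(W_enc_x @ x + W_enc_h @ h)
    return h

def decode(h, max_words=10):
    """Greedily emit words from the representation until <eos>."""
    words = []
    for _ in range(max_words):
        h = np.tanh(W_dec_h @ h)
        word = VOCAB[int(np.argmax(W_out @ h))]
        if word == "<eos>":
            break
        words.append(word)
    return words

# 50 time steps of (random) neural activity in, a word sequence out.
sentence = decode(encode(rng.normal(size=(50, N_FEAT))))
print(sentence)
```

With trained weights, the same forward pass would map real electrocorticographic feature sequences to English sentences; here it only shows the data flow.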

Brain2Char: a deep architecture for decoding text from brain recordings

TLDR
These results establish a new end-to-end approach to decoding text from brain signals and demonstrate the potential of Brain2Char as a high-performance communication BCI.

Decoding spoken English from intracortical electrode arrays in dorsal precentral gyrus

TLDR
The ability to decode speech using intracortical electrode array signals from a nontraditional speech area suggests that placing electrode arrays in ventral speech areas is a promising direction for speech BCIs.

A dual‐channel language decoding from brain activity with progressive transfer training

TLDR
A dual-channel language decoding model (DC-LDM) is built to decode the neural activities evoked by images into language (phrases or short sentences) and it is found that Word2vec-Cosine similarity (WCS) was the best indicator to reflect the similarity between the decoded and the annotated texts.
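The Word2vec-Cosine similarity (WCS) indicator mentioned above scores a decoded text against the annotated text by comparing word-embedding vectors. A minimal sketch, using tiny made-up 4-dimensional embeddings rather than trained word2vec vectors:

```python
import numpy as np

# Toy embeddings for illustration only; real WCS would use trained
# word2vec vectors of the decoded and annotated texts.
emb = {
    "dog":   np.array([0.9, 0.1, 0.0, 0.2]),
    "puppy": np.array([0.8, 0.2, 0.1, 0.3]),
    "car":   np.array([0.0, 0.9, 0.8, 0.1]),
}

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def sentence_vec(words):
    """Represent a text by the mean embedding of its known words."""
    return np.mean([emb[w] for w in words if w in emb], axis=0)

# Semantically close words score higher than unrelated ones.
sim = cosine(sentence_vec(["dog"]), sentence_vec(["puppy"]))
dissim = cosine(sentence_vec(["dog"]), sentence_vec(["car"]))
```

Averaging embeddings and taking the cosine gives a similarity in [-1, 1] that rewards semantically close decodings even when the exact words differ.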

Imagined speech can be decoded from low- and cross-frequency features in perceptual space

TLDR
It is demonstrated that low-frequency power and cross-frequency dynamics contain key information for imagined speech decoding, and that exploring perceptual spaces offers a promising avenue for future imagined speech BCIs.

Decoding spoken English phonemes from intracortical electrode arrays in dorsal precentral gyrus

TLDR
The ability to decode a comprehensive set of phonemes using intracortical electrode array signals from a nontraditional speech area suggests that placing electrode arrays in ventral speech areas is a promising direction for speech BCIs.

Imagined speech can be decoded from low- and cross-frequency intracranial EEG features

TLDR
It is demonstrated using human intracranial recordings that both low- and higher-frequency power and local cross-frequency features contribute to imagined speech decoding, in particular in phonetic and vocalic spaces.

Towards Naturalistic Speech Decoding from Intracranial Brain Data

TLDR
A novel approach to speech decoding that relies on a generative adversarial neural network (GAN) to reconstruct speech from brain data recorded during a naturalistic speech listening task (watching a movie) is described.

Machine learning algorithm for decoding multiple subthalamic spike trains for speech brain–machine interfaces

TLDR
This study demonstrates that the information encoded by single neurons in the STN about the production, perception and imagery of speech is suitable for high-accuracy decoding, an important step towards speech-restoration BMIs with enormous potential to alleviate the suffering of completely paralyzed patients and allow them to communicate with their environment again.

Deep learning approaches for neural decoding across architectures and recording modalities

TLDR
The architectures used for extracting useful features from neural recording modalities ranging from spikes to functional magnetic resonance imaging are described, and areas for future scientific development are pointed out.

Neuroprosthesis for Decoding Speech in a Paralyzed Person with Anarthria.

TLDR
In a person with anarthria and spastic quadriparesis caused by a brain-stem stroke, words and sentences were decoded directly from cortical activity during attempted speech with the use of deep-learning models and a natural-language model.
...

References

Showing 1–10 of 58 references

Brain-to-text: decoding spoken phrases from phone representations in the brain

TLDR
It is shown for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic recordings, and this approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones.

Real-time decoding of question-and-answer speech dialogue using human cortical activity

TLDR
It is demonstrated that the context of a verbal exchange can be used to enhance neural decoder performance in real time, and that contextual integration of decoded question likelihoods significantly improves answer decoding.

Decoding Speech from Intracortical Multielectrode Arrays in Dorsal “Arm/Hand Areas” of Human Motor Cortex

TLDR
Recordings from two 96-electrode arrays chronically implanted into the 'hand knob' area of motor cortex while a person with tetraplegia spoke suggest that high-fidelity speech prostheses may be possible using large-scale intracortical recordings in motor cortical areas involved in controlling speech articulators.

Phonetic Feature Encoding in Human Superior Temporal Gyrus

TLDR
High-density direct cortical surface recordings in humans while they listened to natural, continuous speech were used to reveal the STG representation of the entire English phonetic inventory, demonstrating the acoustic-phonetic representation of speech in human STG.

Functional Organization of Human Sensorimotor Cortex for Speech Articulation

TLDR
High-resolution, multi-electrode cortical recordings during the production of consonant-vowel syllables reveal the dynamic organization of speech sensorimotor cortex during the generation of multi-articulator movements that underlies the ability to speak.

Classification of Intended Phoneme Production from Chronic Intracortical Microelectrode Recordings in Speech-Motor Cortex

TLDR
Preliminary results suggest supervised classification techniques are capable of performing large scale multi-class discrimination for attempted speech production and may provide the basis for future communication prostheses.

On the Properties of Neural Machine Translation: Encoder–Decoder Approaches

TLDR
It is shown that the neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase.

Differential Representation of Articulatory Gestures and Phonemes in Precentral and Inferior Frontal Gyri

TLDR
This work investigates the cortical representation of articulatory gestures and phonemes in ventral precentral and inferior frontal gyri in men and women and suggests that speech production shares a common cortical representation with that of other types of movement, such as arm and hand movements.

Decoding vowels and consonants in spoken and imagined words using electrocorticographic signals in humans

TLDR
It is found that it is possible to use signals recorded from the surface of the brain (electrocorticography) to discriminate the vowels and consonants embedded in spoken and in imagined words, and the cortical areas that held the most information about discrimination of vowels and consonants are defined.
...