• Publications
An audio-visual corpus for speech perception and automatic speech recognition.
TLDR
An audio-visual corpus that consists of high-quality audio and video recordings of 1000 sentences spoken by each of 34 talkers to support the use of common material in speech perception and automatic speech recognition studies.
The third ‘CHiME’ speech separation and recognition challenge: Dataset, task and baselines
TLDR
Presents the design and outcomes of the 3rd CHiME Challenge, which targets the performance of automatic speech recognition in a real-world, commercially motivated scenario: a person talking to a tablet device fitted with a six-channel microphone array.
The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines
TLDR
Introduces the 5th CHiME Challenge, which considers the task of distant multi-microphone conversational ASR in real home environments, and describes the data collection procedure, the task, and the baseline systems for array synchronization, speech enhancement, and conventional and end-to-end ASR.
The foreign language cocktail party problem: Energetic and informational masking effects in non-native speech perception.
TLDR
Both groups drew equal benefit from differences in mean F0 between target and masker, suggesting that processes which make use of this cue do not engage language-specific knowledge.
The second ‘CHiME’ speech separation and recognition challenge: Datasets, tasks and baselines
TLDR
This paper is intended to be a reference on the 2nd 'CHiME' Challenge, an initiative designed to analyze and evaluate the performance of ASR systems in a real-world domestic environment.
Soft decisions in missing data techniques for robust automatic speech recognition
TLDR
The theory and promise of the Missing Data approach to robust Automatic Speech Recognition are developed, and the probability calculation is adapted to use soft reliability estimates as weighting factors for the complementary reliable/unreliable interpretations of each feature vector component.
The CHiME corpus: a resource and a challenge for computational hearing in multisource environments
TLDR
CHiME, a new corpus designed for noise-robust speech processing research, includes around 40 hours of background recordings from a head and torso simulator positioned in a domestic setting, and a comprehensive set of binaural impulse responses collected in the same environment.