The CHiME challenge series aims to advance far-field speech recognition technology by promoting research at the interface of signal processing and automatic speech recognition. This paper presents the design and outcomes of the 3rd CHiME Challenge, which targets the performance of automatic speech recognition in a real-world, commercially motivated …
An audio-visual corpus has been collected to support the use of common material in speech perception and automatic speech recognition studies. The corpus consists of high-quality audio and video recordings of 1000 sentences spoken by each of 34 talkers. Sentences are simple, syntactically identical phrases such as "place green at B 4 now". Intelligibility …
Distant-microphone speech recognition systems that operate with humanlike robustness remain a distant goal. The key difficulty is that operating in everyday listening conditions entails processing a speech signal that is reverberantly mixed into a noise background composed of multiple competing sound sources. This paper describes a recent speech recognition …
Distant-microphone automatic speech recognition (ASR) remains a challenging goal in everyday environments involving multiple background sources and reverberation. This paper is intended to be a reference on the 2nd ‘CHiME’ Challenge, an initiative designed to analyze and evaluate the performance of ASR systems in a real-world domestic environment. Two …
We present a new corpus designed for noise-robust speech processing research, CHiME. Our goal was to produce material that is both natural (derived from reverberant domestic environments with many simultaneous and unpredictable sound sources) and controlled (providing an enumerated range of SNRs spanning 20 dB). The corpus includes around 40 hours of …
In previous work we have developed the theory and demonstrated the promise of the Missing Data approach to robust Automatic Speech Recognition. This technique is based on hard decisions as to whether each time-frequency "pixel" is either reliable or unreliable. In this paper we replace these discrete decisions with soft estimates of the probability that …
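The contrast between hard and soft reliability decisions in the abstract above can be sketched as follows. This is only an illustration, not the authors' implementation: it assumes that a local SNR estimate (in dB) is available for each time-frequency pixel and that a sigmoid maps it to a reliability probability. The function names, the 0 dB threshold and the `slope` value are all hypothetical.

```python
import math

def hard_mask(snr_db, threshold_db=0.0):
    """Binary decision per time-frequency pixel: 1.0 = reliable, 0.0 = unreliable.

    Assumption: a pixel is called reliable when its local SNR estimate
    exceeds threshold_db.
    """
    return [[1.0 if s > threshold_db else 0.0 for s in frame] for frame in snr_db]

def soft_mask(snr_db, threshold_db=0.0, slope=0.5):
    """Soft estimate in (0, 1) per pixel, read as a reliability probability.

    Assumption: a sigmoid centred on threshold_db; slope (hypothetical value)
    controls how quickly the estimate saturates towards 0 or 1.
    """
    return [
        [1.0 / (1.0 + math.exp(-slope * (s - threshold_db))) for s in frame]
        for frame in snr_db
    ]

# Two frames x three frequency channels of made-up local SNR estimates (dB).
local_snr_db = [[12.0, 0.5, -9.0], [3.0, -0.5, -20.0]]
hard = hard_mask(local_snr_db)
soft = soft_mask(local_snr_db)
```

Pixels close to the threshold (0.5 dB and -0.5 dB here) land on opposite sides of the hard decision but receive soft values near 0.5, which is the kind of uncertainty a discrete mask cannot express.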
In this study, techniques for classification with missing or unreliable data are applied to the problem of noise-robustness in Automatic Speech Recognition (ASR). The techniques described make minimal assumptions about any noise background and rely instead on what is known about clean speech. A system is evaluated using the Aurora 2 connected digit …
Listeners are remarkably adept at recognising speech that has undergone extensive spectral reduction. Natural speech can be reproduced using as few as three time-varying sinusoids mimicking the corresponding speech formants. Untrained listeners are able to transcribe this 'sine-wave' speech with a high degree of reliability. Coherent phonetic percepts …
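As a rough illustration of the three-sinusoid idea described in the abstract above, the sketch below sums sinusoids whose frequencies follow given formant tracks. It is a simplification under stated assumptions, not the procedure used in the study: the formant tracks are invented linear glides, each sinusoid has a constant amplitude, and the names `sine_wave_speech` and `glide` are hypothetical.

```python
import math

def sine_wave_speech(formant_tracks, amplitudes, sample_rate=16000):
    """Sum one sinusoid per formant track.

    formant_tracks: list of per-sample frequency lists (Hz), all the same length.
    amplitudes: one constant amplitude per track (a simplification; a real
    system would typically vary amplitude over time as well).
    """
    n = len(formant_tracks[0])
    out = [0.0] * n
    for track, amp in zip(formant_tracks, amplitudes):
        phase = 0.0
        for i, freq in enumerate(track):
            # Accumulate phase so the tone stays continuous as frequency changes.
            phase += 2.0 * math.pi * freq / sample_rate
            out[i] += amp * math.sin(phase)
    return out

def glide(start_hz, end_hz, n):
    """Linear frequency glide over n samples (made-up stand-in for a formant track)."""
    return [start_hz + (end_hz - start_hz) * i / (n - 1) for i in range(n)]

n_samples = 1600  # 0.1 s at 16 kHz
tracks = [glide(300, 700, n_samples), glide(2200, 1200, n_samples), glide(2900, 2500, n_samples)]
amps = [0.5, 0.3, 0.2]
signal = sine_wave_speech(tracks, amps)
```

The resulting list can be written to a WAV file with the standard-library `wave` module after scaling to 16-bit integers.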
This study compared listeners’ performance on a multispeaker speech-in-noise task with that of a model inspired by automatic speech recognition techniques. Listeners identified three keywords in simple 6-word sentences presented in speech-shaped noise at a range of signal-to-noise ratios. Sentence material was provided by 18 male or 16 female speakers. An …
In this study we describe two techniques for handling convolutional distortion with ‘missing data’ speech recognition using spectral features. The missing data approach to automatic speech recognition (ASR) is motivated by a model of human speech perception, and involves the modification of a hidden Markov model (HMM) classifier to deal with missing or …