At a cocktail party, one can selectively attend to a single voice and filter out other acoustic interference. How to simulate this perceptual ability remains a great challenge. This paper describes a novel, supervised learning approach to speech segregation, in which a target speech signal is separated from interfering sounds using spatial …
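Supervised segregation systems of this kind are commonly trained against an "ideal binary mask" target. A minimal sketch, assuming energy spectrograms and a 0 dB local-SNR criterion (the toy numbers are illustrative, not the paper's model):

```python
# Sketch: the ideal binary mask (IBM), a common supervised target for
# speech segregation. All values here are hypothetical toy data.
import math

def ideal_binary_mask(target_tf, interference_tf, lc_db=0.0):
    """Return a binary time-frequency mask: 1 where the target's local
    SNR (in dB) exceeds the criterion lc_db, else 0."""
    mask = []
    for t_frame, n_frame in zip(target_tf, interference_tf):
        row = []
        for t_e, n_e in zip(t_frame, n_frame):
            snr_db = 10.0 * math.log10(t_e / max(n_e, 1e-12))
            row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask

# Toy 2-frame, 3-channel energy spectrograms (hypothetical numbers)
target = [[4.0, 0.1, 2.0], [0.5, 3.0, 0.01]]
noise  = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
print(ideal_binary_mask(target, noise))  # → [[1, 0, 1], [0, 1, 0]]
```

Cells where the target dominates the interference are retained (1); the rest are treated as unreliable (0).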
A multistage neural model is proposed for an auditory scene analysis task: segregating speech from interfering sound sources. The core of the model is a two-layer oscillator network that performs stream segregation on the basis of oscillatory correlation. In the oscillatory correlation framework, a stream is represented by a population of synchronized …
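The oscillatory-correlation idea can be illustrated with a toy Kuramoto model standing in for the paper's relaxation oscillators: oscillators within the same stream are coupled and synchronize, while separate streams evolve independently (the groupings and parameters below are hypothetical):

```python
# Sketch: oscillatory correlation via a toy Kuramoto network (a stand-in
# for the paper's two-layer relaxation-oscillator network).
import math

def simulate(groups, k=2.0, dt=0.01, steps=4000):
    """groups: list of lists of initial phases. Oscillators are coupled
    only within their own group (stream); all natural frequencies equal."""
    phases = [list(g) for g in groups]
    for _ in range(steps):
        for g in phases:
            n = len(g)
            upd = [g[i] + dt * (k / n) *
                   sum(math.sin(g[j] - g[i]) for j in range(n))
                   for i in range(n)]
            g[:] = upd
    return phases

def sync_level(g):
    # Circular resultant length: 1.0 means perfect phase synchrony.
    c = sum(math.cos(p) for p in g) / len(g)
    s = sum(math.sin(p) for p in g) / len(g)
    return math.hypot(c, s)

streams = simulate([[0.0, 1.5, 3.0], [0.3, 2.0]])
print([round(sync_level(g), 3) for g in streams])  # each group near 1.0
```

Within each group the coupling pulls phases together, so each "stream" ends up internally synchronized, mirroring how a stream is represented by a synchronized oscillator population.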
In this study we describe a binaural auditory model for recognition of speech in the presence of spatially separated noise intrusions, under small-room reverberation conditions. The principle underlying the model is to identify time–frequency regions which constitute reliable evidence of the speech signal. This is achieved both by determining the spatial …
This paper describes a perceptually motivated computational auditory scene analysis (CASA) system that combines sound separation according to spatial location with the "missing data" approach for robust speech recognition in noise. Missing data time-frequency masks are created using probability distributions based on estimates of interaural time and level …
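Interaural time difference (ITD) estimates of the kind that feed such probability distributions are typically obtained by cross-correlating the two ear signals; a minimal sketch with toy signals (not the paper's gammatone front end):

```python
# Sketch: ITD estimation as the lag maximizing the interaural
# cross-correlation. The signals below are hypothetical toy data.

def itd_lag(left, right, max_lag):
    """Return the lag (in samples) maximizing the cross-correlation of
    the right channel against the left, searched over [-max_lag, max_lag]."""
    best_lag, best_score = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_score, best_lag = score, lag
    return best_lag

# Right channel is the left channel delayed by 3 samples (hypothetical)
left = [0.0, 0.0, 1.0, 0.5, -0.3, 0.0, 0.0, 0.0, 0.0, 0.0]
right = [0.0] * 3 + left[:-3]
print(itd_lag(left, right, max_lag=5))  # → 3
```

In a full system this per-channel lag would be converted to an azimuth estimate and scored against learned ITD distributions to decide which time-frequency cells are reliable.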
The analysis of scenarios in which a number of microphones record the activity of speakers, such as in a round-table meeting, presents a number of computational challenges. For example, if each participant wears a microphone, speech from both the microphone's wearer (local speech) and from other participants (crosstalk) is received. The recorded audio can …
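A crude illustration of the local-speech/crosstalk distinction: compare per-frame energies across two channels, on the assumption that a wearer's own voice dominates their own microphone (toy frames, not the paper's method):

```python
# Sketch: energy-ratio crosstalk labelling for a two-microphone meeting.
# Frames and the dominance ratio are hypothetical.

def frame_energy(frame):
    return sum(s * s for s in frame)

def label_frames(ch_a, ch_b, ratio=2.0):
    """Per frame: 'A' if channel A's energy exceeds B's by `ratio`,
    'B' for the converse, else 'overlap'."""
    labels = []
    for fa, fb in zip(ch_a, ch_b):
        ea, eb = frame_energy(fa), frame_energy(fb)
        if ea > ratio * eb:
            labels.append("A")
        elif eb > ratio * ea:
            labels.append("B")
        else:
            labels.append("overlap")
    return labels

# Hypothetical frames: speaker A talks, then B, then both at once
ch_a = [[0.9, 0.8], [0.1, 0.1], [0.7, 0.6]]
ch_b = [[0.2, 0.1], [0.8, 0.9], [0.6, 0.7]]
print(label_frames(ch_a, ch_b))  # → ['A', 'B', 'overlap']
```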
In this study we describe two techniques for handling convolutional distortion with 'missing data' speech recognition using spectral features. The missing data approach to automatic speech recognition (ASR) is motivated by a model of human speech perception, and involves the modification of a hidden Markov model (HMM) classifier to deal with missing or …
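The marginalization variant of missing-data recognition can be sketched in a few lines: spectral channels flagged unreliable are simply omitted from a diagonal-covariance Gaussian state likelihood (the parameters below are toy values, not trained models):

```python
# Sketch: marginal log-likelihood for missing-data ASR with a
# diagonal-covariance Gaussian state model. Toy parameters only.
import math

def marginal_loglik(obs, mask, means, variances):
    """Log-likelihood of one frame, summing Gaussian terms only over
    channels marked reliable (mask[d] == 1); unreliable channels are
    marginalized out (dropped)."""
    ll = 0.0
    for x, m, mu, var in zip(obs, mask, means, variances):
        if m:  # reliable channel: include its Gaussian term
            ll += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return ll

obs  = [1.0, 9.0, 2.0]      # channel 1 is noise-corrupted
mask = [1, 0, 1]            # ...and flagged unreliable
means, variances = [1.0, 2.0, 2.0], [1.0, 1.0, 1.0]
print(round(marginal_loglik(obs, mask, means, variances), 3))  # ≈ -1.838
```

Because the corrupted channel contributes nothing, the state likelihood reflects only the evidence judged reliable, which is what makes the HMM robust to masked spectral regions.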
A challenging problem for research in computational auditory scene analysis is the integration of evidence derived from multiple grouping principles. We describe a computational model which addresses this issue through the use of a 'blackboard' architecture. The model integrates evidence from multiple grouping principles at several levels of abstraction, …
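The blackboard idea can be caricatured in a few lines: independent grouping "experts" post scores for shared hypotheses, and the blackboard accumulates them (the expert names and scores below are hypothetical, not the paper's knowledge sources):

```python
# Sketch: evidence integration on a shared blackboard. Hypothesis names
# and scores are invented for illustration.

class Blackboard:
    def __init__(self):
        self.evidence = {}  # hypothesis -> accumulated score

    def post(self, hypothesis, score):
        self.evidence[hypothesis] = self.evidence.get(hypothesis, 0.0) + score

    def best(self):
        return max(self.evidence, key=self.evidence.get)

bb = Blackboard()
# Each grouping principle independently scores the same hypotheses
bb.post("group_A_and_B", 0.6)   # harmonicity expert
bb.post("group_A_and_B", 0.3)   # common-onset expert
bb.post("keep_separate", 0.5)   # spatial-location expert
print(bb.best())  # → group_A_and_B (0.9 vs 0.5)
```

The point of the architecture is that no single grouping principle decides the outcome; evidence from all of them is combined on the shared data structure before a grouping is committed.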
The neural mechanisms underlying the ability of human listeners to recognize speech in the presence of background noise are still imperfectly understood. However, there is mounting evidence that the medial olivocochlear system plays an important role, via efferents that exert a suppressive effect on the response of the basilar membrane. The current paper …