16 Separation of Speech by Computational Auditory Scene Analysis

Abstract

The term auditory scene analysis (ASA) refers to the ability of human listeners to form perceptual representations of the constituent sources in an acoustic mixture, as in the well-known ‘cocktail party’ effect. Accordingly, computational auditory scene analysis (CASA) is the field of study that attempts to replicate ASA in machines. Some CASA systems are closely modelled on the known stages of auditory processing, whereas others adopt a more functional approach. However, all are broadly based on the principles underlying the perception and organisation of sound by human listeners, and in this respect they differ from independent component analysis (ICA) and other approaches to sound separation. In this paper, we review the principles underlying ASA and show how they can be implemented in CASA systems. We also consider the link between CASA and automatic speech recognition, and draw distinctions between the CASA and ICA approaches.

7 Figures and Tables

Cite this paper

@inproceedings{Brown200516SO, title={16 Separation of Speech by Computational Auditory Scene Analysis}, author={Guy J. Brown and DeLiang Wang}, year={2005} }