Prediction-driven computational auditory scene analysis

@phdthesis{Ellis1996PredictiondrivenCA,
  title={Prediction-driven computational auditory scene analysis},
  author={Daniel P. W. Ellis},
  school={Massachusetts Institute of Technology},
  year={1996}
}
The sound of a busy environment, such as a city street, gives rise to a perception of numerous distinct events in a human listener--the 'auditory scene analysis' of the acoustic information. Recent advances in the understanding of this process from experimental psychoacoustics have led to several efforts to build a computer model capable of the same function. This work is known as 'computational auditory scene analysis'. The dominant approach to this problem has been as a sequence of modules… 
A computational model of auditory selective attention
TLDR
This paper presents a conceptual framework for auditory selective attention in which the formation of groups and streams is heavily influenced by conscious and subconscious attention, and is implemented as a computational model comprising a network of neural oscillators, which perform stream segregation on the basis of oscillatory correlation.
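The oscillatory-correlation idea can be illustrated with a generic phase-oscillator toy (a Kuramoto-style sketch under illustrative assumptions, not the paper's actual model): one oscillator per frequency channel, attractive coupling between channels with nearby frequencies and weak repulsive coupling otherwise, so that channels belonging to the same stream phase-lock while different streams end up anti-phase.

```python
import math

# Sketch of stream segregation by oscillatory correlation (illustrative,
# not the paper's network): phases of channels in the same stream
# synchronize; phases of different streams are pushed toward anti-phase.
def segregate(freqs, steps=2000, dt=0.01, near=2.0):
    phases = [0.1 * i for i in range(len(freqs))]  # distinct initial phases
    for _ in range(steps):
        new = []
        for i, pi in enumerate(phases):
            drive = 0.0
            for j, pj in enumerate(phases):
                if i == j:
                    continue
                # attract near-frequency channels, repel distant ones
                k = 1.0 if abs(freqs[i] - freqs[j]) < near else -0.5
                drive += k * math.sin(pj - pi)
            new.append(pi + dt * drive)  # forward-Euler phase update
        phases = new
    return [p % (2 * math.pi) for p in phases]

# Two clusters of channels (e.g. a low stream and a high stream):
phases = segregate([100, 101, 500, 501])
```

After the run, channels 0 and 1 share a phase, channels 2 and 3 share a phase, and the two pairs sit roughly half a cycle apart: the synchrony pattern itself is the stream assignment.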
A Theory and Computational Model of Auditory Selective Attention
TLDR
A conceptual framework for auditory selective attention is presented in which the formation of groups and streams is heavily influenced by conscious and subconscious attention and the model prediction that attention is subject to a form of ‘reset’ when the attentional focus is moved in space is supported.
Sound-source recognition: a theory and computational model
TLDR
A computer model of the recognition process is developed that is capable of “listening” to a recording of a musical instrument and classifying the instrument as one of 25 possibilities, based on current models of signal processing in the human auditory system.
Separation of Speech by Computational Auditory Scene Analysis
TLDR
This chapter reviews the principles underlying ASA and shows how they can be implemented in CASA systems, considers the link between CASA and automatic speech recognition, and draws distinctions between the CASA and ICA approaches.
PREDICTION-DRIVEN COMPUTATIONAL AUDITORY SCENE ANALYSIS FOR DENSE SOUND MIXTURES
TLDR
This work proposes a prediction-driven approach to audition, which raises numerous issues, including the need to represent any kind of sound and to handle multiple competing hypotheses; results from an implementation illustrate its ability to analyze complex, ambient sound scenes that would confound previous systems.
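The prediction-driven loop can be sketched in a few lines (a hypothetical toy, not the thesis's architecture): a world model of sound "elements" predicts the next spectral frame, the prediction is compared with the observation, and unexplained residual energy spawns a new element. The function names and threshold are illustrative assumptions.

```python
# Toy prediction-driven analysis loop (illustrative names and threshold):
# elements are spectra; where observation exceeds the model's prediction,
# the residual becomes a new element that "explains" the extra energy.

def predict(elements, n_bins):
    """Predicted frame = sum of all active elements' spectra."""
    frame = [0.0] * n_bins
    for spectrum in elements:
        for b, energy in enumerate(spectrum):
            frame[b] += energy
    return frame

def update(elements, observed, threshold=0.5):
    """Compare observation with prediction; spawn an element if the
    unexplained (positive) residual is large enough."""
    predicted = predict(elements, len(observed))
    residual = [max(0.0, o - p) for o, p in zip(observed, predicted)]
    if sum(residual) > threshold:
        elements.append(residual)  # hypothesize a new sound element
    return elements, residual

elements = []
elements, r = update(elements, [1.0, 0.0, 0.0])  # new energy -> new element
elements, r = update(elements, [1.0, 0.0, 0.0])  # fully predicted now
```

The second frame is fully accounted for by the element created on the first, so no new hypothesis is spawned; real systems additionally prune and revise competing hypotheses rather than only adding them.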
Computational auditory scene induction
TLDR
This thesis focuses on the computational realization of auditory scene induction — Computational Auditory Scene Induction (CASI), and aims to build computer models that are capable of resynthesizing the missing information of an audio scene.
TOWARD AUTOMATIC SOUND SOURCE RECOGNITION: IDENTIFYING MUSICAL INSTRUMENTS
One of the broad goals of research in computational auditory scene analysis (CASA) is to create computer systems that can learn to recognize sound sources in a complex auditory environment. In this
Disambiguating Sounds through Context
TLDR
It is shown that the use of knowledge in a dynamic network model can improve automatic sound identification, by reducing the search space of the low-level audio features.

References

SHOWING 1-10 OF 148 REFERENCES
Computational auditory scene analysis: a representational approach
This thesis addresses the problem of how a listener groups together acoustic components which have arisen from the same environmental event, a phenomenon known as auditory scene analysis. A
Residue-Driven Architecture for Computational Auditory Scene Analysis
TLDR
The Residue-Driven Architecture presented here is a model of auditory stream segregation that can segregate a woman's voiced stream, a man's voiced stream, and a noise stream from a mixture of these sounds.
A PERCEPTUAL REPRESENTATION OF AUDIO
TLDR
A transformation of sound into a representation with various properties specifically oriented towards simulations of source separation is described, along with how the principles of source separation will be applied as the next step towards a fully functional source separator.
Hierarchic models of hearing for sound separation and reconstruction
  • D. Ellis
  • Proceedings of IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
  • 1993
In building a machine to detect and segregate individual components in sound mixtures, the best example to copy is the human auditory system. Several models of auditory organization implement various
A theory and computational model of auditory monaural sound separation
TLDR
A conceptual theory of how the auditory system uses monaural acoustic information to separate two simultaneous talkers is presented, together with a computer model based on this theory; a method for using this information to separate sounds is hypothesized.
Event formation and separation in musical sound
This thesis reviews psychoacoustic and neurophysiological studies that show how the human auditory system is capable of hearing out one source of sound from the mixture of sounds that reaches the
On the importance of time—a temporal representation of sound
TLDR
It is shown how the nonlinearity of the auditory system breaks this equivalence and is especially important in analyzing complex sounds from multiple sources of different characteristics, as well as in extracting and representing temporal structure for both periodic and non-periodic signals.
Auditory streaming and the building of timbre.
TLDR
These experiments demonstrated that fusion and sequential organization of streams are carried out using two sorts of information which compete to determine the best perceptual description of the input.
Adequacy of auditory models to predict human internal representation of speech sounds.
  • O. Ghitza
  • The Journal of the Acoustical Society of America
  • 1993
TLDR
A diagnostic system has been developed that simulates the psychophysical procedure used in the standard Diagnostic-Rhyme Test (DRT) and provides detailed diagnostics that show the error distributions among six phonetically distinctive features.
A computer implementation of psychoacoustic grouping rules
  • D. Ellis
  • Proceedings of the 12th IAPR International Conference on Pattern Recognition, Vol. 2 - Conference B: Computer Vision & Image Processing. (Cat. No.94CH3440-5)
  • 1994
TLDR
This work describes an implementation of grouping rules corresponding to the psychoacoustic cues of harmonicity, common onset, continuity, and proximity, and increases the system's robustness by adding a second layer of grouping that looks for corroboration between primary groupings.
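Two of these cues, common onset and harmonicity, can be sketched as a greedy grouping over sinusoidal components (a toy data model assumed here: each component is a (frequency, onset) pair; the tolerances and the greedy strategy are illustrative, not the paper's implementation).

```python
# Toy grouping by two psychoacoustic cues (illustrative, not the paper's
# system): components that share an onset (common onset) and fall on
# integer multiples of a group's lowest frequency (harmonicity) are
# grouped together.

def is_harmonic(freq, f0, tol=0.03):
    """True if freq is within tol (relative) of an integer multiple of f0."""
    n = max(1, round(freq / f0))
    return abs(freq - n * f0) / (n * f0) <= tol

def group_components(components, onset_tol=0.02):
    """Greedy grouping: a component joins the first group whose onset it
    shares and whose lowest-frequency member it is harmonically related to."""
    groups = []
    for freq, onset in sorted(components):  # lowest frequency first
        for g in groups:
            f0, g_onset = g[0]  # lowest-frequency member anchors the group
            if abs(onset - g_onset) <= onset_tol and is_harmonic(freq, f0):
                g.append((freq, onset))
                break
        else:
            groups.append([(freq, onset)])
    return groups

# Two overlapping harmonic series with different onsets separate cleanly:
mix = [(100, 0.0), (200, 0.0), (300, 0.0),   # 100 Hz series at t = 0.0 s
       (140, 0.5), (280, 0.5), (420, 0.5)]   # 140 Hz series at t = 0.5 s
groups = group_components(mix)
```

A second, corroborating layer as described above would then check whether independent cues (e.g. continuity and proximity across frames) agree with these primary groupings before committing to them.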