Acoustic Scene Classification: Classifying environments from the sounds they produce

@article{Barchiesi2015AcousticSC,
  title={Acoustic Scene Classification: Classifying environments from the sounds they produce},
  author={Daniele Barchiesi and Dimitrios Giannoulis and Dan Stowell and Mark D. Plumbley},
  journal={IEEE Signal Processing Magazine},
  year={2015},
  volume={32},
  pages={16--34}
}
In this article, we present an account of the state of the art in acoustic scene classification (ASC), the task of classifying environments from the sounds they produce. Starting from a historical review of previous research in this area, we define a general framework for ASC and present different implementations of its components. We then describe a range of different algorithms submitted for a data challenge that was held to provide a general and fair benchmark for ASC techniques. The data… 


ACOUSTIC SCENE CLASSIFICATION: A COMPETITION REVIEW

TLDR
The methods and results of a competition organized in the context of a graduate machine learning course are described, and the competition's importance in the curriculum is justified based on student feedback.

ACOUSTIC SCENE CLASSIFICATION USING SPATIAL FEATURES

TLDR
Preliminary analysis of EigenScape, a new dataset of 4th-order Ambisonic acoustic scene recordings, suggests that certain scenes that are spectrally similar might not necessarily be spatially similar.

Reverberation-based feature extraction for acoustic scene classification

TLDR
A strong low-complexity baseline system using a compact feature set is improved with a novel class of audio features that exploit knowledge of sound behaviour within the scene, namely reverberation, which increases the classification accuracy.

ESC: Dataset for Environmental Sound Classification

TLDR
A new annotated collection of 2000 short clips comprising 50 classes of various common sound events, and an abundant unified compilation of 250000 unlabeled auditory excerpts extracted from recordings available through the Freesound project are presented.

Detection and Classification of Acoustic Scenes and Events

TLDR
The state of the art in automatically classifying audio scenes, and in automatically detecting and classifying audio events, is reported on.

AUDITORY SCENE CLASSIFICATION BASED ON THE SPECTRO-TEMPORAL STRUCTURE ANALYSIS

TLDR
This proposed system contains four modules to compute the representations describing spectro-temporal properties of audio data, which are derived from cochleagram and low-level audio feature contours.

Acoustic environment classification using discrete hartley transform features

  • H. Jleed, M. Bouchard
  • Computer Science
    2017 IEEE 30th Canadian Conference on Electrical and Computer Engineering (CCECE)
  • 2017
TLDR
Experiments show that the proposed method is competitive compared to other recently proposed methods, and that the use of the discrete Hartley transform improves the classification performance.
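The discrete Hartley transform behind these features can be sketched in a few lines of NumPy. This is only an illustration of the transform itself, not of the paper's full feature-extraction pipeline; the direct and FFT-based routes are compared to confirm they agree:

```python
import numpy as np

def dht(x):
    """Discrete Hartley transform via the cas kernel:
    H[k] = sum_n x[n] * (cos(2*pi*k*n/N) + sin(2*pi*k*n/N))."""
    N = len(x)
    n = np.arange(N)
    cas = np.cos(2 * np.pi * np.outer(n, n) / N) + np.sin(2 * np.pi * np.outer(n, n) / N)
    return cas @ x

def dht_fft(x):
    """Same transform from a complex FFT: Re(X) carries the cosine sum
    and -Im(X) the sine sum, so H = Re(X) - Im(X)."""
    X = np.fft.fft(x)
    return X.real - X.imag

x = np.random.default_rng(0).standard_normal(64)
assert np.allclose(dht(x), dht_fft(x))
```

Unlike the DFT, the output is real-valued, and the transform is its own inverse up to a factor of N, which is part of its appeal as a cheap spectral feature front end.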

EigenScape : A Database of Spatial Acoustic Scene Recordings

TLDR
EigenScape is introduced, a new database of fourth-order Ambisonic recordings of eight different acoustic scene classes that validate the new database and show that spatial features can characterise acoustic scenes and as such are worthy of further investigation.

Label Tree Embeddings for Acoustic Scene Classification

TLDR
An efficient approach for acoustic scene classification is presented that exploits the structure of class labels by collectively optimizing a clustering of the labels into multiple meta-classes in a tree structure.

A Review of Deep Learning Based Methods for Acoustic Scene Classification

TLDR
This article summarizes and groups existing approaches for data preparation, i.e., feature representations, feature pre-processing, and data augmentation, and for data modeling, i.
...

References

SHOWING 1-10 OF 58 REFERENCES

A database and challenge for acoustic scene classification and event detection

TLDR
This paper introduces a newly-launched public evaluation challenge dealing with two closely related tasks of the field: acoustic scene classification and event detection.

Computational auditory scene recognition

TLDR
This paper addresses the problem of computational auditory scene recognition and describes methods to classify auditory scenes into predefined classes using band-energy ratio features with 1-NN classifier and Mel-frequency cepstral coefficients with Gaussian mixture models.
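The band-energy-ratio-plus-1-NN pipeline mentioned above can be sketched roughly as follows. The band count, frame length, and toy signals are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

def band_energy_ratios(frame, n_bands=8):
    """Split the magnitude spectrum into equal-width bands and
    return each band's energy as a fraction of the total energy."""
    spec = np.abs(np.fft.rfft(frame)) ** 2
    bands = np.array_split(spec, n_bands)
    energies = np.array([b.sum() for b in bands])
    return energies / energies.sum()

def nn1_classify(query, train_feats, train_labels):
    """1-nearest-neighbour: return the label of the closest training vector."""
    dists = np.linalg.norm(train_feats - query, axis=1)
    return train_labels[int(np.argmin(dists))]

rng = np.random.default_rng(0)
sr, n = 8000, 1024
t = np.arange(n) / sr

def tone(f):
    # Noisy sinusoid standing in for one frame of a recorded scene.
    return np.sin(2 * np.pi * f * t) + 0.1 * rng.standard_normal(n)

# Two toy "scenes": a low-frequency hum vs. a high-frequency whine.
train = np.array([band_energy_ratios(tone(f)) for f in (200, 220, 3000, 3100)])
labels = np.array(["hum", "hum", "whine", "whine"])
print(nn1_classify(band_energy_ratios(tone(210)), train, labels))  # "hum"
```

The MFCC + GMM route from the same paper follows the identical pattern, only with cepstral features per frame and a per-class Gaussian mixture scored by log-likelihood instead of a nearest-neighbour rule.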

Unsupervised Learning of Acoustic Unit Descriptors for Audio Content Representation and Classification

TLDR
This paper uses audio from multi-class YouTube-quality multimedia data to converge on a set of sound units, such that each audio file is represented as a sequence of these units, and tries to learn category language models over sequences of acoustic units.

Audio-based context awareness - acoustic modeling and perceptual evaluation

  • A. Eronen, J. Tuomi, J. Huopaniemi
  • Computer Science
    2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
  • 2003
TLDR
Direct comparison to human performance indicates that the system performs somewhat worse than human subjects do in the recognition of 18 everyday contexts and almost comparably in recognizing six higher level categories.

Water sound recognition based on physical models

TLDR
This article describes an audio signal processing algorithm to detect water sounds, built in the context of a larger system aiming to monitor daily activities of elderly people, based on a physical model of air bubble acoustics.
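A standard result from bubble acoustics that such physical models build on is the Minnaert resonance. A minimal sketch, assuming adiabatic air bubbles in water at atmospheric pressure (the paper's actual detector is not reproduced here):

```python
import math

def minnaert_frequency(radius_m, gamma=1.4, p0=101_325.0, rho=1000.0):
    """Resonant frequency (Hz) of an air bubble of the given radius in water:
    f0 = (1 / (2*pi*a)) * sqrt(3 * gamma * p0 / rho)."""
    return math.sqrt(3 * gamma * p0 / rho) / (2 * math.pi * radius_m)

# A 1 mm bubble rings at roughly 3.3 kHz, squarely in the audible range,
# which is why dripping and pouring water has such a characteristic sound.
print(round(minnaert_frequency(1e-3)))
```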

Automatic Recognition of Urban Sound Sources

The goal of the FDAI project is to create a general system that computes an efficient representation of the acoustic environment. More precisely, FDAI has to compute a noise disturbance indicator

Characterisation of acoustic scenes using a temporally-constrained shift-invariant model

TLDR
Results show that the proposed model is able to capture salient events within a scene and outperforms the non-negative matrix factorization algorithm on the same task; the use of temporal constraints is demonstrated to lead to improved performance.
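The NMF baseline that the temporally-constrained model is compared against can be sketched with scikit-learn. The spectrogram below is a synthetic rank-2 toy, and the component count is an assumption for illustration:

```python
import numpy as np
from sklearn.decomposition import NMF

# Toy magnitude "spectrogram": two spectral templates with varying gains,
# standing in for two recurring acoustic events within a scene.
rng = np.random.default_rng(0)
templates = np.abs(rng.standard_normal((64, 2)))     # frequency x components
activations = np.abs(rng.standard_normal((2, 100)))  # components x time
V = templates @ activations                          # frequency x time

# Plain NMF factorises V ~= W @ H with no temporal constraints on H;
# the paper's model additionally constrains how activations evolve over time.
model = NMF(n_components=2, init="nndsvd", max_iter=500, random_state=0)
W = model.fit_transform(V)   # learned spectral templates
H = model.components_        # learned activations over time
print(np.linalg.norm(V - W @ H) / np.linalg.norm(V))  # small relative error
```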

Prediction-driven computational auditory scene analysis

TLDR
A blackboard-based implementation of the 'prediction-driven' approach is described which analyzes dense, ambient sound examples into a vocabulary of noise clouds, transient clicks, and a correlogram-based representation of wide-band periodic energy called the weft.

Environmental Sound Recognition With Time–Frequency Audio Features

TLDR
An empirical feature analysis for audio environment characterization is performed, and a matching pursuit algorithm is proposed to obtain effective time-frequency features that yield higher recognition accuracy for environmental sounds.

Audio-based context recognition

TLDR
This paper investigates the feasibility of an audio-based context recognition system, which is developed and compared to the accuracy of human listeners on the same task, with particular emphasis on the computational complexity of the methods.
...