ESC: Dataset for Environmental Sound Classification

@inproceedings{Piczak2015ESCDF,
  title={ESC: Dataset for Environmental Sound Classification},
  author={Karol J. Piczak},
  booktitle={Proceedings of the 23rd ACM international conference on Multimedia},
  year={2015}
}
  • Karol J. Piczak
  • Published 13 October 2015
  • Computer Science
  • Proceedings of the 23rd ACM international conference on Multimedia
One of the obstacles in research activities concentrating on environmental sound classification is the scarcity of suitable and publicly available datasets. This paper tries to address that issue by presenting a new annotated collection of 2000 short clips comprising 50 classes of various common sound events, and an abundant unified compilation of 250000 unlabeled auditory excerpts extracted from recordings available through the Freesound project. The paper also provides an evaluation of human… 

Generalisation in Environmental Sound Classification: The ‘Making Sense of Sounds’ Data Set and Challenge
TLDR
A baseline classification system, a deep convolutional network, is introduced; it showed strong average accuracy on the evaluation data, which is discussed in the light of two alternative explanations: an unlikely accidental category bias in the sound recordings, or a more plausible true acoustic grounding of the high-level categories.
Environmental sound classification with convolutional neural networks
  • Karol J. Piczak
  • Computer Science
    2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP)
  • 2015
TLDR
The model outperforms baseline implementations relying on mel-frequency cepstral coefficients and achieves results comparable to other state-of-the-art approaches.
CHALLENGES AND ISSUES OF SOUND ARCHIVES FOR ENVIRONMENTAL SOUND CLASSIFICATION
TLDR
Research efforts to identify and classify these sounds using a range of techniques are discussed; such work may help in fields such as hearing impairment treatment, crime prevention, forensic science, and humanoid robotics.
CnnSound: Convolutional Neural Networks for the Classification of Environmental Sounds
TLDR
150 different CNN-based models were designed by varying the number of layers and the tuning parameters used in those layers, with the aim of developing a more robust convolutional neural network (CNN) architecture; the obtained accuracy was found to be satisfactory when both accuracy and computational complexity are considered.
Deep Convolutional Neural Network with Mixup for Environmental Sound Classification
TLDR
A novel deep convolutional neural network is proposed for environmental sound classification (ESC) tasks; it uses stacked convolutional and pooling layers to extract high-level feature representations from spectrogram-like features.
An Ensemble of Convolutional Neural Networks for Audio Classification
TLDR
This work has managed to create an off-the-shelf ensemble that can be trained on different datasets and reach performances competitive with the state of the art in audio classification.

References

SHOWING 1-10 OF 19 REFERENCES
A Dataset and Taxonomy for Urban Sound Research
TLDR
A taxonomy of urban sounds and a new dataset, UrbanSound, containing 27 hours of audio with 18.5 hours of annotated sound event occurrences across 10 sound classes are presented.
Acoustic Scene Classification: Classifying environments from the sounds they produce
TLDR
An account of the state of the art in acoustic scene classification (ASC), the task of classifying environments from the sounds they produce, and a range of different algorithms submitted for a data challenge to provide a general and fair benchmark for ASC techniques.
Environmental sound classification with convolutional neural networks
  • Karol J. Piczak
  • Computer Science
    2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP)
  • 2015
TLDR
The model outperforms baseline implementations relying on mel-frequency cepstral coefficients and achieves results comparable to other state-of-the-art approaches.
Sound representation and classification benchmark for domestic robots
TLDR
The problem of sound representation and classification is addressed through a comparative study in the context of a domestic robotic scenario, in which methods are quantitatively assessed on the basis of their classification scores, computation times and memory requirements.
A Bag-of-Features approach to acoustic event detection
TLDR
A novel approach for classifying acoustic events, based on Bag-of-Features, is proposed, where mel and gammatone frequency cepstral coefficients that originate from psychoacoustic models are used as input features for the Bag-of-Features representation.
Detection and classification of acoustic scenes and events: An IEEE AASP challenge
TLDR
An overview of systems submitted to the public evaluation challenge on acoustic scene classification and detection of sound events within a scene as well as a detailed evaluation of the results achieved by those systems are provided.
Acoustic Scene Classification
TLDR
An account of the state-of-the-art in acoustic scene classification (ASC), the task of classifying environments from the sounds they produce, and a range of different algorithms submitted for a data challenge to provide a general and fair benchmark for ASC techniques.
Environmental sound recognition: A survey
TLDR
This survey offers a qualitative and elucidatory overview of recent developments in environmental sound recognition (ESR) and comprises three parts: i) basic environmental sound processing schemes, ii) stationary ESR techniques and iii) non-stationary ESR techniques.
Content-based Retrieval of Environmental Sounds by Multiresolution Analysis
TLDR
Experimental results show that the approach always outperforms a method based on traditional MFCC features and Euclidean distance, improving retrieval rates from 51% to 62%; the similarity measure is defined in terms of the Kullback-Leibler divergence between GGDs.
An Open Dataset for Research on Audio Field Recording Archives: freefield1010
TLDR
A free and open dataset of 7690 audio clips sampled from the field-recording tag in the Freesound audio archive is introduced; the paper describes the data preparation process, characterises the dataset descriptively, and illustrates its use through an auto-tagging experiment.