• Corpus ID: 50785886

A multi-device dataset for urban acoustic scene classification

@article{Mesaros2018AMD,
  title={A multi-device dataset for urban acoustic scene classification},
  author={Annamaria Mesaros and Toni Heittola and Tuomas Virtanen},
  journal={ArXiv},
  year={2018},
  volume={abs/1807.09840}
}
This paper introduces the acoustic scene classification task of DCASE 2018 Challenge and the TUT Urban Acoustic Scenes 2018 dataset provided for the task, and evaluates the performance of a baseline system in the task. As in previous years of the challenge, the task is defined for classification of short audio samples into one of predefined acoustic scene classes, using a supervised, closed-set classification setup. The newly recorded TUT Urban Acoustic Scenes 2018 dataset consists of ten… 

Detection and Classification of Acoustic Scenes and Events 2019 Challenge ACOUSTIC SCENE CLASSIFICATION USING ATTENTION-BASED CONVOLUTIONAL NEURAL NETWORK Technical Report
TLDR
Mel-spectrograms are used as audio features and deep convolutional neural networks (CNNs) as classifiers for acoustic scenes; the best model achieves a classification accuracy of around 70.7% on the development dataset.
CROSS-TASK PRE-TRAINING FOR ACOUSTIC SCENE CLASSIFICATION
  • Wei Zou
  • Computer Science, Physics
  • 2019
TLDR
This work explored a cross-task pre-training mechanism that uses acoustic event information extracted from the pre-trained model to optimize the ASC task, and showed that the cross-task pre-training mechanism can significantly improve the performance of ASC tasks.
Low-Complexity Acoustic Scene Classification for Multi-Device Audio: Analysis of DCASE 2021 Challenge Systems
TLDR
The acoustic scene classification task remained a popular task in the challenge, despite the increasing difficulty of the setup, and the most used techniques among the submissions were residual networks and weight quantization.
Detection and Classification of Acoustic Scenes and Events 2019 Challenge URBAN ACOUSTIC SCENE CLASSIFICATION USING RAW WAVEFORM CONVOLUTIONAL NEURAL NETWORKS Technical Report
TLDR
The signal processing framework and the results obtained with the development dataset of the proposed RW-CNN for the detection and classification of acoustic scenes and events (DCASE 2019) challenge are presented.
ACOUSTIC SCENE CLASSIFICATION USING CNN ENSEMBLES AND PRIMARY AMBIENT EXTRACTION Technical Report
TLDR
This report describes the submission for Task 1a (acoustic scene classification) of the DCASE 2019 challenge. The method continues to build on ensembles of CNNs, while primary ambient extraction is newly introduced to decompose a binaural audio sample into four channels using spatial information.
WAVELET-BASED AUDIO FEATURES FOR ACOUSTIC SCENE CLASSIFICATION Technical Report
TLDR
Two wavelet-based features in a score-fusion framework are found to be complementary, so that the fused system outperforms the deep-learning-based baseline system on the development dataset provided for the respective sub-tasks.
Domain Adaptation Neural Network for Acoustic Scene Classification in Mismatched Conditions
TLDR
The proposed DANN-based acoustic scene classification method is evaluated on subtask B of Task 1 of the DCASE 2019 ASC challenge, which is a closed-set classification problem whose audio recordings were made with mismatched devices.
Cross-task pre-training for on-device acoustic scene classification.
TLDR
This paper presents the cross-task pre-training mechanism which utilizes acoustic event information from the pre-trained AED model for ASC tasks to improve the performance of ASC tasks.

References

A convolutional neural network approach for acoustic scene classification
TLDR
This paper proposes the use of a CNN trained to classify short sequences of audio, represented by their log-mel spectrogram, and introduces a training method that can be used under particular circumstances in order to make full use of small datasets.
Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge
TLDR
The emergence of deep learning as the most popular classification method is observed, replacing the traditional approaches based on Gaussian mixture models and support vector machines.
Acoustic Scene Classification: An Overview of Dcase 2017 Challenge Entries
TLDR
Analysis of the submissions confirms once more the popularity of deep-learning approaches and mel frequency representations in acoustic scene classification, and indicates that combinations of top systems are capable of reaching close to perfect performance on the given data.
Acoustic Scene Classification: Classifying environments from the sounds they produce
TLDR
An account of the state of the art in acoustic scene classification (ASC), the task of classifying environments from the sounds they produce, and a range of different algorithms submitted for a data challenge to provide a general and fair benchmark for ASC techniques.
On the use of spectro-temporal features for the IEEE AASP challenge ‘detection and classification of acoustic scenes and events’
TLDR
It is demonstrated that the proposed spectro-temporal features achieve a better recognition accuracy than MFCCs.
The bag-of-frames approach to audio pattern recognition: a sufficient model for urban soundscapes but not for polyphonic music.
TLDR
This paper proposes to explicitly examine the difference between urban soundscapes and polyphonic music with respect to their modeling with the BOF approach, and reveals critical differences in the temporal and statistical structure of the typical frame distribution of each type of signal.
Panako - A Scalable Acoustic Fingerprinting System Handling Time-Scale and Pitch Modification
TLDR
A scalable granular acoustic fingerprinting system robust against time and pitch scale modification is presented, designed to be robust against pitch shifting, time stretching and tempo changes, while remaining scalable.
Context awareness using environmental noise classification
TLDR
The approach for automatically sensing and recognising noise from typical environments of daily life, such as office, car and city street, is described and the hidden Markov model based noise classifier is presented.
Histogram of Gradients of Time–Frequency Representations for Audio Scene Classification
TLDR
The approach for classifying acoustic scenes is based on transforming the audio signal into a time-frequency representation and then extracting relevant features about the shapes and evolutions of time-frequency structures, based on histograms of gradients, which are subsequently fed to a multi-class linear support vector machine.
Adam: A Method for Stochastic Optimization
TLDR
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.