Cure Dataset: Ladder Networks for Audio Event Classification

Harishchandra Dubey, Dimitra Emmanouilidou, Ivan Tashev
2019 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM)
Audio event classification is an important task for several applications, such as surveillance and audio, video, and multimedia retrieval. There are approximately 340 million people with hearing loss who cannot perceive events happening around them. This paper establishes the CURE dataset, which contains a curated set of specific audio events most relevant to people with hearing loss. It is formatted as 5-second sound recordings derived from the Freesound project. We propose a ladder network based…

Sentiment analysis using semi-supervised learning with few labeled data
This work presents an approach for leveraging contextual features from unlabeled movie and restaurant reviews with the Ladder network, a neural-network-based learning model, and shows that it outperforms baseline models including LSTM and SVM.
Soft-Median Choice: An Automatic Feature Smoothing Method for Sound Event Detection
A novel automatic feature smoothing algorithm based on Soft-Median Choice is proposed; it obtains significantly better scores than the reference algorithms.


Knowledge Transfer from Weakly Labeled Audio Using Convolutional Neural Network for Sound Events and Scenes
This work describes a convolutional neural network (CNN) based framework for sound event detection and classification using weakly labeled audio data and proposes methods to learn representations using this model which can be effectively used for solving the target task.
DCASE 2018 Challenge baseline with convolutional neural networks
A Python implementation of baselines for the five DCASE 2018 tasks: 1) acoustic scene classification, 2) general-purpose audio tagging, 3) bird audio detection, 4) weakly-labeled semi-supervised sound event detection, and 5) multi-channel audio tagging. The baseline source code contains implementations of convolutional neural networks, including AlexNetish and VGGish, networks originating from computer vision.
Audio Event Detection using Weakly Labeled Data
It is shown that audio event detection using weak labels can be formulated as a multiple-instance learning problem, and two frameworks for solving multiple-instance learning are suggested: one based on support vector machines and the other on neural networks.
Audio Set: An ontology and human-labeled dataset for audio events
The creation of Audio Set is described, a large-scale dataset of manually-annotated audio events that endeavors to bridge the gap in data availability between image and audio research and substantially stimulate the development of high-performance audio event recognizers.
Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge
The emergence of deep learning as the most popular classification method is observed, replacing the traditional approaches based on Gaussian mixture models and support vector machines.
Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection
This work combines convolutional and recurrent neural networks in a convolutional recurrent neural network (CRNN), applies it to a polyphonic sound event detection task, and observes a considerable improvement on four different datasets consisting of everyday sound events.
TUT database for acoustic scene classification and sound event detection
The recording and annotation procedure, the database content, a recommended cross-validation setup, and the performance of a supervised acoustic scene classification system and an event detection baseline system using mel-frequency cepstral coefficients and Gaussian mixture models are presented.
DCASE2017 Challenge Setup: Tasks, Datasets and Baseline System
This paper presents the setup of the DCASE 2017 challenge tasks: task definition, dataset, experimental setup, and baseline system results on the development dataset.
DNN-Based Audio Scene Classification for DCASE2017: Dual Input Features, Balancing Cost, and Stochastic Data Duplication
A new fine-tuning cost that solves the drawback of dual input features was developed, as well as a data duplication method that enables the DNN to clearly discriminate frequently misclassified classes.
Audio Based Event Detection for Multimedia Surveillance
The results show that the proposed top-down event detection approach works significantly better than the single level approach.