• Corpus ID: 214797192

ATTENTION-BASED CONVOLUTIONAL NEURAL NETWORK FOR AUDIO EVENT CLASSIFICATION WITH FEATURE TRANSFER LEARNING

@inproceedings{Chen2018ATTENTIONBASEDCN,
  title={ATTENTION-BASED CONVOLUTIONAL NEURAL NETWORK FOR AUDIO EVENT CLASSIFICATION WITH FEATURE TRANSFER LEARNING},
  author={Tianxiang Chen and Udit Gupta},
  year={2018}
}
Audio event classification is an urgent unsolved problem in content-based information retrieval (CBIR), with numerous applications. This paper describes Pindrop’s submission to the “Making Sense of Sound” challenge. In this submission, we address the challenge of classifying audio excerpts by their origin using convolutional neural networks with feature transfer learning. We use a pretrained VGGish network to extract feature embeddings. Our results show a remarkable…
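
The abstract is truncated and does not spell out the attention mechanism, so the following is only a generic sketch of attention pooling over frame-level embeddings, not the authors' architecture. It assumes each clip has already been turned into a sequence of 128-dimensional embeddings (the size VGGish emits per frame); the random vectors stand in for real embeddings, and `attention_pool`, `classify`, the weight values, and `NUM_CLASSES = 5` are hypothetical names and choices made for illustration.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(frames, w_att):
    """Collapse T frame embeddings into one clip-level vector.

    Each frame gets a scalar score (dot product with w_att); the scores
    are normalised with softmax and used as weights in a weighted mean.
    """
    scores = [sum(x * w for x, w in zip(frame, w_att)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    pooled = [
        sum(wt * frame[d] for wt, frame in zip(weights, frames))
        for d in range(dim)
    ]
    return pooled, weights


def classify(frames, w_att, w_cls, b_cls):
    """Attention-pool the frames, then apply a linear layer and softmax."""
    pooled, _ = attention_pool(frames, w_att)
    logits = [
        sum(p * w for p, w in zip(pooled, row)) + b
        for row, b in zip(w_cls, b_cls)
    ]
    return softmax(logits)


# Placeholder data: random vectors stand in for pretrained embeddings,
# and the weights are untrained (assumptions for illustration only).
random.seed(0)
T, D, NUM_CLASSES = 5, 128, 5
frames = [[random.gauss(0, 1) for _ in range(D)] for _ in range(T)]
w_att = [random.gauss(0, 0.1) for _ in range(D)]
w_cls = [[random.gauss(0, 0.1) for _ in range(D)] for _ in range(NUM_CLASSES)]
b_cls = [0.0] * NUM_CLASSES

probs = classify(frames, w_att, w_cls, b_cls)
print(probs)
```

With an all-zero `w_att`, every frame receives the same weight and the pooling reduces to a plain average, which makes a convenient sanity check before training the attention weights.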

PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
TLDR
This paper proposes pretrained audio neural networks (PANNs) trained on the large-scale AudioSet dataset, and investigates the performance and computational complexity of PANNs modeled by a variety of convolutional neural networks.

References

CNN architectures for large-scale audio classification
TLDR
This work uses various CNN architectures to classify the soundtracks of a dataset of 70M training videos with 30,871 video-level labels, and investigates varying the size of both training set and label vocabulary, finding that analogs of the CNNs used in image classification do well on the authors' audio classification task, and larger training and label sets help up to a point.
Audio Set Classification with Attention Model: A Probabilistic Perspective
This paper investigates the Audio Set classification. Audio Set is a large scale weakly labelled dataset (WLD) of audio clips. In WLD only the presence of a label is known, without knowing the
Exploring Data Augmentation for Improved Singing Voice Detection with Neural Networks
TLDR
A range of label-preserving audio transformations are applied and pitch shifting is found to be the most helpful augmentation method for music data augmentation, reaching the state of the art on two public datasets.
Audio Set: An ontology and human-labeled dataset for audio events
TLDR
The creation of Audio Set is described, a large-scale dataset of manually-annotated audio events that endeavors to bridge the gap in data availability between image and audio research and substantially stimulate the development of high-performance audio event recognizers.
ESC: Dataset for Environmental Sound Classification
TLDR
A new annotated collection of 2000 short clips comprising 50 classes of various common sound events, and an abundant unified compilation of 250000 unlabeled auditory excerpts extracted from recordings available through the Freesound project are presented.
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
TLDR
Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
Adam: A Method for Stochastic Optimization
TLDR
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Progressive Neural Networks
TLDR
This work evaluates this progressive networks architecture extensively on a wide variety of reinforcement learning tasks, and demonstrates that transfer occurs at both low-level sensory and high-level control layers of the learned policy.
Freesound technical demo
TLDR
This demo wants to introduce Freesound to the multimedia community and show its potential as a research resource.
Vggish: A vgg-like audio classification model
  • https://github.com/DTaoo/VGGish, 2017.