Corpus ID: 197935467

Data Augmentation for Instrument Classification Robust to Audio Effects

António Ramires and Xavier Serra
Paper presented at the 22nd International Conference on Digital Audio Effects (DAFx-19), held 2–6 September 2019 in Birmingham, United Kingdom.
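The paper's core idea, training an instrument classifier on audio that has been processed with common audio effects so that classification remains robust when such effects are present, can be sketched in a few lines. The following is a hypothetical illustration with simplified stand-ins for real effects (gain change, additive noise, tanh soft-clipping as distortion); it is not the authors' actual augmentation pipeline, and the function name and parameters are illustrative assumptions.

```python
import numpy as np

def augment_with_effects(signal, rng=None):
    """Return simple effect-processed variants of a mono signal.

    Illustrative stand-ins for effect-based data augmentation
    (gain change, additive noise, soft-clipping distortion); a real
    pipeline would use full audio-effect implementations.
    """
    rng = np.random.default_rng(rng)
    variants = {}
    # Random gain between -6 dB and +6 dB.
    gain_db = rng.uniform(-6.0, 6.0)
    variants["gain"] = signal * 10.0 ** (gain_db / 20.0)
    # Additive Gaussian noise at roughly 30 dB SNR.
    noise = rng.standard_normal(len(signal))
    power = np.mean(signal ** 2) + 1e-12
    noise *= np.sqrt(power / 10.0 ** (30.0 / 10.0))
    variants["noise"] = signal + noise
    # Soft-clipping distortion via tanh waveshaping.
    variants["distortion"] = np.tanh(3.0 * signal)
    return variants

# Example: augment a one-second 440 Hz sine tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)
augmented = augment_with_effects(tone, rng=0)
```

Each variant keeps the original instrument label, so the training set grows by one copy per effect; the classifier then sees the same timbre under several distortions of the signal.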


Use of Speaker Recognition Approaches for Learning and Evaluating Embedding Representations of Musical Instrument Sounds
A musical instrument recognition model is constructed using a SincNet front-end, a ResNet architecture, and an angular softmax objective function; the results show that including instrument family labels as a multi-task learning target helps regularize the embedding space and incorporate useful structure.
COALA: Co-Aligned Autoencoders for Learning Semantically Enriched Audio Representations
The results are promising, sometimes on par with the state of the art on the considered tasks, and the embeddings produced with the method correlate well with some acoustic descriptors.
Use of speaker recognition approaches for learning timbre representations of musical instrument sounds from raw waveforms
A group of trainable filters is introduced to generate suitable acoustic features from raw input waveforms, making it easier to optimize a model in an input-agnostic, end-to-end manner.
Integrating Machine Learning with Human Knowledge


A real-time system for measuring sound goodness in instrumental sounds
Paper presented at the 138th Audio Engineering Society Convention, held in Warsaw (Poland), 7–10 May 2015, and organized by the Audio Engineering Society.
A Software Framework for Musical Data Augmentation
This work develops a general software framework for augmenting annotated musical datasets, which will allow practitioners to easily expand training sets with musically motivated perturbations of both audio and annotations.
A Comparison of Sound Segregation Techniques for Predominant Instrument Recognition in Musical Audio Signals
The authors address the identification of predominant musical instruments in polytimbral audio by first dividing the original signal into several streams, and show that performance improves only when the recognition models are trained on features extracted from the separated audio streams.
Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders
A powerful new WaveNet-style autoencoder model is detailed that conditions an autoregressive decoder on temporal codes learned from the raw audio waveform; NSynth, a large-scale, high-quality dataset of musical notes an order of magnitude larger than comparable public datasets, is also introduced.
Timbre analysis of music audio signals with convolutional neural networks
One of the main goals of this work is to design efficient CNN architectures, which reduces the risk of over-fitting by minimizing the number of parameters in the models.
A study on data augmentation of reverberant speech for robust speech recognition
It is found that the performance gap between using simulated and real RIRs can be eliminated when point-source noises are added, and the trained acoustic models not only perform well in the distant-talking scenario but also provide better results in the close-talking scenario.
Musical instrument sound classification with deep convolutional neural network using feature fusion approach
A new musical instrument classification method using convolutional neural networks (CNNs) is presented, which improves over a system that uses only a spectrogram and outperforms the baseline result from traditional handcrafted features and classifiers.
Automatic Instrument Recognition in Polyphonic Music Using Convolutional Neural Networks
It is shown that a convolutional neural network trained on raw audio can achieve performance surpassing traditional methods that rely on hand-crafted features.
Deep Convolutional Neural Networks for Predominant Instrument Recognition in Polyphonic Music
Analysis of instrument-wise performance found that the onset type is a critical factor for the recall and precision of each instrument, and that convolutional neural networks are more robust than conventional methods that exploit spectral features and source separation with support vector machines.
DAFX: Digital Audio Effects
DAFX - Digital Audio Effects features contributions from Daniel Arfib, Xavier Amatrain, Jordi Bonada, Giovanni de Poli, Pierre Dutilleux, Gianpaolo Evangelista, Florian Keiler, Alex Loscos, Davide Rocchesso, Mark Sandler, Xavier Serra, and Todor Todoroff.