Corpus ID: 221005886

Few-Shot Drum Transcription in Polyphonic Music

  • Yu Wang, Justin Salamon, Mark Cartwright, Nicholas J. Bryan, Juan Pablo Bello
  • Published in ISMIR 2020
  • Computer Science, Engineering
Data-driven approaches to automatic drum transcription (ADT) are often limited to a predefined, small vocabulary of percussion instrument classes. Such models cannot recognize out-of-vocabulary classes nor are they able to adapt to finer-grained vocabularies. In this work, we address open vocabulary ADT by introducing few-shot learning to the task. We train a Prototypical Network on a synthetic dataset and evaluate the model on multiple real-world ADT datasets with polyphonic accompaniment. We…
4 Citations
Few-Shot Continual Learning for Audio Classification
This work introduces a few-shot continual learning framework for audio classification, where a trained base classifier is continuously expanded to recognize novel classes based on only a few labeled examples at inference time, which enables fast and interactive model updates by end-users with minimal human effort.
Leveraging Hierarchical Structures for Few-Shot Musical Instrument Recognition
This work applies a hierarchical loss function to the training of prototypical networks and a method to aggregate prototypes hierarchically, mirroring the structure of a predefined musical instrument hierarchy, to enable classification of a wider set of musical instruments.
User-Driven Fine-Tuning for Beat Tracking
  • António Pinto, Sebastian Böck, Jaime Cardoso, Matthew Davies
  • Computer Science
  • Electronics
  • 2021
This paper explored the use of targeted fine-tuning of a state-of-the-art deep neural network based on a very limited temporal region of annotated beat locations and demonstrated the success of this approach via improved performance across existing annotated datasets and a new annotation-correction approach for evaluation.
This report presents the results of a submission to Task 5 (Few-shot Bioacoustic Event Detection) of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2021 Challenge. This task focuses…


Increasing drum transcription vocabulary using data synthesis
This paper proposes to support large-vocabulary drum transcription by generating a large synthetic dataset (210,000 eight-second examples) of audio for which the authors have ground-truth transcriptions, and trains convolutional-recurrent neural networks (CRNNs) in a multi-task framework to support large-vocabulary ADT.
Drum transcription from polyphonic music with recurrent neural networks
An approach to transcribing drums from polyphonic audio signals based on a recurrent neural network is presented, and it is shown that the proposed method can achieve F-measure values higher than the state of the art.
Automatic Drum Transcription for Polyphonic Recordings Using Soft Attention Mechanisms and Convolutional Neural Networks
Two approaches to improve accuracy for polyphonic recordings of automatic drum transcription are presented, including the use of soft attention mechanisms (SA) and an alternative RNN configuration containing additional peripheral connections (PC) and a convolutional neural network (CNN), which uses a larger set of time-step features.
Multi-label Few-shot Learning for Sound Event Recognition
A One-vs.-Rest episode selection strategy is proposed to mitigate the complexity of forming an episode, and the strategy is applied to the multi-label few-shot problem.
MDB Drums: An annotated subset of MedleyDB for automatic drum transcription
This dataset is built on top of the MusicDelta subset of the MedleyDB dataset, taking advantage of real-world recordings in multitrack format, providing a balanced pool for developing and evaluating ADT models with respect to various musical styles.
A Closer Look at Few-shot Classification
The results reveal that reducing intra-class variation is an important factor when the feature backbone is shallow, but not as critical when using deeper backbones, and a baseline method with a standard fine-tuning practice compares favorably against other state-of-the-art few-shot learning algorithms.
Learning to Compare: Relation Network for Few-Shot Learning
A conceptually simple, flexible, and general framework for few-shot learning, where a classifier must learn to recognise new classes given only a few examples from each, which is easily extended to zero-shot learning.
Few-Shot Sound Event Detection
This work adapts state-of-the-art metric-based few-shot learning methods to automate the detection of similar-sounding events, requiring only one or few examples of the target event, and develops a method to automatically construct a partial set of labeled examples to reduce user labeling effort.
Automatic Drum Transcription Using the Student-Teacher Learning Paradigm with Unlabeled Music Data
This work addresses the challenge of insufficiently labeled data by exploring the possibility of utilizing unlabeled music data from online resources by training a student neural network using the labels generated from multiple teacher systems.
Prototypical Networks for Few-shot Learning
This work proposes Prototypical Networks for few-shot classification, and provides an analysis showing that some simple design decisions can yield substantial improvements over recent approaches involving complicated architectural choices and meta-learning.
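As a rough illustration of the prototypical-network idea named in the entries above, the sketch below averages a few labeled "support" embeddings per class into a prototype and assigns a query to the nearest prototype by squared Euclidean distance. It is a toy under stated assumptions, not the implementation from any listed paper: the 2-D vectors stand in for the output of a trained embedding network, and the class names `kick` and `snare` and all function names are hypothetical. A real drum transcription system would also need to handle several instruments sounding at once, which this single-label example ignores.

```python
from collections import defaultdict


def class_prototypes(support_embeddings, support_labels):
    """Average the support embeddings of each class into one prototype per class."""
    grouped = defaultdict(list)
    for vec, label in zip(support_embeddings, support_labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(query_embedding, prototypes):
    """Return the label whose prototype lies closest to the query embedding."""
    return min(
        prototypes,
        key=lambda label: squared_euclidean(query_embedding, prototypes[label]),
    )


# Toy embeddings (assumed): two support examples for each hypothetical class.
support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
labels = ["kick", "kick", "snare", "snare"]

prototypes = class_prototypes(support, labels)
prediction = classify([0.9, 1.0], prototypes)
print(prediction)
```

Because the prototypes are computed from the support set at inference time, adding a new class only requires a few labeled examples of it and no retraining of the embedding network, which is what makes the approach a fit for open-vocabulary settings.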