Automatic Tagging Using Deep Convolutional Neural Networks
The experiments show that the mel-spectrogram is an effective time-frequency representation for automatic tagging and that more complex models benefit from more training data.
Transfer Learning for Music Classification and Regression Tasks
This paper proposes to use a pre-trained convnet feature, a concatenated feature vector using the activations of feature maps of multiple layers in a trained convolutional network, and shows how it can serve as general-purpose music representation.
Convolutional recurrent neural networks for music classification
It is found that CRNNs show strong performance with respect to the number of parameters and training time, indicating the effectiveness of their hybrid structure in music feature extraction and feature summarisation.
Text-based LSTM networks for Automatic Music Composition
The proposed network is designed to learn relationships within text documents that represent chord progressions and drum tracks in two cases, and word-RNNs and character-based RNNs show good results for both cases.
A Tutorial on Deep Learning for Music Information Retrieval
The basic principles and prominent works in deep learning for MIR are laid out and the network structures that have been successful in MIR problems are outlined to facilitate the selection of building blocks for the problems at hand.
Kapre: On-GPU Audio Preprocessing Layers for a Quick Implementation of Deep Neural Network Models with Keras
Kapre implements time-frequency conversions, normalisation, and data augmentation as Keras layers for audio and music signal preprocessing and reports simple benchmark results, showing that real-time on-GPU preprocessing adds a reasonable amount of computation.
Decoupling Magnitude and Phase Estimation with Deep ResUNet for Music Source Separation
This work proposes to estimate phases by estimating complex ideal ratio masks (cIRMs), decoupling the estimation of cIRMs into magnitude and phase estimations, and extends the separation method to effectively allow the magnitude of the mask to be larger than 1.
Deep Unsupervised Drum Transcription
We introduce DrummerNet, a drum transcription system that is trained in an unsupervised manner. DrummerNet does not require any ground-truth transcription and, with the data-scalability of deep
Deep Learning for Audio-Based Music Classification and Tagging: Teaching Computers to Distinguish Rock from Bach
Over the last decade, music-streaming services have grown dramatically and giant technology companies such as Apple, Google, and Amazon have also been strengthening their music service platforms, providing listeners with a new and easily accessible way to listen to music.
A Comparison of Audio Signal Preprocessing Methods for Deep Neural Networks on Music Tagging
This paper empirically investigates the effect of audio preprocessing on music tagging with deep neural networks and shows that many commonly used input preprocessing techniques are redundant except magnitude compression.
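As a rough illustration of what "magnitude compression" can mean in the last summary, the sketch below applies two commonly used forms to a toy spectrogram: log(1 + gain * x) and conversion to decibels. The summary does not say which form the paper uses, so the function names, the `gain` and `floor` parameters, and the toy values are all assumptions made for illustration.

```python
import math


def log_compress(magnitudes, gain=10.0):
    """Compress non-negative magnitudes with log(1 + gain * x).

    `gain` is a hypothetical tuning parameter, not a value from the paper.
    """
    if gain <= 0:
        raise ValueError("gain must be positive")
    return [[math.log1p(gain * max(v, 0.0)) for v in frame] for frame in magnitudes]


def to_decibels(magnitudes, floor=1e-10):
    """Convert magnitudes to decibels (20 * log10), clamping at `floor` to avoid log(0)."""
    return [[20.0 * math.log10(max(v, floor)) for v in frame] for frame in magnitudes]


# Toy spectrogram: 2 frames x 3 frequency bins (made-up values).
spec = [[0.0, 0.001, 1.0], [0.5, 10.0, 100.0]]

compressed = log_compress(spec)
decibels = to_decibels(spec)
```

Both forms shrink the dynamic range, so quiet and loud bins end up on a more comparable scale before being fed to a network.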