Corpus ID: 236493514

Pitch-Informed Instrument Assignment Using a Deep Convolutional Network with Multiple Kernel Shapes

  • Carlos Lordelo, Emmanouil Benetos, Simon Dixon, Sven Ahlback
This paper proposes a deep convolutional neural network for performing note-level instrument assignment. Given a polyphonic multi-instrumental music signal along with its ground truth or predicted notes, the objective is to assign an instrumental source for each note. This problem is addressed as a pitch-informed classification task where each note is analysed individually. We also propose to utilise several kernel shapes in the convolutional layers in order to facilitate learning of efficient…
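
To make the idea of several kernel shapes concrete, here is a minimal pure-Python sketch of one input patch being filtered by a square, a wide, and a tall kernel, with the results stacked as feature maps. The kernel sizes, the uniform weights, and the helper names (`conv2d_same`, `multi_shape_block`, `averaging_kernel`) are illustrative assumptions and are not taken from the paper; a real network would learn the weights.

```python
# Sketch: one time-frequency patch filtered by kernels of several shapes.
# Shapes and weights are illustrative assumptions, not values from the paper.

def conv2d_same(x, kernel):
    """2-D cross-correlation with zero padding; output size equals input size.

    Assumes odd kernel dimensions so the kernel has a well-defined centre.
    """
    rows, cols = len(x), len(x[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    pad_r, pad_c = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    rr, cc = r + i - pad_r, c + j - pad_c
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += x[rr][cc] * kernel[i][j]
            out[r][c] = acc
    return out


def multi_shape_block(x, kernels):
    """Apply every kernel to the same input; return one feature map per kernel."""
    return [conv2d_same(x, k) for k in kernels]


def averaging_kernel(k_rows, k_cols):
    """Hypothetical stand-in for learned weights: a uniform averaging kernel."""
    return [[1.0 / (k_rows * k_cols)] * k_cols for _ in range(k_rows)]


# Toy patch: rows are frequency bins, columns are time frames.
# Row 2 is active in every frame, like a single sustained partial.
patch = [[float(r == 2) for _ in range(6)] for r in range(5)]

kernels = [
    averaging_kernel(3, 3),  # square: local time-frequency context
    averaging_kernel(1, 5),  # wide: spans time within one frequency bin
    averaging_kernel(5, 1),  # tall: spans frequency within one time frame
]
feature_maps = multi_shape_block(patch, kernels)
```

In this toy case the wide kernel responds most strongly to the sustained partial, while the tall kernel spreads the same energy across neighbouring frequency bins, which is the kind of complementary behaviour that motivates mixing shapes.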

Frame-level Instrument Recognition by Timbre and Pitch
This paper builds and evaluates a convolutional neural network for frame-level instrument prediction, treating it as a multi-label classification problem for each frame and using frame-level annotations as the supervisory signal when training the network.
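
As a rough illustration of what a per-frame multi-label formulation can look like, the sketch below scores each instrument independently with a sigmoid and measures the error with binary cross-entropy. The loss choice, the 0.5 threshold, the toy logits, and the function names are assumptions made for illustration and are not taken from the summarised paper.

```python
import math


def sigmoid(z):
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def frame_predictions(logits, threshold=0.5):
    """Turn per-frame logits into independent 0/1 decisions per instrument."""
    return [[1 if sigmoid(z) >= threshold else 0 for z in frame] for frame in logits]


def binary_cross_entropy(logits, targets):
    """Mean binary cross-entropy over all frames and instruments."""
    total, count = 0.0, 0
    for frame_logits, frame_targets in zip(logits, targets):
        for z, t in zip(frame_logits, frame_targets):
            p = sigmoid(z)
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count


# Two frames, three hypothetical instruments; several can be active per frame.
logits = [[2.0, -1.0, 0.5], [-3.0, 1.5, -0.2]]
targets = [[1, 0, 1], [0, 1, 0]]

predicted = frame_predictions(logits)
loss = binary_cross_entropy(logits, targets)
```

Because each instrument gets its own sigmoid rather than sharing a softmax, a frame can be labelled with zero, one, or several instruments at once.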
Music instrument recognition using deep convolutional neural networks
A deep convolutional neural network framework for predominant instrument recognition in real-world polyphonic music is presented, with a reported accuracy of 92.8%.
Deep Convolutional Neural Networks for Predominant Instrument Recognition in Polyphonic Music
The analysis of instrument-wise performance found that the onset type is a critical factor for the recall and precision of each instrument, and that convolutional neural networks are more robust than conventional methods that exploit spectral features and source separation with support vector machines.
Multitask Learning for Frame-level Instrument Recognition
A large-scale dataset that contains synthetic polyphonic music with frame-level pitch and instrument labels is presented, a simple yet novel network architecture is proposed to jointly predict the pitch and instrument for each frame, and the effectiveness of the proposed method is validated.
Deep Salience Representations for F0 Estimation in Polyphonic Music
A fully convolutional neural network for learning salience representations for estimating fundamental frequencies, trained using a large, semi-automatically generated f0 dataset, is described and shown to achieve state-of-the-art performance on several multi-f0 and melody datasets.
Timbre analysis of music audio signals with convolutional neural networks
One of the main goals of this work is to design efficient CNN architectures, which reduces the risk of over-fitting because the number of CNN parameters is minimized.
An Attention Mechanism for Musical Instrument Recognition
The proposed attention model is compared to multiple models, including a baseline binary relevance random forest, a recurrent neural network, and fully connected neural networks, to show that incorporating attention leads to an overall improvement in classification accuracy metrics across all 20 instruments in the OpenMIC dataset.
Experimenting with musically motivated convolutional neural networks
  • Jordi Pons, T. Lidy, X. Serra
  • Computer Science
  • 2016 14th International Workshop on Content-Based Multimedia Indexing (CBMI)
  • 2016
This article explores various architectural choices relevant to music signal classification tasks, in order to start understanding what the chosen networks are learning, and proposes several musically motivated architectures.
Investigating Kernel Shapes and Skip Connections for Deep Learning-Based Harmonic-Percussive Separation
An efficient deep learning encoder-decoder network for performing Harmonic-Percussive Source Separation is proposed and it is shown that the number of model trainable parameters is greatly reduced by using a dense arrangement of skip connections between the model layers.
Multi-Instrument Automatic Music Transcription With Self-Attention-Based Instance Segmentation
This article proposes a multi-instrument AMT method, with signal processing techniques specifying pitch saliency, novel deep learning techniques, and concepts partly inspired by multi-object recognition, instance segmentation, and image-to-image translation in computer vision.