• Corpus ID: 53875109

Improving Peak-picking Using Multiple Time-step Loss Functions

Carl Southall, Ryan Stables, Jason Hockman
The majority of state-of-the-art methods for music information retrieval (MIR) tasks now utilise deep learning methods reliant on minimisation of loss functions such as cross entropy. For tasks that include framewise binary classification (e.g., onset detection, music transcription), classes are derived from output activation functions by identifying points of local maxima, or peaks. However, the operating principles behind peak picking are different to those of the cross entropy loss function, which…
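The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the method proposed in the paper: the function names, the 0.5 threshold, the local-maximum rule, and the toy activation values are all assumptions chosen for illustration. The sketch computes a framewise binary cross entropy, which scores every frame independently, and a simple peak picker, which only reports frames that are local maxima above a threshold.

```python
import math


def binary_cross_entropy(activations, targets, eps=1e-7):
    """Mean framewise binary cross entropy; every frame is scored independently."""
    total = 0.0
    for a, t in zip(activations, targets):
        a = min(max(a, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= t * math.log(a) + (1 - t) * math.log(1.0 - a)
    return total / len(activations)


def pick_peaks(activations, threshold=0.5):
    """Return indices of frames that exceed `threshold` and are local maxima.

    Hypothetical rule for illustration: a frame is a peak if it is strictly
    greater than its left neighbour and not smaller than its right neighbour.
    The first and last frames are never reported.
    """
    peaks = []
    for i in range(1, len(activations) - 1):
        a = activations[i]
        if a > threshold and a > activations[i - 1] and a >= activations[i + 1]:
            peaks.append(i)
    return peaks


# Toy activation function (one value per frame) and its binary targets.
activations = [0.1, 0.2, 0.9, 0.6, 0.1, 0.3, 0.7, 0.7, 0.2]
targets = [0, 0, 1, 0, 0, 0, 1, 0, 0]

peaks = pick_peaks(activations)
loss = binary_cross_entropy(activations, targets)
print("picked peaks:", peaks)
print("cross entropy:", round(loss, 4))
```

In this toy example the frame at index 3 (activation 0.6, target 0) adds noticeably to the cross entropy, yet it is not a local maximum and so has no effect on the picked peaks. That is one way the two criteria can disagree.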


Improving Perceptual Quality of Drum Transcription with the Expanded Groove MIDI Dataset

This work optimizes classifiers for downstream generation by predicting expressive dynamics (velocity) and shows with listening tests that they produce outputs with improved perceptual quality, despite achieving similar results on classification metrics.

A Streamlined Encoder/decoder Architecture for Melody Extraction

A novel streamlined encoder/decoder network designed for melody extraction in polyphonic musical audio can achieve results close to the state of the art with far fewer convolutional layers and simpler convolution modules.

Towards Fully Integrated Real-time Detection Framework for Online Contents Analysis - RED-Alert Approach

The proposed solution is designed to ensure security and policing of online contents by detecting terrorist material and using social network analysis, speech recognition, face and object detection besides audio event detection to extract information from online sources that are fed in a complex event processor.

9 Complex Project to Develop Real Tools for Identifying and Countering Terrorism: Real-time Early Detection and Alert System for Online Terrorist Content Based on Natural Language Processing, Social Network Analysis, Artificial Intelligence and Complex Event Processing

This research highlights the need to understand more fully the role of emotion in the decision-making process and the role that language plays in the development of emotions.

Privacy-Preserving Social Media Forensic Analysis for Preventive Policing of Online Activities

  • Syed Naqvi, Sean Enderby, M. Florea
  • Computer Science
    2019 10th IFIP International Conference on New Technologies, Mobility and Security (NTMS)
  • 2019
Results of the European H2020 project RED-Alert are presented. The project aims to enable secure and privacy-preserving data processing, so that malicious content and the corresponding personality can be tracked while the privacy of innocent citizens is preserved.



Automatic Drum Transcription for Polyphonic Recordings Using Soft Attention Mechanisms and Convolutional Neural Networks

Two approaches to improve accuracy for polyphonic recordings of automatic drum transcription are presented, including the use of soft attention mechanisms (SA) and an alternative RNN configuration containing additional peripheral connections (PC) and a convolutional neural network (CNN), which uses a larger set of time-step features.

madmom: A New Python Audio and Music Signal Processing Library

Madmom is an open-source audio processing and music information retrieval (MIR) library written in Python. It features a concise, NumPy-compatible, object-oriented design with simple calling conventions and sensible default values for all parameters, which facilitates fast prototyping of MIR applications.

Drum transcription from polyphonic music with recurrent neural networks

An approach to transcribe drums from polyphonic audio signals based on a recurrent neural network is presented and it is revealed that F-measure values higher than state of the art can be achieved using the proposed method.

Recurrent Neural Networks for Drum Transcription

It is claimed that recurrent neural networks can be trained to identify the onsets of percussive instruments based on general properties of their sound and are capable of generalizing reasonably well.

Universal Onset Detection with Bidirectional Long Short-Term Memory Neural Networks

This paper presents a new onset detector with superior performance and temporal precision for all kinds of music, including complex music mixes, based on auditory spectral features and relative spectral differences processed by a bidirectional Long Short-Term Memory recurrent neural network, which acts as a reduction function.

Drum Transcription via Joint Beat and Drum Modeling Using Convolutional Recurrent Neural Networks

It is shown that convolutional and recurrent convolutional neural networks perform better than state-of-the-art methods and that learning beats jointly with drums can be beneficial for the task of drum detection.

Automatic Drum Transcription Using Bi-Directional Recurrent Neural Networks

A bi-directional recurrent neural network for offline detection of percussive onsets from specified drum classes and a recurrent neural network suitable for online operation are presented.

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.

Adam: A Method for Stochastic Optimization

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

A Review of Automatic Drum Transcription

This paper presents a comprehensive review of ADT research, including a thorough discussion of the task-specific challenges, categorization of existing techniques, and evaluation of several state-of-the-art systems.