Source/Filter Model for Unsupervised Main Melody Extraction From Polyphonic Audio Signals
@article{Durrieu2010SourceFilterMF,
  title={Source/Filter Model for Unsupervised Main Melody Extraction From Polyphonic Audio Signals},
  author={Jean-Louis Durrieu and Ga{\"e}l Richard and Bertrand David and C{\'e}dric F{\'e}votte},
  journal={IEEE Transactions on Audio, Speech, and Language Processing},
  year={2010},
  volume={18},
  pages={564--575}
}
Extracting the main melody from a polyphonic music recording seems natural even to untrained human listeners. To a certain extent it is related to the concept of source separation, with the human ability of focusing on a specific source in order to extract relevant information. In this paper, we propose a new approach for the estimation and extraction of the main melody (and in particular the leading vocal part) from polyphonic audio signals. To that aim, we propose a new signal model where the…
185 Citations
MAIN MELODY EXTRACTION WITH SOURCE-FILTER NMF AND CRNN
- Computer Science
- 2018
A convolutional recurrent neural network architecture that relies on a particular form of pretraining by source-filter nonnegative matrix factorisation to estimate the dominant melody of a polyphonic audio recording achieves state-of-the-art performance on the MedleyDB dataset without any augmentation methods or large training sets.
Main Melody Estimation with Source-Filter NMF and CRNN
- Computer Science, ISMIR
- 2018
This work proposes to enhance the NMF-based salience representations with CNN layers, then to model the temporal structure by an RNN network and to estimate the dominant melody with a final classification layer, and shows that such a system achieves state-of-the-art performance on the MedleyDB dataset without any augmentation methods or large training sets.
Automatic transcription of the melody from polyphonic music
- Computer Science
- 2017
An efficient computational method for auditory stream segregation that processes a variable number of simultaneous voices and allows very efficient computation of the melody.
Improving melody extraction using Probabilistic Latent Component Analysis
- Computer Science, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2011
Quantitative evaluation shows that the new PLCA-based melody extraction algorithm performs significantly better than two existing melody extraction algorithms for polyphonic single-channel mixtures.
On-Line Melody Extraction From Polyphonic Audio Using Harmonic Cluster Tracking
- Computer Science, IEEE Transactions on Audio, Speech, and Language Processing
- 2013
A novel framework which estimates predominant vocal melody in real-time by tracking various sources with the help of harmonic clusters (combs) and then determining the predominant vocal source by using the harmonic strength of the source.
From Heuristics-Based to Data-Driven Audio Melody Extraction
- Computer Science
- 2017
The combination of supervised and unsupervised approaches leads to advancements on melody extraction and shows a promising path for future research and applications.
Robust Singer Identification in Polyphonic Music using Melody Enhancement and Uncertainty-based Learning
- Computer Science, ISMIR
- 2012
New methods to estimate the uncertainty from the signal in a fully automatic manner and to learn the classifier directly from polyphonic data are introduced.
Towards Computational Auditory Scene Analysis: Melody Extraction from Polyphonic Music
- Computer Science
- 2012
The method is a further development of an algorithm which was successfully evaluated as part of a melody extraction system and shows a superior performance for audio examples which have been assembled to show the importance of auditory streaming in human perception.
Melody Extraction from Polyphonic Music Signals Using Tandem Filter System
- Engineering, 2018 International Computers, Signals and Systems Conference (ICOMSSC)
- 2018
Robust principal component analysis is used to roughly extract the human voice from the polyphonic music signal, using a tandem filter and a transverse stripe filter system to eliminate implausible fundamental frequency positions.
Vocal Melody Extraction via DNN-based Pitch Estimation and Salience-based Pitch Refinement
- Computer Science, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2019
Experimental results on three public datasets indicate that the proposed method, which uses melody MIDI files as the source of labels to train a deep neural network (DNN) model for melody extraction, outperforms four state-of-the-art melody extraction methods in most cases.
References
SHOWING 1-10 OF 29 REFERENCES
Singer melody extraction in polyphonic signals using source separation methods
- Computer Science, 2008 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2008
A new approach for singer melody extraction, based on blind source separation techniques, that proposes a simplification of a general GMM and approximates the STFT of the music signal using Non-negative Matrix Factorization (NMF) techniques.
An iterative approach to monaural musical mixture de-soloing
- Computer Science, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2009
This article proposes to model the power spectral densities of both contributions with a source/filter model for the main instrument while retaining a model emphasizing temporal repetitions of the musical background, and shows that improved source separation performances can be obtained by a two-step estimation strategy.
EXTRACTION OF THE MELODY PITCH CONTOUR FROM POLYPHONIC AUDIO
- Computer Science
- 2005
This document describes the submission to the MIREX audio melody extraction contest, which addresses the task of identifying the melody pitch contour from polyphonic musical audio, and shows that the algorithm performs best with respect to runtime and overall accuracy.
Combining pitch-based inference and non-negative spectrogram factorization in separating vocals from polyphonic music
- Physics, SAPA@INTERSPEECH
- 2008
A novel algorithm is proposed, based on pitch estimation and nonnegative matrix factorization (NMF), that predicts the amount of noise in the vocal segments, which allows separating vocals and noise even when they overlap in time and frequency.
Adaptation of Bayesian Models for Single-Channel Source Separation and its Application to Voice/Music Separation in Popular Songs
- Computer Science, IEEE Transactions on Audio, Speech, and Language Processing
- 2007
A general formalism for source model adaptation, expressed in the framework of Bayesian models, is introduced, and results show that an adaptation scheme can consistently and significantly improve separation performance in comparison with nonadapted models.
A robust predominant-F0 estimation method for real-time detection of melody and bass lines in CD recordings
- Computer Science, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)
- 2000
A predominant-F0 estimation method called PreFEst is proposed that does not rely on the F0's unreliable frequency component and obtains the most predominant F0 supported by harmonics within an intentionally limited frequency range.
Accompaniment separation and karaoke application based on automatic melody transcription
- Art, 2008 IEEE International Conference on Multimedia and Expo
- 2008
A method for separating accompaniment from polyphonic music and its karaoke application, both based on automatic melody transcription, which will help non-professional singers to produce more appealing karaoke performances.
Tracking melody in polyphonic audio. MIREX 2008
- Computer Science
- 2008
In this work a melody extraction technique is introduced to the MIREX 2008 campaign. The task’s objective consists in estimating the pitch of the main melody in polyphonic audio. The proposed method…
A Classification Approach to Melody Transcription
- Computer Science, ISMIR
- 2005
This work presents a classification-based system for performing automatic melody transcription that makes no assumptions beyond what is learned from its training data, and shows that a Support Vector Machine melodic classifier produces results comparable to state of the art model-based transcription systems.
Transcription of the Singing Melody in Polyphonic Music
- Computer Science, ISMIR
- 2006
The method is based on multiple-F0 estimation followed by acoustic and musicological modeling, which produces a sequence of notes and rests as a transcription of the singing melody.