• Corpus ID: 196183218

Musical source separation by coherent frequency modulation cues

  • Elliot Creager
This thesis explores the extraction of vibrato sounds from monaural excerpts of polyphonic music using the coherent frequency modulation (CFM) of component partials as a grouping cue. Nonnegative Matrix Factorization (NMF) is currently a popular tool for musical source separation, since it can provide a low-rank approximate factorization of the magnitude spectrogram of the analyzed sound, where the factors can be interpreted as the spectral templates and temporal activations of the notes… 
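The low-rank factorization mentioned in the abstract can be illustrated with a minimal sketch. It assumes the classic multiplicative updates for the Euclidean cost, which is not necessarily the cost function or algorithm used in the thesis. The function name `nmf`, its parameters, and the toy "spectrogram" are hypothetical and chosen only for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=500, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (frequency x time) as W @ H.

    W holds one spectral template per column, H one temporal activation
    per row. Uses multiplicative updates for the Euclidean cost
    (an assumption; other costs are common in audio work).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    # Strictly positive random initialization keeps the updates well defined.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_time)) + eps
    for _ in range(n_iter):
        # Each update rescales the factor elementwise, so entries stay nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy magnitude "spectrogram" built from two fixed templates (hypothetical data).
templates = np.array([[1.0, 0.0],
                      [0.5, 0.0],
                      [0.0, 1.0],
                      [0.0, 0.7]])
activations = np.array([[1.0, 1.0, 0.0, 0.0, 1.0],
                        [0.0, 1.0, 1.0, 0.0, 0.0]])
V = templates @ activations

W, H = nmf(V, rank=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Each column of `W` is fixed over time, which is why a note whose partials move in frequency, as in vibrato, is awkward to represent with a single template in this basic formulation.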
Nonnegative Tensor Factorization with Frequency Modulation Cues for Blind Audio Source Separation
This approach extends Nonnegative Matrix Factorization for audio modeling by including local estimates of frequency modulation as cues in the separation of vibrato or glissando musical sources, which is not possible with the basic matrix factorization formulation.
Single-Channel Blind Source Separation for Singing Voice Detection: A Comparative Study
This paper investigates three promising BASS approaches which operate through a morphological filtering of the analyzed mixture spectrogram and proposes an extension of the KAM method, for which a novel training algorithm is used to compute a source-specific kernel from a given isolated source signal.
It is shown that accurate estimation can be obtained when the sets of atoms for which the inner product is large are disjoint, and a technique for designing once-differentiable approximations to windows is presented.


Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria
  • T. Virtanen
  • Computer Science
    IEEE Transactions on Audio, Speech, and Language Processing
  • 2007
An unsupervised learning algorithm for the separation of sound sources in one-channel music signals is presented; it achieves better separation quality than previous algorithms.
Monaural Musical Sound Separation Based on Pitch and Common Amplitude Modulation
A sinusoidal modeling-based separation system that can effectively resolve overlapping harmonics based on the observations that harmonics of the same source have correlated amplitude envelopes and that the change in phase of a harmonic is related to the instrument's pitch is introduced.
Non-negative tensor factorisation of modulation spectrograms for monaural sound source separation
The method overcomes the limited ability of conventional non-negative matrix factorisation algorithms to utilise the redundancy of sounds in frequency, and synthesises the separated sounds by filtering the mixture signal with a Wiener-like filter generated from the estimated tensor factors.
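As a rough illustration of Wiener-like filtering, the sketch below builds soft masks from per-source nonnegative spectrogram estimates and applies them to a complex mixture spectrogram. It assumes the per-source estimates are already available (in the summarised method they would come from the estimated tensor factors); the function name `wiener_like_masks` and the random data are hypothetical.

```python
import numpy as np

def wiener_like_masks(source_models, eps=1e-12):
    """Turn nonnegative source estimates into soft time-frequency masks.

    source_models has shape (n_sources, n_freq, n_time). Each mask is that
    source's share of the summed model in every bin, so the masks sum to
    (approximately) one per bin.
    """
    total = source_models.sum(axis=0, keepdims=True) + eps
    return source_models / total

rng = np.random.default_rng(0)
# Hypothetical per-source magnitude estimates for two sources.
models = rng.random((2, 4, 5))
# Hypothetical complex mixture spectrogram with arbitrary phases.
phases = rng.uniform(0.0, 2.0 * np.pi, size=(4, 5))
mixture = models.sum(axis=0) * np.exp(1j * phases)

masks = wiener_like_masks(models)
# Each estimate keeps the mixture phase and a fraction of its magnitude.
estimates = masks * mixture
```

Because the masks sum to one in every bin, the source estimates add back up to the mixture, which is a common reason for preferring this kind of filtering over resynthesis from the model alone.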
Phase constrained complex NMF: Separating overlapping partials in mixtures of harmonic musical sources
  • James Bronson, P. Depalle
  • Physics
    2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2014
A physically motivated phase constraint based on the assumption that the source's pitch is sufficient in specifying the phase evolution of the harmonics over time is introduced, uniting sinusoidal modelling of acoustic sources with the CMF analysis of their spectral representations.
Segregation of concurrent sounds. II: Effects of spectral envelope tracing, frequency modulation coherence, and frequency modulation width.
In this experiment, two types of vowel synthesis were used in which the component amplitudes of each vowel either remained constant with frequency modulation or traced the spectral envelope, and it would seem that spectral tracing has no effect on grouping components of sound sources.
Singing Voice Separation Using Spectro-Temporal Modulation Features
An auditory-perception-inspired singing voice separation algorithm for monaural music recordings is proposed, and a set of novel features gives a possible explanation of how the auditory cortex analyzes and identifies singing voice in music audio mixtures.
Probabilistic Inference of Speech Signals from Phaseless Spectrograms
A generative model of time-domain speech signals and their spectrograms is described, and it is shown how an efficient optimizer can be used to find the maximum a posteriori speech signal, given the spectrogram.
Multichannel High-Resolution NMF for Modeling Convolutive Mixtures of Non-Stationary Signals in the Time-Frequency Domain
The HR-NMF model is extended to multichannel signals and to convolutive mixtures, and a fast variational expectation-maximization (EM) algorithm is proposed to estimate the enhanced model.
NMF With Time–Frequency Activations to Model Nonstationary Audio Events
An extension of non-negative matrix factorization where the temporal activations become frequency dependent and follow a time-varying autoregressive moving average (ARMA) modeling leads to an efficient single-atom decomposition for a single audio event with strong spectral variation (but with constant pitch).
Source Separation of Polyphonic Music with Interactive User-Feedback on a Piano Roll Display
Initial results show that the method of interactive feedback can significantly increase the separation quality and produce high-quality separation results.