Corpus ID: 12031289

Combining pitch-based inference and non-negative spectrogram factorization in separating vocals from polyphonic music

  • T. Virtanen, A. Mesaros, M. Ryynänen
  • Published in SAPA@INTERSPEECH 2008
  • Computer Science
  • This paper proposes a novel algorithm for separating vocals from polyphonic music accompaniment. Based on pitch estimation, the method first creates a binary mask indicating time-frequency segments in the magnitude spectrogram where harmonic content of the vocal signal is present. Second, non-negative matrix factorization (NMF) is applied on the non-vocal segments of the spectrogram in order to learn a model for the accompaniment. NMF predicts the amount of noise in the vocal segments, which…
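
The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`harmonic_mask`, `masked_nmf`, `separate_vocals`), the choice of weighted KL-divergence multiplicative updates, the mask width, and the final subtraction step are all hypothetical. The abstract is truncated before it says how the NMF prediction is used, so the last step here is only one plausible reading.

```python
import numpy as np


def harmonic_mask(f0_hz, n_bins, sample_rate, n_fft, half_width_bins=1):
    """Binary mask (n_bins x n_frames): True near harmonics of the vocal pitch.

    f0_hz holds one pitch estimate per frame; values <= 0 mean "unvoiced".
    """
    mask = np.zeros((n_bins, len(f0_hz)), dtype=bool)
    bin_hz = sample_rate / n_fft
    for t, f0 in enumerate(f0_hz):
        if f0 <= 0:
            continue  # no vocal harmonics in an unvoiced frame
        k = 1
        while k * f0 < sample_rate / 2:
            center = int(round(k * f0 / bin_hz))
            lo = max(center - half_width_bins, 0)
            hi = min(center + half_width_bins + 1, n_bins)
            mask[lo:hi, t] = True
            k += 1
    return mask


def masked_nmf(X, weights, rank, n_iter=200, eps=1e-9, seed=0):
    """Weighted KL-divergence NMF: entries with weight 0 are ignored in the fit."""
    rng = np.random.default_rng(seed)
    n_bins, n_frames = X.shape
    W = rng.random((n_bins, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        V = W @ H + eps
        H *= (W.T @ (weights * X / V)) / (W.T @ weights + eps)
        V = W @ H + eps
        W *= ((weights * X / V) @ H.T) / (weights @ H.T + eps)
    return W, H


def separate_vocals(X, vocal_mask, rank=8, n_iter=200):
    """Learn the accompaniment on non-vocal bins, then predict it under the mask."""
    weights = (~vocal_mask).astype(float)
    W, H = masked_nmf(X, weights, rank, n_iter)
    accomp = W @ H  # accompaniment estimate everywhere, including vocal bins
    # Assumption: the vocal magnitude is the non-negative residual in vocal bins.
    vocal = np.where(vocal_mask, np.maximum(X - accomp, 0.0), 0.0)
    return vocal, accomp
```

Here `X` would be a magnitude spectrogram (frequency bins by frames) and `f0_hz` a per-frame vocal pitch track from any pitch estimator. Because the factorization only fits the unmasked entries, the product `W @ H` acts as an interpolation of the accompaniment into the masked vocal regions.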
    79 Citations
    Towards Solving the Bottleneck of Pitch-based Singing Voice Separation
    • 2
    Multi-Stage Non-Negative Matrix Factorization for Monaural Singing Voice Separation
    • 46
    Vocal Separation using Singer-Vowel Priors Obtained from Polyphonic Audio
    • 4
    Singing voice separation with pre-learned dictionary and reconstructed voice spectrogram
    • 1
    Singing Voice Separation and Vocal F0 Estimation Based on Mutual Combination of Robust Principal Component Analysis and Subharmonic Summation
    • 31
    Adaptive filtering for music/voice separation exploiting the repeating musical structure
    • 74
    Joint Singing Voice Separation and F0 Estimation with Deep U-Net Architectures
    • 4
    Latent time-frequency component analysis: A novel pitch-based approach for singing voice separation
    • Xiu Zhang, W. Li, Bilei Zhu
    • Computer Science
    • 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
    • 2015
    • 4


    Monaural Sound Source Separation by Nonnegative Matrix Factorization With Temporal Continuity and Sparseness Criteria
    • T. Virtanen
    • Mathematics, Computer Science
    • IEEE Transactions on Audio, Speech, and Language Processing
    • 2007
    • 978
    Automatic Synchronization between Lyrics and Music CD Recordings Based on Viterbi Alignment of Segregated Vocal Signals
    • 69
    Automatic Transcription of Melody, Bass Line, and Chords in Polyphonic Music
    • 203
    Separation of sound sources by convolutive sparse coding
    • 98
    Adaptation of Bayesian Models for Single-Channel Source Separation and its Application to Voice/Music Separation in Popular Songs
    • 172
    A multipitch tracking algorithm for noisy speech
    • 294
    Algorithms for Non-negative Matrix Factorization
    • 7,124