Singing-voice separation from monaural recordings using robust principal component analysis


Separating singing voices from music accompaniment is an important task in many applications, such as music information retrieval, lyric recognition, and lyric alignment. Music accompaniment can be assumed to lie in a low-rank subspace because of its repetitive structure; singing voices, on the other hand, can be regarded as relatively sparse within songs. Based on this assumption, we propose using robust principal component analysis (RPCA) to separate singing voices from music accompaniment. We also examine the separation result when a binary time-frequency mask is applied. Evaluations on the MIR-1K dataset show that this method achieves around 1 to 1.4 dB higher GNSDR (global normalized source-to-distortion ratio) than two state-of-the-art approaches, without prior training or particular features.
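The low-rank plus sparse idea in the abstract can be illustrated with a minimal sketch. It assumes the input is a magnitude spectrogram (frequency bins by time frames) and solves the usual principal component pursuit problem, min ||L||_* + lam * ||S||_1 subject to L + S = M, with a simple inexact augmented Lagrangian loop. The solver choice, the default lam = 1/sqrt(max(rows, cols)), the step-size schedule, and the names `rpca`, `binary_mask`, and `gain` are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def shrink(x, tau):
    """Elementwise soft-thresholding (proximal operator of the l1 norm)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def svd_threshold(x, tau):
    """Singular-value thresholding (proximal operator of the nuclear norm)."""
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    return (u * shrink(s, tau)) @ vt


def rpca(m, lam=None, tol=1e-7, max_iter=500):
    """Split m into (low_rank, sparse) with low_rank + sparse ~= m.

    Assumed setup: inexact augmented Lagrangian iterations with a
    geometrically growing penalty mu (illustrative parameter choices).
    """
    m = np.asarray(m, dtype=float)
    norm_m = np.linalg.norm(m)  # Frobenius norm
    if norm_m == 0.0:
        return np.zeros_like(m), np.zeros_like(m)
    if lam is None:
        lam = 1.0 / np.sqrt(max(m.shape))
    mu = 1.25 / np.linalg.norm(m, 2)  # scaled by the spectral norm
    rho = 1.5                         # growth factor for mu
    y = np.zeros_like(m)              # Lagrange multiplier
    sparse = np.zeros_like(m)
    for _ in range(max_iter):
        low = svd_threshold(m - sparse + y / mu, 1.0 / mu)
        sparse = shrink(m - low + y / mu, lam / mu)
        resid = m - low - sparse
        y = y + mu * resid
        mu *= rho
        if np.linalg.norm(resid) <= tol * norm_m:
            break
    return low, sparse


def binary_mask(low_rank, sparse, gain=1.0):
    """1.0 where the sparse (voice) part dominates, 0.0 elsewhere."""
    return (np.abs(sparse) > gain * np.abs(low_rank)).astype(float)


# Usage on synthetic data: a rank-2 "accompaniment" plus a few large spikes.
rng = np.random.default_rng(0)
accomp = rng.standard_normal((40, 2)) @ rng.standard_normal((2, 50))
voice = np.zeros((40, 50))
voice[rng.random((40, 50)) < 0.05] = 5.0
mix = accomp + voice
low_rank, sparse_part = rpca(mix)
mask = binary_mask(low_rank, sparse_part)
```

In a separation pipeline the mask would be multiplied elementwise with the mixture's complex spectrogram to estimate the voice, and `1 - mask` would give the accompaniment; the spectrogram computation and inversion are left out of this sketch.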

DOI: 10.1109/ICASSP.2012.6287816

Cite this paper

@inproceedings{Huang2012SingingvoiceSF,
  title={Singing-voice separation from monaural recordings using robust principal component analysis},
  author={Po-Sen Huang and Scott Deeann Chen and Paris Smaragdis and Mark Hasegawa-Johnson},
  booktitle={2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2012},
  pages={57--60}
}