Polyphonic transcription by non-negative sparse coding of power spectra

Abstract

We present a system for adaptive spectral basis decomposition that learns to identify independent spectral features given a sequence of short-term Fourier spectra. When applied to recordings of polyphonic piano music, the individual notes are identified as salient features, and hence each short-term spectrum is decomposed into a sum of note spectra; the resulting encoding can be used as a basis for polyphonic transcription. The system is based on a probabilistic model equivalent to a form of noisy independent component analysis (ICA) or sparse coding with non-negativity constraints. We introduce a novel modification to this model that recognises that a short-term Fourier spectrum can be thought of as a noisy realisation of the power spectral density of an underlying Gaussian process, where the noise is essentially multiplicative and non-Gaussian. Results are presented for an analysis of a live recording of polyphonic piano music.
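The decomposition described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm. It swaps in the well-known multiplicative updates for the Itakura-Saito divergence, which corresponds to assuming that each power-spectrum bin equals the underlying power spectral density times unit-mean exponential noise (one common multiplicative, non-Gaussian noise model, assumed here). It adds an L1 penalty on the activations as a simple stand-in for the sparse prior. The function name `sparse_nn_decompose`, its parameters, and the synthetic two-note data are all hypothetical.

```python
import numpy as np

def sparse_nn_decompose(X, n_atoms, n_iter=200, sparsity=0.1, eps=1e-9, seed=0):
    """Approximate non-negative power spectra X (bins x frames) by A @ S.

    A holds non-negative spectral atoms (columns); S holds non-negative,
    sparsity-penalised activations. Illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    n_bins, n_frames = X.shape
    A = rng.random((n_bins, n_atoms)) + eps
    S = rng.random((n_atoms, n_frames)) + eps
    for _ in range(n_iter):
        V = A @ S + eps
        # Itakura-Saito multiplicative update for the activations; the L1
        # penalty weight appears as an additive term in the denominator.
        S *= (A.T @ (X / V**2)) / (A.T @ (1.0 / V) + sparsity)
        V = A @ S + eps
        # Itakura-Saito multiplicative update for the spectral atoms.
        A *= ((X / V**2) @ S.T) / ((1.0 / V) @ S.T + eps)
        # Normalise each atom to unit sum and move the scale into S,
        # so the penalty cannot be evaded by inflating A.
        norms = A.sum(axis=0, keepdims=True) + eps
        A /= norms
        S *= norms.T
    return A, S

# Synthetic example: two hypothetical "notes", each with three spectral peaks.
rng = np.random.default_rng(1)
true_A = np.zeros((64, 2))
true_A[[5, 10, 15], 0] = 1.0
true_A[[8, 16, 24], 1] = 1.0
# Sparse activations: each note is active in roughly 30% of frames.
true_S = rng.random((2, 100)) * (rng.random((2, 100)) < 0.3)
psd = true_A @ true_S + 1e-3
# Multiplicative, non-Gaussian noise: unit-mean exponential per bin.
X = psd * rng.exponential(1.0, size=psd.shape)

A, S = sparse_nn_decompose(X, n_atoms=2)
```

In this sketch each column of `A` plays the role of a learned note spectrum and each row of `S` indicates when that note is active, which is the kind of encoding the abstract proposes as a basis for transcription. The penalty weight and iteration count are arbitrary choices and would need tuning on real recordings.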

Cite this paper

@inproceedings{Abdallah2004PolyphonicTB,
  title={Polyphonic transcription by non-negative sparse coding of power spectra},
  author={Samer M. Abdallah and Mark D. Plumbley},
  booktitle={ISMIR},
  year={2004}
}