Polyphonic piano note transcription with non-negative matrix factorization of differential spectrogram


Automatic music transcription is usually approached by using a time-frequency (TF) representation such as the short-time Fourier transform (STFT) spectrogram or the constant-Q transform. In this paper, we propose a novel yet simple TF representation that capitalizes on the effectiveness of spectral flux features in highlighting note onset times. We refer to this representation as the differential spectrogram and investigate its usefulness for note-level piano transcription using two different non-negative matrix factorization (NMF) algorithms. Experiments on the MAPS ENSTDkCl dataset validate the advantages of the differential spectrogram over the STFT spectrogram for this task. Moreover, by adapting a state-of-the-art convolutional NMF algorithm to the differential spectrogram, we achieve even better accuracy than the state of the art on this dataset. Our analysis shows that the new representation suppresses unwanted TF patterns and performs particularly well in improving the recall rate.
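
As a rough illustration of the idea, the sketch below builds a differential spectrogram from a magnitude spectrogram and factorizes it with a basic NMF. It rests on assumptions not stated in the abstract: the differential spectrogram is taken here to be the half-wave rectified frame-to-frame difference of the magnitude spectrogram (the usual ingredient of spectral flux), and the factorization is a plain multiplicative-update NMF with the Kullback-Leibler divergence, not the convolutional algorithm the paper adapts. The function names `differential_spectrogram` and `nmf_kl` and the toy input are hypothetical.

```python
import numpy as np


def differential_spectrogram(S):
    """Half-wave rectified temporal difference of a magnitude spectrogram.

    S has shape (n_bins, n_frames). Assumed definition, not taken from the paper:
    D[f, t] = max(S[f, t] - S[f, t - 1], 0), with the first frame set to zero.
    """
    S = np.asarray(S, dtype=float)
    D = np.zeros_like(S)
    D[:, 1:] = np.maximum(S[:, 1:] - S[:, :-1], 0.0)
    return D


def nmf_kl(V, n_components, n_iter=200, eps=1e-10, seed=0):
    """Basic NMF, V ~= W @ H, using multiplicative updates for the KL divergence."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], n_components)) + eps
    H = rng.random((n_components, V.shape[1])) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        # Update activations: H <- H * (W^T (V / WH)) / (W^T 1)
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        # Update spectral templates: W <- W * ((V / WH) H^T) / (1 H^T)
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H


# Toy magnitude spectrogram: two decaying "notes" in different frequency bins.
S = np.zeros((8, 12))
S[2, 3:8] = [1.0, 0.8, 0.6, 0.4, 0.2]   # note 1 starts at frame 3
S[5, 6:10] = [1.0, 0.7, 0.5, 0.3]       # note 2 starts at frame 6

D = differential_spectrogram(S)          # energy remains only where it rises
W, H = nmf_kl(D, n_components=2)         # templates W, onset activations H
```

In this toy input the decaying tails produce only negative differences, so rectification leaves energy at the two onset frames alone. That is the intuition behind using such a representation to sharpen onset activations, although the paper's actual definition and processing chain may differ.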

DOI: 10.1109/ICASSP.2017.7952164