An End-to-End Neural Network for Polyphonic Piano Music Transcription
We propose a new generative model for polyphonic music based on nonlinear Independent Subspace Analysis (ISA) and factorial Hidden Markov Models (HMMs). ISA represents chord spectra as sums of note power spectra, and note spectra as sums of instrument-dependent log-power spectra. The HMM models note duration. Instrument-dependent parameters are learnt on solo excerpts and then used to transcribe musical recordings as collections of notes with time-varying power and other descriptive parameters such as vibrato. We support the relevance of our modeling assumptions by comparing them with the true data distributions and by obtaining satisfactory transcriptions of two duo recordings.
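The two-level spectral structure described in the abstract can be illustrated with a minimal sketch. It assumes that a note's log-power spectrum is a log-energy term plus a mean log-spectrum plus a weighted sum of variation components, and that a chord's power spectrum is the bin-wise sum of its notes' power spectra. All function names, the four-bin spectra, and the numeric values are hypothetical and chosen only for illustration. Noise terms, learnt parameters, and the HMM duration model are omitted.

```python
import math

def note_power_spectrum(log_energy, mean_log_spectrum, variation_components, weights):
    """Power spectrum of one note, built as exp of a sum of log-power spectra.

    Hypothetical parameterisation: log_energy shifts the whole spectrum,
    mean_log_spectrum is the instrument-dependent average shape, and each
    variation component is scaled by its weight before being added.
    """
    log_power = []
    for f in range(len(mean_log_spectrum)):
        value = log_energy + mean_log_spectrum[f]
        for weight, component in zip(weights, variation_components):
            value += weight * component[f]
        log_power.append(value)
    # Back to the power domain (the nonlinear step).
    return [math.exp(v) for v in log_power]

def chord_power_spectrum(note_spectra):
    """Power spectrum of a chord: bin-wise sum of the note power spectra."""
    return [sum(bins) for bins in zip(*note_spectra)]

# Toy instrument-dependent spectra over four frequency bins (made-up values).
mean_a = [0.0, -1.0, -2.0, -3.0]
mean_b = [-3.0, 0.0, -1.0, -2.0]
tilt = [0.5, 0.0, -0.5, -1.0]  # one hypothetical variation component

note_a = note_power_spectrum(0.0, mean_a, [tilt], [0.2])
note_b = note_power_spectrum(-1.0, mean_b, [tilt], [0.0])
chord = chord_power_spectrum([note_a, note_b])
```

Summing in the log-power domain within a note and in the power domain across notes is what makes the overall model nonlinear: the chord spectrum is not a linear function of the per-note weights.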