In this paper, we propose an efficient model for automatic transcription of polyphonic music. The model extends the shift-invariant probabilistic latent component analysis method and uses pre-extracted and pre-shifted note templates from multiple instruments. Thus, the proposed system can efficiently transcribe polyphonic music, while taking into account...
In this paper, an efficient, general-purpose model for multiple instrument polyphonic music transcription is proposed. The model is based on probabilistic latent component analysis and supports the use of sound state spectral templates, which represent the temporal evolution of each note (e.g. attack, sustain, decay). As input, a variable-Q transform (VQT)...
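As a rough illustration of the template-based idea behind probabilistic latent component analysis (PLCA), the sketch below estimates per-frame pitch activations from a magnitude spectrogram when the spectral templates are fixed in advance. It is a deliberately simplified variant, not the model proposed in the paper: it has no sound states, no shift-invariance, and no VQT front end. The function name `plca_activations` and the toy templates are assumptions made for this example.

```python
import numpy as np

def plca_activations(V, W, n_iter=50, eps=1e-12):
    """Estimate P(p|t) for fixed templates P(f|p) with EM.

    V: (F, T) non-negative spectrogram.
    W: (F, P) spectral templates, one column per pitch (hypothetical input).
    Returns H: (P, T), each column summing to 1.
    """
    V = np.asarray(V, dtype=float)
    W = np.asarray(W, dtype=float)
    # Normalise each template so it is a distribution over frequency.
    W = W / (W.sum(axis=0, keepdims=True) + eps)
    n_pitches, n_frames = W.shape[1], V.shape[1]
    # Start from uniform activations.
    H = np.full((n_pitches, n_frames), 1.0 / n_pitches)
    for _ in range(n_iter):
        approx = W @ H + eps  # model of P(f|t), shape (F, T)
        # E-step and M-step folded into one multiplicative update:
        # H[p,t] <- H[p,t] * sum_f W[f,p] * V[f,t] / approx[f,t]
        H *= W.T @ (V / approx)
        # Renormalise so each frame's activations sum to 1.
        H /= H.sum(axis=0, keepdims=True) + eps
    return H

# Toy usage: two templates, two frames, each frame containing one template.
W_toy = np.array([[0.8, 0.0],
                  [0.2, 0.2],
                  [0.0, 0.8]])
V_toy = W_toy.copy()  # frame 0 is template 0, frame 1 is template 1
H_toy = plca_activations(V_toy, W_toy)
print(np.argmax(H_toy, axis=0))
```

Keeping the templates fixed means only the activations are updated, which is what makes this kind of approach cheap at transcription time.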
Multimodal interfaces can open up new possibilities for music education, where the traditional model of teaching is based predominantly on verbal feedback. This paper explores the development and use of multimodal interfaces in novel tools to support music practice training. The design of multimodal interfaces for music education presents a challenge in...
In this paper, a method for multiple-instrument automatic music transcription is proposed that models the temporal evolution and duration of tones. The proposed model supports the use of spectral templates per pitch and instrument which correspond to sound states such as attack, sustain, and decay. Pitch-wise explicit duration hidden Markov models (EDHMMs)...
Automatic transcription of polyphonic music has been an active research field for several years and is considered by many to be a key enabling technology in music signal processing. However, current transcription approaches either focus on detecting pitched sounds (from pitched musical instruments) or on detecting unpitched sounds (from drum kits). In this...
Understanding how we relate and compare pieces of music has been a topic of great interest in musicology as well as for business applications, such as music recommender systems. The way music is compared seems to vary among both individuals and cultures. Adapting a generic model to user ratings is useful for personalisation and can help to better understand...
We present a novel method of classification and segmentation of melodies in symbolic representation. The method is based on filtering pitch as a signal over time with the Haar wavelet, and we evaluate it on two tasks. The filtered signal corresponds to a single-scale signal w_s from the continuous Haar wavelet transform. The melodies are first segmented...
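To make the single-scale Haar filtering concrete, here is a minimal sketch that treats a melody as a uniformly sampled pitch signal and correlates it with a discrete Haar wavelet of one chosen scale. The sampling scheme (MIDI pitch per time step), the edge padding, and the names `haar_kernel` and `haar_filter` are assumptions for illustration; the abstract does not specify these details, and the segmentation step is not reproduced here.

```python
import numpy as np

def haar_kernel(scale):
    """Discrete Haar wavelet spanning `scale` samples (even, >= 2)."""
    if scale < 2 or scale % 2:
        raise ValueError("scale must be an even integer >= 2")
    half = scale // 2
    # +1 on the first half, -1 on the second half, scaled by 1/sqrt(scale).
    kernel = np.concatenate([np.ones(half), -np.ones(half)])
    return kernel / np.sqrt(scale)

def haar_filter(pitch, scale):
    """Correlate a pitch sequence with a Haar wavelet at a single scale.

    Edges are padded by repeating the first and last pitch so that the
    output has the same length as the input and flat regions map to 0.
    """
    pitch = np.asarray(pitch, dtype=float)
    half = scale // 2
    padded = np.pad(pitch, (half, half - 1), mode="edge")
    return np.correlate(padded, haar_kernel(scale), mode="valid")

# Toy usage: a melody that holds C4 (60) and then jumps to G4 (67).
melody = [60] * 8 + [67] * 8
coeffs = haar_filter(melody, scale=4)
print(coeffs)
```

With this sign convention a rising pitch step gives negative coefficients and a constant pitch gives zeros, so the filtered signal responds to melodic movement rather than to absolute pitch height.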
We investigate the problem of incorporating higher-level symbolic score-like information into Automatic Music Transcription (AMT) systems to improve their performance. We use recurrent neural networks (RNNs) and their variants as music language models (MLMs) and present a generative architecture for combining these models with predictions from a frame level...