Automatic Music Transcription and Audio Source Separation


In this article, we give an overview of a range of approaches to the analysis and separation of musical audio. In particular, we consider the problems of automatic music transcription and audio source separation, which are of particular interest to our group. Monophonic music transcription, where a single note is present at one time, can be tackled using an autocorrelation-based method. For polyphonic music transcription, with several notes at any time, other approaches can be used, such as a blackboard model or a multiple-cause/sparse coding method. The latter is based on ideas and methods related to independent component analysis (ICA), a method for sound source separation.

Auditory Scene Analysis

Over the last decade or so, and particularly since the publication of Bregman's seminal book on auditory scene analysis (Bregman 1990), there has been increasing interest in the problem of Computational Auditory Scene Analysis (CASA): how to design computer-based models that can analyze an auditory scene. Imagine you are standing in a busy street among a crowd of people. You can hear traffic noise, the footsteps of people nearby, the bleeping of a pedestrian crossing, your mobile phone ringing, and colleagues behind you having a conversation. Despite all these different sound sources, you have a pretty good idea of what is going on around you. It is more than just a mess of overlapping noise, and if you try hard you can concentrate on one of these sources if it is important to you (such as the conversation behind you). Building computational models with this ability has proved very difficult: it requires both separation of the many sound sources and analysis of the content of those sources. However, a few authors have begun to tackle this problem in recent years, with some success (see e.g. Ellis 1996). One particular aspect of auditory scene analysis of interest to our group is automatic music transcription.
Here, the sound sources are one or more instruments playing a piece of music, and we wish to analyze this audio to identify which instruments are playing, and when and for how long each note is played. From this analysis we should then be able to produce a written musical score showing each note and its duration in conventional written music notation (for conventional western music, at least). In principle, this musical score could then be used to recreate the musical piece that was played. We …
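The autocorrelation-based approach to monophonic pitch detection mentioned above can be sketched as follows. This is a generic illustration of the technique rather than the authors' exact method; the function name, frame length, and search range are assumptions made for the example.

```python
import numpy as np

def estimate_pitch(signal, sample_rate, fmin=50.0, fmax=2000.0):
    """Estimate the fundamental frequency of a monophonic frame by
    locating the strongest autocorrelation peak at a non-zero lag."""
    x = signal - np.mean(signal)
    # Full autocorrelation; keep only non-negative lags.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    # Restrict the search to lags in the plausible pitch range.
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    lag = lag_min + np.argmax(ac[lag_min:lag_max])
    return sample_rate / lag

# Usage: a 440 Hz sine frame should yield an estimate near 440 Hz.
sr = 44100
t = np.arange(2048) / sr
f0 = estimate_pitch(np.sin(2 * np.pi * 440.0 * t), sr)
```

A longer analysis frame gives finer lag resolution (and hence pitch resolution) at the cost of time resolution, which is one reason polyphonic material, where several periodicities overlap, calls for the richer models discussed in the article.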

DOI: 10.1080/01969720290040777



Cite this paper

@article{Plumbley2002AutomaticMT,
  title   = {Automatic Music Transcription and Audio Source Separation},
  author  = {Mark D. Plumbley and Samer A. Abdallah and Juan Pablo Bello and Mike E. Davies and Giuliano Monti and Mark B. Sandler},
  journal = {Cybernetics and Systems},
  year    = {2002},
  volume  = {33},
  pages   = {603--627}
}