Sparse Representations in Audio and Music: From Coding to Source Separation
@article{Plumbley2010SparseRI,
  title={Sparse Representations in Audio and Music: From Coding to Source Separation},
  author={Mark D. Plumbley and Thomas Blumensath and Laurent Daudet and R{\'e}mi Gribonval and Mike E. Davies},
  journal={Proceedings of the IEEE},
  year={2010},
  volume={98},
  pages={995--1005}
}
Sparse representations have proved a powerful tool in the analysis and processing of audio signals and already lie at the heart of popular coding standards such as MP3 and Dolby AAC. In this paper we give an overview of a number of current and emerging applications of sparse representations in areas from audio coding, audio enhancement and music transcription to blind source separation solutions that can solve the "cocktail party problem." In each case we will show how the prior assumption that…
236 Citations
Informed Audio Source Separation from Compressed Linear Stereo Mixtures
- Computer Science
- Semantic Audio
- 2011
This paper uses an MPEG-AAC codec and shows that the ISS process is quite robust to compression, opening the way for "real-world" karaoke/soloing/remixing applications for downloadable music.
Methods of Single-Channel Music Source Separation
- Art
- 2010
Music source separation refers to the process of recovering original music sources from a mixture of two or more musical sound sources. Although music source separation is important even when the…
Sparse denoising of audio by greedy time-frequency shrinkage
- Computer Science, Engineering
- 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2014
This work presents an analysis of MP in the context of audio denoising, by interpreting the algorithm as a simple shrinkage approach, and proposes several approaches to improve its performance and robustness.
Audio Denoising by Generalized Time-Frequency Thresholding
- Computer Science
- 2012
In audio processing, different collections of windowed Fourier or cosine bases have proven to serve as well adapted dictionaries for most audio signals of relevance for humans, in particular speech and music.
Sparse and structured decomposition of audio signals on hybrid dictionaries using musical priors.
- Computer Science
- The Journal of the Acoustical Society of America
- 2013
Evaluation on monophonic and complex polyphonic excerpts of real music signals shows that the proposed approach provides results whose quality measured by the signal-to-noise ratio is competitive with state-of-the-art approaches, and more coherent with the semantic content of the signal.
A Multichannel Audio Denoising Formulation Based on Spectral Sparsity
- Computer Science
- IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2015
This work considers the estimation of an audio source from multiple noisy observations, where the correlation between the noise in the different observations is low, and proposes a two-stage method that assumes the signal of interest has a sparse time-frequency representation.
"Sparsification" of Audio Signals Using the MDCT/IntMDCT and a Psychoacoustic Model - Application to Informed Audio Source Separation
- Computer Science
- Semantic Audio
- 2011
This paper revisits the irrelevance filtering analysis-synthesis approach, applies the sparsification process to the informed source separation (ISS) problem, and shows that it significantly decreases the computational cost at the ISS decoder.
Lossy audio signal compression via structured sparse decomposition and compressed sensing
- Computer Science
- 2014 IEEE International Conference on Multimedia and Expo (ICME)
- 2014
A least absolute shrinkage and selection operator (LASSO) is employed to perform a sparse and structured decomposition of the audio signals into tonal and transient layers, and both resulting layers are then compressed by a CS method.
Investigating the Potential of Pseudo Quadrature Mirror Filter-Banks in Music Source Separation Tasks
- Computer Science
- ArXiv
- 2017
This work investigates the potential of an optimized pseudo quadrature mirror filter-bank (PQMF) as a T-F representation for music source separation tasks and suggests that the PQMF maintains the aforementioned desirable properties and can be regarded as an alternative for representing mixtures of musical signals.
An overview of informed audio source separation
- Computer Science
- 2013 14th International Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS)
- 2013
In recent years, much research has focused on informed separation, which consists in using additional available information about the sources to improve the separation quality.
References
Audio Signal Representations for Indexing in the Transform Domain
- Computer Science
- IEEE Transactions on Audio, Speech, and Language Processing
- 2010
This new audio codec allows efficient transform-domain audio indexing for three different applications, namely beat tracking, chord recognition, and musical genre classification, and is compared with the two standard MP3 and AAC codecs in terms of performance and computation time.
Sparse Overcomplete Decomposition for Single Channel Speaker Separation
- Computer Science
- 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07
- 2007
An algorithm is proposed for separating multiple speakers from a mixed single-channel recording, based on a model proposed by Raj and Smaragdis (2005) and a probabilistic framework to achieve sparsity.
Low Bit-Rate Object Coding of Musical Audio Using Bayesian Harmonic Models
- Computer Science
- IEEE Transactions on Audio, Speech, and Language Processing
- 2007
This work proposes a family of probabilistic signal models combining learned object priors and various perceptually motivated distortion measures for very low bit-rate coding purposes, designs efficient algorithms to infer object parameters, and builds a coder based on the interpolation of frequency and amplitude parameters.
Audio source separation with a single sensor
- Computer Science
- IEEE Transactions on Audio, Speech, and Language Processing
- 2006
This paper addresses the problem of audio source separation with one single sensor, using a statistical model of the sources, based on a learning step from samples of each source separately, during which Gaussian scaled mixture models (GSMM) are trained.
Union of MDCT Bases for Audio Coding
- Computer Science
- IEEE Transactions on Audio, Speech, and Language Processing
- 2008
This paper investigates the use of sparse overcomplete decompositions for audio coding by using a bitplane encoding approach, which provides a fine-grain scalable coder that can seamlessly operate from very low bitrates up to transparency.
Underdetermined blind source separation using sparse representations
- Computer Science
- Signal Process.
- 2001
Object Coding of Harmonic Sounds Using Sparse and Structured Representations
- Computer Science
- 2007
A novel object-based coding is presented, which allows the computation of objects in a reasonable computational time, and appears to perform better than transform and parametric coders on solo or duo of harmonic instruments at 8 kbit/s and 2 kbit/s.
Blind separation of dependent sources using the "time-frequency ratio of mixtures" approach
- Computer Science
- Seventh International Symposium on Signal Processing and Its Applications, 2003. Proceedings.
- 2003
The principles of the TIFROM approach are recalled and it is shown that, unlike independent component analysis methods, this approach can separate dependent signals, provided there exist some areas in the time-frequency plane where only one source occurs.