Learn More
An unsupervised learning algorithm for the separation of sound sources in one-channel music signals is presented. The algorithm is based on factorizing the magnitude spectrogram of an input signal into a sum of components, each of which has a fixed magnitude spectrum and a time-varying gain. Each sound source, in turn, is modeled as a sum of one or more(More)
This paper proposes to use exemplar-based sparse representations for noise robust automatic speech recognition. First, we describe how speech can be modeled as a linear combination of a small number of exemplars from a large speech exemplar dictionary. The exemplars are time-frequency patches of real speech, each spanning multiple time frames. We then(More)
We introduce TUT Acoustic Scenes 2016 database for environmental sound research, consisting of binaural recordings from 15 different acoustic environments. A subset of this database, called TUT Sound Events 2016, contains annotations for individual sound events, specifically created for sound event detection. TUT Sound Events 2016 consists of residential(More)
We describe the underlying probabilistic generative signal model of non-negative matrix factorisation (NMF) and propose a realistic conjugate priors on the matrices to be estimated. A conjugate Gamma chain prior enables modelling the spectral smoothness of natural sounds in general, and other prior knowledge about the spectra of the sounds can be used(More)
This paper proposes a computationally efficient algorithm for estimating the non-negative weights of linear combinations of the atoms of large-scale audio dictionaries, so that the generalized Kullback-Leibler divergence between an audio observation and the model is minimized. This linear model has been found useful in many audio signal processing tasks,(More)
Sound source separation refers to the task of estimating the signals produced by individual sound sources from a complex acoustic mixture. It has several applications, since monophonic signals can be processed more efficiently and flexibly than polyphonic mixtures. This thesis deals with the separation of monaural, or, one-channel music recordings. We(More)
This paper presents a procedure for the separation of pitched musical instruments and drums from polyphonic music. The method is based on two-stage processing in which the input signal is first separated into elementary time-frequency components which are then organized into sound sources. Non-negative matrix factorization (NMF) is used to separate the(More)
In this paper, an approach for the separation of harmonic sounds is described. The overall system consists of three components. Sinusoidal modeling is first used to analyze the mixed signal and to obtain the frequencies and amplitudes of sinusoidal spectral components. Then a new method is proposed for the calculation of the perceptual distance between(More)
The work presented in this article studies how the context information can be used in the automatic sound event detection process, and how the detection system can benefit from such information. Humans are using context information to make more accurate predictions about the sound events and ruling out unlikely events given the context. We propose a similar(More)