Gautham J. Mysore

Supervised and semi-supervised source separation algorithms based on non-negative matrix factorization have been shown to be quite effective. However, they require isolated training examples of one or more sources, which are often difficult to obtain. This limits the practical applicability of these algorithms. We examine the problem of efficiently utilizing …
In this paper we present a novel approach for isolating and removing sounds from dense monophonic mixtures. The approach is user-based, and requires the presentation of a guide sound that mimics the desired target the user wishes to extract. The guide sound can simply be produced by a user by vocalizing or otherwise replicating the target sound marked for …
In recent years, there has been a great deal of work in modeling audio using non-negative matrix factorization and its probabilistic counterparts as they yield rich models that are very useful for source separation and automatic music transcription. Given a sound source, these algorithms learn a dictionary of spectral vectors to best explain it. This …
We present an algorithm based on probabilistic latent component analysis and employ it for relative pitch estimation of multiple instruments in polyphonic music. A multilayered positive deconvolution is performed concurrently on mixture constant-Q transforms to obtain a relative pitch track and timbral signature for each instrument. Initial experimental …
Source separation models that make use of nonnegativity in their parameters have been gaining increasing popularity in the last few years, spawning a significant number of publications on the topic. Although these techniques are conceptually similar to other matrix decompositions, they are surprisingly more effective in extracting perceptually meaningful …
We propose a method to both identify and synchronize multi-camera video recordings within a large collection of video and/or audio files. Landmark-based audio fingerprinting is used to match multiple recordings of the same event together and time-synchronize each file within the groups. Compared to prior work, we offer improvements towards event …
A musical score provides a great deal of information about a piece of music. In this paper we consider the incorporation of a music score to guide source separation on a single channel recording. We propose a method based on synthesizing lines of music in the score. Dynamic time warping (DTW) is used to fit the synthesized data to the recording. These …
In this work, we investigate a method for score-informed source separation using Probabilistic Latent Component Analysis (PLCA). We present extensive test results that give an indication of the performance of the method, its strengths and weaknesses. For this purpose, we created a test database that has been made available to the public, in order to …
We present a semi-supervised source separation methodology to denoise speech by modeling speech as one source and noise as the other source. We model speech using the recently proposed non-negative hidden Markov model, which uses multiple non-negative dictionaries and a Markov chain to jointly model spectral structure and temporal dynamics of speech. We …
Non-negative spectrogram factorization algorithms such as probabilistic latent component analysis (PLCA) have been shown to be quite powerful for source separation. When training data for all of the sources are available, it is trivial to learn their dictionaries beforehand and perform supervised source separation in an online fashion. However, in many …
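
Several of the abstracts above describe the same supervised recipe: learn a non-negative dictionary per source from isolated training data, then hold the dictionaries fixed and explain the mixture with them. The following is a minimal sketch of that idea, not the method of any of the listed papers. It swaps in plain KL-divergence NMF with multiplicative updates as a stand-in for PLCA, runs in batch rather than online, and uses synthetic toy "spectrograms" whose two sources occupy disjoint frequency bands so that the result is easy to check. All names (`kl_nmf`, `separate`, `toy_spectrogram`) and all sizes are assumptions made for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def kl_nmf(V, n_components, W=None, n_iter=200, seed=0):
    """Factor V ~= W @ H with KL-divergence multiplicative updates.

    If a dictionary W is passed in, it is held fixed and only the
    activations H are estimated (n_components is then ignored).
    """
    rng = np.random.default_rng(seed)
    learn_W = W is None
    if learn_W:
        W = rng.random((V.shape[0], n_components)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + EPS))) / (W.sum(axis=0)[:, None] + EPS)
        if learn_W:
            W *= ((V / (W @ H + EPS)) @ H.T) / (H.sum(axis=1)[None, :] + EPS)
    return W, H


def separate(V_mix, W_a, W_b, n_iter=200):
    """Split a mixture using two fixed, pre-trained dictionaries."""
    W = np.hstack([W_a, W_b])
    _, H = kl_nmf(V_mix, W.shape[1], W=W, n_iter=n_iter)
    k = W_a.shape[1]
    A = W_a @ H[:k]          # model of source a
    B = W_b @ H[k:]          # model of source b
    total = A + B + EPS
    # Soft masks distribute the mixture energy between the two sources.
    return V_mix * A / total, V_mix * B / total


# Toy data (hypothetical): source a lives in bins 0-9, source b in bins 10-19.
rng = np.random.default_rng(1)
n_freq, n_frames, k = 20, 50, 3
T_a = np.zeros((n_freq, k))
T_a[:10] = rng.random((10, k)) + 0.1
T_b = np.zeros((n_freq, k))
T_b[10:] = rng.random((10, k)) + 0.1


def toy_spectrogram(templates):
    """Random non-negative activations applied to fixed spectral templates."""
    return templates @ (rng.random((templates.shape[1], n_frames)) + 0.1)


# Training: learn one dictionary per source from isolated examples.
W_a, _ = kl_nmf(toy_spectrogram(T_a), k)
W_b, _ = kl_nmf(toy_spectrogram(T_b), k)

# Separation: new, unseen signals from the same sources, mixed together.
S_a, S_b = toy_spectrogram(T_a), toy_spectrogram(T_b)
V_mix = S_a + S_b
est_a, est_b = separate(V_mix, W_a, W_b)
```

The toy sources are trivially separable by construction. Real magnitude spectrograms overlap in frequency, which is where dictionary size, the amount of training data, and the temporal modeling discussed in the abstracts start to matter.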