Learn More
Suppose that you are at a music festival checking on an artist, and you would like to quickly know about the song that is being played (e.g., title, lyrics, album, etc.). If you have a smart-phone, you could record a sample of the live performance and compare it against a database of existing recordings from the artist. Services such as Shazam or SoundHound(More)
We propose a new approach for automatic melody extraction from polyphonic audio, based on Probabilistic Latent Component Analysis (PLCA). An audio signal is first divided into vocal and non-vocal segments using a trained Gaussian Mixture Model (GMM) classifier. A statistical model of the non-vocal segments of the signal is then learned adaptively from this(More)
Missing data in corrupted audio recordings poses a challenging problem for audio signal processing. In this paper we present an approach that allows us to estimate missing values in the time-frequency domain of audio signals. The proposed approach, based on the Non-negative Hidden Markov Model, enables more temporally coherent estimation for the missing(More)
Given a set of monophonic, harmonic sound sources (e.g. human voices or wind instruments), multi-pitch estimation (MPE) is the task of determining the instantaneous pitches of each source. Multi-pitch tracking (MPT) connects the instantaneous pitch estimates provided by MPE algorithms into pitch trajectories of sources. A trajectory can be short (within a(More)
Multi-pitch analysis of concurrent sound sources is an important but challenging problem. It requires estimating pitch values of all harmonic sources in individual frames and streaming the pitch estimates into trajectories, each of which corresponds to a source. We address the streaming problem for monophonic sound sources. We take the original audio, plus(More)
In mixtures of musical sounds, the problem of overlapped harmonics poses a significant challenge to source separation. Common Amplitude Modulation (CAM) is one of the most effective methods to resolve this problem. It, however, relies on non-overlapped harmonics from the same note being available. We propose an alternate technique for harmonic envelope(More)
The VocalSearch system is a music search engine developed at Northwestern University and available on the internet (vocalsearch.org). This system lets the user query for the desired song in a number of ways: sung queries, queries entered as music notation, and text-based lyrics search. Users are also able to contribute songs to the system, making them(More)
Grapheme-to-phoneme (G2P) conversion is an important component in TTS systems. The difficulty in Chinese G2P conversion is to disambiguate the polyphones. In this paper, we formulate the polyphone disambiguation problem into a classification problem and propose a language independent classifier based on maximum entropy to address the issue. Furthermore, we(More)