Suppose that you are at a music festival checking out an artist, and you would like to quickly learn about the song being played (e.g., title, lyrics, album). If you have a smartphone, you could record a sample of the live performance and compare it against a database of existing recordings from the artist. Services such as Shazam or SoundHound ...
Multi-pitch analysis of concurrent sound sources is an important but challenging problem. It requires estimating pitch values of all harmonic sources in individual frames and streaming the pitch estimates into trajectories, each of which corresponds to a source. We address the streaming problem for monophonic sound sources. We take the original audio, plus ...
The Philips audio fingerprint [1] has been used for years, but its robustness against external noise has not been studied rigorously. This paper shows that the Philips fingerprint is noise resistant and can recognize music corrupted by noise at a signal-to-noise ratio of -4 to -7 dB. In addition, the drawbacks of the Philips fingerprint are ...
High-level knowledge of language helps the human auditory system understand speech with missing information such as missing frequency bands. The automatic speech recognition community has shown that the use of this knowledge in the form of language models is crucial to obtaining high quality recognition results. In this paper, we apply this idea to the ...
We propose a new approach for automatic melody extraction from polyphonic audio, based on Probabilistic Latent Component Analysis (PLCA). An audio signal is first divided into vocal and non-vocal segments using a trained Gaussian Mixture Model (GMM) classifier. A statistical model of the non-vocal segments of the signal is then learned adaptively from this ...
Missing data in corrupted audio recordings poses a challenging problem for audio signal processing. In this paper we present an approach that allows us to estimate missing values in the time-frequency domain of audio signals. The proposed approach, based on the Non-negative Hidden Markov Model, enables more temporally coherent estimation for the missing ...
Given a set of monophonic, harmonic sound sources (e.g. human voices or wind instruments), multi-pitch estimation (MPE) is the task of determining the instantaneous pitches of each source. Multi-pitch tracking (MPT) connects the instantaneous pitch estimates provided by MPE algorithms into pitch trajectories of sources. A trajectory can be short (within a ...
In mixtures of musical sounds, the problem of overlapped harmonics poses a significant challenge to source separation. Common Amplitude Modulation (CAM) is one of the most effective methods to resolve this problem. However, it relies on non-overlapped harmonics from the same note being available. We propose an alternative technique for harmonic envelope ...