Ricard Marxer

The CHiME challenge series aims to advance far-field speech recognition technology by promoting research at the interface of signal processing and automatic speech recognition. This paper presents the design and outcomes of the 3rd CHiME Challenge, which targets the performance of automatic speech recognition in a real-world, commercially-motivated…
We introduce a generic model of the emergence of musical categories during the listening process. The model is based on a preprocessing module and a categorization module. Preprocessing results in a perceptually plausible representation of music events extracted from audio or symbolic input. The categorization module lets a taxonomy of musical entities emerge…
A causal system to represent a stream of music as musical events, and to generate further expected events, is presented. Starting from an auditory front-end which extracts low-level (i.e. MFCC) and mid-level features such as onsets and beats, an unsupervised clustering process builds and maintains a set of symbols aimed at representing the musical stream…
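
As an illustration of this kind of front-end and clustering stage, the following is a minimal sketch assuming librosa and scikit-learn; the input file name, feature summary, and cluster count are illustrative choices, not taken from the paper.

    # Minimal sketch of an MFCC/onset/beat front-end feeding an unsupervised
    # clustering stage that assigns symbols to music events. Parameters
    # (n_mfcc, number of clusters) are illustrative only.
    import librosa
    import numpy as np
    from sklearn.cluster import KMeans

    y, sr = librosa.load("example.wav", mono=True)      # hypothetical input file

    # Low-level features: MFCCs per analysis frame.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

    # Mid-level features: onset positions (frames) and beat frames.
    onsets = librosa.onset.onset_detect(y=y, sr=sr, units="frames")
    tempo, beats = librosa.beat.beat_track(y=y, sr=sr)

    # Segment the MFCC stream at onsets; summarise each event by its mean MFCC.
    bounds = np.concatenate([[0], onsets, [mfcc.shape[1]]])
    events = np.array([mfcc[:, a:b].mean(axis=1)
                       for a, b in zip(bounds[:-1], bounds[1:]) if b > a])

    # Unsupervised clustering: each cluster index acts as a symbol of the stream.
    n_clusters = min(8, len(events))
    symbols = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(events)
    print(symbols)                                      # symbol sequence for the piece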
This research focuses on the removal of the singing voice in polyphonic audio recordings under real-time constraints. It is based on time-frequency binary masks resulting from the combination of azimuth, phase-difference and absolute-frequency spectral bin classification with harmonic-derived masks. For the harmonic-derived masks, a pitch likelihood…
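
For context, a minimal sketch of a time-frequency binary mask driven by an azimuth-style cue (centre-panned content in a stereo mixture) is given below, assuming librosa and soundfile; the threshold, STFT settings, and file names are illustrative, and the paper combines several cues plus harmonic-derived masks rather than this single rule.

    # Minimal sketch of a binary mask that attenuates centre-panned bins
    # (a common azimuth cue for lead vocals). Threshold and STFT settings
    # are illustrative only.
    import librosa
    import numpy as np
    import soundfile as sf

    y, sr = librosa.load("mix_stereo.wav", mono=False)  # hypothetical stereo mix
    left, right = y[0], y[1]

    n_fft, hop = 2048, 512
    L = librosa.stft(left, n_fft=n_fft, hop_length=hop)
    R = librosa.stft(right, n_fft=n_fft, hop_length=hop)

    # Inter-channel level similarity: bins where |L| ~ |R| are likely centre-panned.
    eps = 1e-10
    similarity = np.minimum(np.abs(L), np.abs(R)) / (np.maximum(np.abs(L), np.abs(R)) + eps)
    vocal_like = similarity > 0.8                       # illustrative azimuth threshold

    # Binary mask keeping only the non-centre (accompaniment) bins.
    accompaniment = np.where(vocal_like, 0.0, 1.0)
    out_left = librosa.istft(L * accompaniment, hop_length=hop, length=len(left))
    out_right = librosa.istft(R * accompaniment, hop_length=hop, length=len(right))
    sf.write("karaoke_estimate.wav", np.stack([out_left, out_right], axis=1), sr)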
We report here on our submissions to different music classification tasks for the MIREX 2010 evaluations. These submissions are similar to those sent to MIREX 2009 (see [1]) in terms of the classifiers and the main audio features. However, we added high-level features (or semantic features), based on Support Vector Machine models of curated…
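
A minimal sketch of the general approach (summary audio features fed to an SVM classifier) follows, assuming librosa and scikit-learn; the feature summary, the tiny labelled file list, and the model settings are hypothetical placeholders, not the actual MIREX submission.

    # Minimal sketch of audio classification with summary features and an SVM,
    # in the spirit of MIREX-style submissions. Features and labels are
    # hypothetical; real systems train on full labelled collections.
    import librosa
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import SVC

    def summarize(path):
        """Mean and standard deviation of MFCCs over the whole track."""
        y, sr = librosa.load(path, mono=True)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
        return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

    # Hypothetical labelled training set: (audio path, class label) pairs.
    train = [("blues_01.wav", "blues"), ("rock_01.wav", "rock")]
    X = np.array([summarize(p) for p, _ in train])
    labels = [lab for _, lab in train]

    model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    model.fit(X, labels)

    # Outputs of such per-concept SVM models can in turn be used as
    # high-level "semantic" features for a second-stage classifier.
    print(model.predict([summarize("unknown.wav")]))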
The artful play with the listener's expectations is one of the supreme skills of a gifted musician. We present a system that analyzes an audio signal in an unsupervised manner in order to generate a musical representation of it on the fly. The system performs the task of next note prediction using the emerged representation. The main difference between our…
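
To give a flavour of next-event prediction over an emerged symbol sequence (e.g. the cluster indices produced by an unsupervised front-end), here is a minimal first-order Markov sketch; it is purely illustrative and not the paper's predictor, and the symbol stream is made up.

    # Minimal sketch of next-event prediction over a symbol sequence using a
    # first-order Markov model of observed transitions. Illustrative only.
    from collections import Counter, defaultdict

    def fit_transitions(symbols):
        """Count first-order transitions between consecutive symbols."""
        counts = defaultdict(Counter)
        for prev, nxt in zip(symbols[:-1], symbols[1:]):
            counts[prev][nxt] += 1
        return counts

    def predict_next(counts, current):
        """Return the most frequently observed successor of `current`, if any."""
        followers = counts.get(current)
        return followers.most_common(1)[0][0] if followers else None

    stream = [0, 2, 1, 0, 2, 1, 0, 3, 1, 0, 2]   # hypothetical symbol stream
    counts = fit_transitions(stream)
    print(predict_next(counts, current=0))        # expected continuation after symbol 0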
We present a method for lead instrument separation using an available musical score that may not be properly aligned with the polyphonic audio mixture. Improper alignment degrades the performance of existing score-informed source separation algorithms. Several techniques are proposed to manage local and global misalignments, such as a score information…
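
As background for the misalignment problem, a minimal sketch of aligning a rendered score to the mixture with chroma features and dynamic time warping is shown below, assuming librosa; the file names and hop size are hypothetical, and this is a generic alignment pre-step, not the specific techniques proposed in the paper.

    # Minimal sketch of handling global score/audio misalignment by aligning
    # chroma features with dynamic time warping (DTW) before using the score
    # to drive separation. Assumes a synthesized rendering of the score.
    import librosa

    audio, sr = librosa.load("mixture.wav", mono=True)   # hypothetical mixture
    score, _ = librosa.load("score_render.wav", sr=sr)   # hypothetical score rendering

    hop = 512
    chroma_audio = librosa.feature.chroma_cqt(y=audio, sr=sr, hop_length=hop)
    chroma_score = librosa.feature.chroma_cqt(y=score, sr=sr, hop_length=hop)

    # DTW returns a warping path mapping score frames to mixture frames.
    D, wp = librosa.sequence.dtw(X=chroma_score, Y=chroma_audio, metric="cosine")
    wp = wp[::-1]                                         # path in increasing time order

    # score_frame -> audio_frame correspondence, usable to warp the score's
    # note activations before score-informed separation.
    frame_map = {int(s): int(a) for s, a in wp}
    print(len(frame_map), "aligned score frames")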