Mickael Rouvier

Learn More
This paper presents the LIUM open-source speaker diarization toolbox, mostly dedicated to broadcast news. This tool includes both Hierarchical Agglomerative Clustering using well-known measures such as BIC and CLR, and the new ILP clustering algorithm using i-vectors. Diarization systems are tested on the French evaluation data from ESTER, ETAPE and REPERE(More)
In this paper, we propose a new clustering model for speaker diarization. A major problem with using greedy agglomerative hierarchical clustering for speaker diariza-tion is that they do not guarantee an optimal solution. We propose a new clustering model, by redefining clustering as a problem of Integer Linear Programming (ILP). Thus an ILP solver can be(More)
This paper describes a multi-modal person recognition system for video broadcast developed for participating in the Defi-Repere challenge. The main track of this challenge targets the identification of all persons occurring in a video either in the audio modality (speakers) or the image modality (faces). This system is developed by the PERCOL team involving(More)
We propose to study speaker diarization from a collection of audio documents. The goal is to detect speakers appearing in several shows. In our approach, each show of the collection is processed separately before being processed collectively , to group speakers involved in several shows. Two clustering methods are studied for the overall processing of the(More)
Statistical classifiers operate on features that generally include both useful and useless information. These two types of information are difficult to separate in the feature domain. Recently, a new paradigm based on a Latent Factor Analysis (LFA) proposed a model decomposition into usefull and useless components. This method was successfully applied to(More)
Deep neural networks (DNN) are currently very successful for acoustic modeling in ASR systems. One of the main challenges with DNNs is unsupervised speaker adaptation from an initial speaker clustering, because DNNs have a very large number of parameters. Recently, a method has been proposed to adapt DNNs to speakers by combining speaker-specific(More)
Video genre classification is a challenging task in a global context of fast growing video collections available on the Internet. This paper presents a new method for video genre identification by audio analysis. Our approach relies on the combination of low and high level audio features. We investigate the discrimi-native capacity of features related to(More)
Video genre identification methods are frequently based on image or motion analysis, which are relatively time-consuming processes. Since such approaches are tractable by batch processing, as-soon-as-possible identification requires faster methods. In this paper, we investigate the use of audio-only methods for on-the-fly video classification. We propose to(More)