We present "Transcriber", a tool for assisting in the creation of speech corpora, and describe some aspects of its development and use. Transcriber was designed for the manual segmentation and transcription of long-duration broadcast news recordings, including annotation of speech turns, topics and acoustic conditions. It is highly portable, relying on the ...
This paper describes recent advances in speaker diarization with a multistage segmentation and clustering system, which incorporates a speaker identification step. This system builds upon the baseline audio partitioner used in the LIMSI broadcast news transcription system. The baseline partitioner provides a high cluster purity, but has a tendency to split ...
Acoustic speaker diarization is investigated for situations where a collection of shows from the same source needs to be processed. In this case, the same speaker should receive the same label across all shows. We compare different architectures for cross-show speaker diarization: the obvious concatenation of all shows, a hybrid system combining first a ...
Transcriber is a tool for manual annotation of large speech files. It was originally designed for the broadcast news transcription task. The annotation file format was derived from previous formats used for this task, and many related features were hard-coded. In this paper we present a generalization of the tool based on the annotation graph formalism, and ...
This paper describes the first version of "Transcriber", a tool for segmenting, labeling and transcribing speech. It is developed under Unix in the Tcl/Tk scripting language with extensions in C, and is available as free software. The environment offers the basic functions necessary for segmenting, labeling and transcribing long-duration signals. The signal ...
One particularly difficult challenge for cross-channel MLLR (CMLLR) are two widely-used techniques for speaker introduced in the 2005 and 2006 NIST Speaker Recognition Evaluations, where training uses telephone speech and verification uses speech from multiple auxiliary comparable to that obtained with cepstral features. This paper describes a new feature ...
We propose an approach for unsupervised speaker identification in TV broadcast videos, by combining acoustic speaker diarization with person names obtained via video OCR from overlaid texts. Three methods for the propagation of the overlaid names to the speech turns are compared, taking into account the co-occurrence duration between the speaker clusters and ...
The LIMSI RT-07S speaker diarization system for conference and lecture meetings is presented in this paper. This system builds upon the RT-06S diarization system designed for lecture data. The baseline system combines agglomerative clustering based on the Bayesian information criterion (BIC) with a second clustering using state-of-the-art speaker ...