Deepu Vijayasenan

Learn More
A speaker diarization system based on an information theoretic framework is described. The problem is formulated according to the information bottleneck (IB) principle. Unlike other approaches where the distance between speaker segments is arbitrarily introduced, the IB method seeks the partition that maximizes the mutual information between observations(More)
The speaker diarization task consists of inferring “who spoke when” in an audio stream without any prior knowledge and has been object of several NIST international evaluation campaigns is last years. A common trend for improving performances has been the use of several different feature streams as diverse as speaker location features, visual features or(More)
In this paper, we investigate the use of agglomerative Information Bottleneck (aIB) clustering for the speaker diarization task of meetings data. In contrary to the state-of-the-art diarization systems that models individual speakers with Gaussian Mixture Models, the proposed algorithm is completely non parametric . Both clustering and model selection(More)
This correspondence describes a novel system for speaker diarization of meetings recordings based on the combination of acoustic features (MFCC) and time delay of arrivals (TDOAS). The first part of the paper analyzes differences between MFCC and TDOA features which possess completely different statistical properties. When Gaussian mixture models are used,(More)
This paper describes Lipi Toolkit (LipiTk) - a generic toolkit whose aim is to facilitate development of online handwriting recognition engines for new scripts, and simplify integration of the resulting engines into real-world application contexts. The toolkit provides robust implementations of tools, algorithms, scripts and sample code necessary to support(More)
In this paper we address the combination of multiple feature streams in a fast speaker diarization system for meeting recordings. Whenever Multiple Distant Microphones (MDM) are used, it is possible to estimate the Time Delay of Arrival (TDOA) for different channels. In [9], it is shown that TDOA can be used as additional features together with conventional(More)
Speaker diarization for meetings data are recently converging towards multistream systems. The most common complementary features used in combination with MFCC are Time Delay of Arrival (TDOA). Also other features have been proposed although, there are no reported improvements on top of MFCC+TDOA systems. In this work we investigate the combination of other(More)
Speaker diarization of meetings recorded with Multiple Distant Microphones makes extensive use of multiple feature streams like MFCC and Time Delay of Arrivals (TDOA). Typically the combination happens using separate models for each feature stream. This work investigates if the combination of multiple feature streams can happen through the combination of(More)