The ICSI Meeting Corpus
TLDR
A corpus of natural meetings recorded at the International Computer Science Institute in Berkeley, California over the last three years has been collected; it supports work in automatic speech recognition, noise robustness, dialog modeling, prosody, rich transcription, information retrieval, and more.
Acoustic Beamforming for Speaker Diarization of Meetings
TLDR
Classic acoustic beamforming techniques are combined with several novel algorithms to create a complete frontend for speaker diarization in the meeting room domain; the frontend also shows improvements in a speech recognition task.
The ICSI RT07s Speaker Diarization System
TLDR
This paper uses the most recent available version of the beamforming toolkit, implements a new speech/non-speech detector that does not require models trained on meeting data, and carries out development on a much larger set of recordings.
A robust speaker clustering algorithm
TLDR
The algorithm automatically performs both speaker segmentation and clustering without any prior knowledge of the identities or the number of speakers and has the following advantages: no threshold adjustment requirements; no need for training/development data; and robustness to different data conditions.
Building a Large Lexical Databank Which Provides Deep Semantics
TLDR
The database will show the semantic and syntactic combinatorial possibilities (based on frame membership) of the lexical items it includes, as these are documented through grammatical and semantic annotations of sentences extracted from a large corpus of contemporary written English.
Speech Recognition for Illiterate Access to Information and Technology
TLDR
This paper presents an inexpensive approach for gathering the linguistic resources needed to power a simple spoken dialog system and addresses the unique social and economic challenges of the developing world by relying on modifiable and highly transparent software and hardware.
Towards Robust Speaker Segmentation: The ICSI-SRI Fall 2004 Diarization System
TLDR
The ICSI-SRI system is an agglomerative clustering system that uses a BIC-like measure to determine when to stop merging clusters and to decide which pairs of clusters to merge, providing robustness and portability.
Robust speaker diarization for meetings: ICSI RT06s evaluation system
TLDR
Four of the main improvements to the ICSI speaker diarization system submitted for the NIST Rich Transcription evaluation (RT06s) in the meetings environment are introduced, including a new training-free speech/non-speech detection algorithm, a new algorithm for system initialization, and a frame purification algorithm to increase cluster differentiability.
Speaker Diarization For Multiple-Distant-Microphone Meetings Using Several Sources of Information
TLDR
The correlation between signals coming from multiple microphones is analyzed and an improved method for carrying out speaker diarization for meetings with multiple distant microphones is proposed, improving the Diarization Error Rate (DER) by 15% to 20% relative to previous systems.