Text segmentation and topic tracking on broadcast news via a hidden Markov model approach

Abstract

Continuing progress in the automatic transcription of broadcast speech via speech recognition has raised the possibility of applying information retrieval techniques to the resulting (errorful) text. In this paper we describe a general methodology based on Hidden Markov Models and classical language modeling techniques for automatically inferring story boundaries (segmentation) and for retrieving stories relating to a specific topic (tracking). We present in detail the features and performance of the Segmentation and Tracking systems submitted by Dragon Systems for the 1998 Topic Detection and Tracking evaluation.

Semantic Scholar estimates that this publication has 63 citations based on the available data.

Cite this paper

@inproceedings{Mulbregt1998TextSA,
  title={Text segmentation and topic tracking on broadcast news via a hidden Markov model approach},
  author={Paul van Mulbregt and Ira Carp and Larry Gillick and Steve Lowe and Jon Yamron},
  booktitle={ICSLP},
  year={1998}
}