Text Independent Speaker Identification Using Automatic Acoustic Segmentation


This paper describes an acoustic-class-dependent technique for text-independent speaker identification on very short utterances. The technique is based on maximum likelihood estimation of a Gaussian mixture model representation of speaker identity. Gaussian mixtures are noted for their robustness as a parametric model and for their ability to form smooth estimates of rather arbitrary underlying densities. Speaker model parameters are estimated using a special case of the iterative Expectation-Maximization (EM) algorithm [4], and a number of techniques are investigated for improving model robustness. The system was evaluated on a population of 12 reference speakers drawn from a conversational speech database, and achieved 89% average text-independent speaker identification performance for a test utterance length of 1 second.
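The general idea of the abstract (one Gaussian mixture model per speaker, trained with EM, with a test utterance assigned to the speaker whose model gives the highest likelihood) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the diagonal covariances, the variance floor, the random initialization, the two-component models, the synthetic 2-D "feature frames", and all function names. The acoustic segmentation and the robustness techniques the paper investigates are not modeled.

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(vals):
    """Numerically stable log(sum(exp(v))) over a list of log values."""
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

def component_logs(x, gmm):
    """Log of (weight * component density) for each mixture component."""
    return [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]

def train_gmm(frames, n_comp=2, n_iter=20, var_floor=1e-3, seed=0):
    """Fit a diagonal GMM by EM. Returns a list of (weight, mean, var)."""
    rng = random.Random(seed)
    dim = len(frames[0])
    # Assumed initialization: random training frames as means, unit variances.
    gmm = [(1.0 / n_comp, list(m), [1.0] * dim)
           for m in rng.sample(frames, n_comp)]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each frame.
        resp = []
        for x in frames:
            logs = component_logs(x, gmm)
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and variances.
        new_gmm = []
        for k in range(n_comp):
            nk = max(sum(r[k] for r in resp), 1e-10)
            mean = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                    for d in range(dim)]
            # Variance floor (an assumption) keeps components from collapsing.
            var = [max(sum(r[k] * (x[d] - mean[d]) ** 2
                           for r, x in zip(resp, frames)) / nk, var_floor)
                   for d in range(dim)]
            new_gmm.append((max(nk / len(frames), 1e-10), mean, var))
        gmm = new_gmm
    return gmm

def avg_log_likelihood(frames, gmm):
    """Average per-frame log likelihood of an utterance under one model."""
    return sum(logsumexp(component_logs(x, gmm)) for x in frames) / len(frames)

def identify(frames, models):
    """Return the name of the speaker model with the highest likelihood."""
    return max(models, key=lambda name: avg_log_likelihood(frames, models[name]))

def toy_frames(rng, centers, n):
    """Synthetic 2-D frames scattered around a speaker's cluster centers."""
    frames = []
    for _ in range(n):
        cx, cy = rng.choice(centers)
        frames.append([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)])
    return frames

# Hypothetical two-speaker demo on synthetic data (not the paper's database).
rng = random.Random(1)
centers = {"spk_a": [(0, 0), (3, 3)], "spk_b": [(8, 0), (11, 3)]}
models = {name: train_gmm(toy_frames(rng, c, 200)) for name, c in centers.items()}
test_utt = toy_frames(rng, centers["spk_b"], 50)
print(identify(test_utt, models))
```

Averaging the log likelihood over frames keeps scores comparable across utterance lengths, which matters when test utterances are as short as the 1-second case described above. A real system would use spectral feature vectors extracted from speech rather than synthetic points.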

Cite this paper

@inproceedings{Rose2004TextIS, title={Text Independent Speaker Identification Using Automatic Acoustic Segmentation}, author={Richard C. Rose and Douglas A. Reynolds}, year={2004} }