JFA for speaker recognition with random digit strings


In this paper, we examine the use of Joint Factor Analysis (JFA) methods on RSR2015 digits. A tied-mixture model is used to segment the utterances into digits, while JFA and a Joint Density model are deployed as the feature extractor and the backend, respectively. A novel approach for digit-dependent fusion of UBM-component log-likelihood ratios is introduced, yielding the best results so far. The fusion of five different JFA features gives an equal-error rate of 3.6%, compared to 6.3% attained by a baseline GMM-UBM model with score normalization.
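The equal-error rate quoted above is the operating point at which the false-acceptance and false-rejection rates coincide. The following is a minimal sketch of how such a figure can be estimated from trial scores; it is not the authors' evaluation code. The function name `equal_error_rate`, the threshold sweep over observed scores, and the toy score lists are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Hypothetical helper for illustration: a trial is accepted when its
    score is greater than or equal to the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection: a target (same-speaker) trial scored below t.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: a non-target trial scored at or above t.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (made up, not taken from the paper).
targets = [2.0, 1.5, 0.3, 1.1]
nontargets = [-1.0, 0.5, -0.2, 0.1]
print(equal_error_rate(targets, nontargets))  # prints 0.25
```

With finite trial lists the two error curves rarely cross exactly, so the sketch reports the average of the two rates at the closest threshold; evaluation toolkits typically interpolate instead.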

Cite this paper

@inproceedings{Stafylakis2015JFAFS, title={JFA for speaker recognition with random digit strings}, author={Themos Stafylakis and Patrick Kenny and Md. Jahangir Alam and Marcel Kockmann}, booktitle={INTERSPEECH}, year={2015} }