This article presents several techniques for combining Support Vector Machines (SVMs) with the Joint Factor Analysis (JFA) model for speaker verification. In this combination, the SVMs are applied to different sources of information produced by the JFA: the Gaussian Mixture Model supervectors and the speaker and common factors. We found…
We propose a novel design for acoustic feature-based automatic spoken language recognizers. Our design is inspired by recent advances in text-independent speaker recognition, where intra-class variability is modeled by factor analysis in Gaussian mixture model (GMM) space. We use approximations to GMM-likelihoods which allow variable-length data sequences…
This paper describes the acoustic language recognition subsystems of Brno University of Technology (BUT) which contributed to the BUT main submission to the NIST LRE 2007. Two main techniques are employed in the subsystems: discriminative training using Maximum Mutual Information, and channel compensation using eigenchannel adaptation in both,…
This paper summarizes the BUT-AGNITIO system for the NIST Language Recognition Evaluation 2009. The post-evaluation analysis aimed mainly at improving the quality of the data (fixing language label problems and detecting overlapping speakers in the training and development sets) and at investigating different compositions of the development set. The paper…
This paper describes the Brno University of Technology (BUT) system for the 2007 NIST Language Recognition Evaluation (LRE). The system is a fusion of 4 acoustic and 9 phonotactic subsystems. We have investigated several new topics, such as discriminatively trained language models in phonotactic systems and eigenchannel adaptation in the model and feature domains in…
This paper presents the BUT system submitted to the NIST 2008 SRE. It includes two subsystems based on Joint Factor Analysis (JFA) GMM/UBM and one based on SVM-GMM. The systems were developed on NIST SRE 2006 data, and results are presented on the NIST SRE 2008 evaluation data. We concentrate on the influence of side information in the calibration.
In this paper, we investigate JFA as used for speaker recognition. First, we performed a systematic comparison of the full JFA with its simplified variants and confirmed the superior performance of the full JFA with both eigenchannels and eigenvoices. We investigated the sensitivity of JFA to the number of eigenvoices, both for the full one and simplified…
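For orientation, the "full" JFA model referred to above is commonly written as a decomposition of the speaker- and session-dependent GMM mean supervector. The notation below follows the standard formulation from the literature and is an assumption here; the truncated abstract does not show the paper's own symbols.

```latex
\[
  \mathbf{M} \;=\; \mathbf{m} \;+\; \mathbf{V}\mathbf{y} \;+\; \mathbf{U}\mathbf{x} \;+\; \mathbf{D}\mathbf{z}
\]
% M : speaker- and session-dependent mean supervector
% m : speaker-independent (UBM) mean supervector
% V : low-rank eigenvoice matrix,    y : speaker factors
% U : low-rank eigenchannel matrix,  x : channel factors
% D : diagonal residual matrix,      z : speaker-specific residual factors
```

In this notation, a simplified variant is obtained by dropping one of the low-rank terms, for example omitting $\mathbf{V}\mathbf{y}$ to keep only eigenchannel compensation.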
Gender and age estimation based on Gaussian Mixture Models (GMM) is introduced. Telephone recordings from the Czech SpeechDat-East database are used as the training and test data sets. Mel-Frequency Cepstral Coefficients (MFCC) are extracted from the speech recordings. To estimate the GMMs' parameters, Maximum Likelihood (ML) training is applied. Consequently…
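The abstract is cut off before the decision rule is stated. A common way to use per-class GMMs, assumed here rather than taken from the paper, is to score the feature frames of a recording against each class model and pick the class with the highest average log-likelihood. The minimal Python sketch below illustrates that rule with diagonal-covariance GMMs; the function names, the two-dimensional "features", and all model parameters are hypothetical toy values, not trained on SpeechDat-East, and the ML (EM) training step is not shown.

```python
import math


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            # log of weight times a diagonal Gaussian density
            log_p = math.log(w)
            for xi, mi, vi in zip(x, mu, var):
                log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(log_p)
        # log-sum-exp over mixture components for numerical stability
        m = max(comp_logs)
        total += m + math.log(sum(math.exp(c - m) for c in comp_logs))
    return total / len(frames)


def classify(frames, models):
    """Return the best-scoring class label and all class scores."""
    scores = {
        label: gmm_log_likelihood(frames, *params)
        for label, params in models.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy models: (weights, means, variances) per class.
models = {
    "female": ([0.5, 0.5], [[2.0, 2.0], [3.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "male": ([0.5, 0.5], [[-2.0, -2.0], [-3.0, -1.0]], [[1.0, 1.0], [1.0, 1.0]]),
}

# Stand-in for MFCC frames of one recording (2-D for readability).
frames = [[2.1, 1.8], [2.9, 1.2], [2.5, 1.5]]
label, scores = classify(frames, models)
print(label, scores)
```

In a real system the frames would be MFCC vectors and the model parameters would come from ML training on labeled recordings; the same scoring function could be extended to age classes by adding one model per class.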