Ran D. Zilca

This paper presents an overview of the architecture and algorithms implemented in IBM's text-independent speaker verification system developed for the 2002 NIST Speaker Recognition Evaluation, particularly for the 1-speaker detection task using cellular test data. We describe individual components including a Gaussianization front-end, cellular-codec…
The fine spectral structure related to pitch information is conveyed in Mel cepstral features, with variations in pitch causing variations in the features. For speaker recognition systems, this phenomenon, known as “pitch mismatch” between training and testing, can increase error rates. Likewise, pitch-related variability may potentially…
Although the last decade has witnessed mounting research on the development and evaluation of positive interventions, investigators still know little about the target population of such interventions: happiness seekers. The present research asked three questions about happiness seekers: (1) What are their general characteristics?, (2) What do they…
The use of feature vectors obtained by concatenating different features for text-independent speaker identification from clean and telephone speech is studied. The composite feature vectors are examined with GMM and VQ models used to classify speakers. Linear discriminant analysis (LDA), a statistical tool designed to select a reduced set of features for…
This paper describes a computationally simple method to perform text-independent speaker verification using second-order statistics. The suggested method, called utterance level scoring (ULS), allows a normalized score to be obtained in a single pass through the frames of the tested utterance. The utterance sample covariance is first calculated and then…
The paper considers text-independent speaker identification over the telephone using short training and testing data. Gaussian Mixture Modeling (GMM) is used in the testing phase, but the parameters of the model are taken from clusters obtained for the training data by an adequate choice of feature vectors and a distance measure without optimization in the…
In large-scale deployments of speaker recognition systems, the potential for legacy problems increases, as the evolving technology may require configuration changes in the system, thus invalidating existing user voice accounts. Unless the entire database of original speech waveforms is stored, users need to re-enroll to keep their accounts functional,…