A new method to detect words that are likely to be confused by speech recognition systems is presented in this letter. A new dissimilarity measure between two words is calculated in two steps. First, the phonetic transcriptions of the words are aligned using only phonetic information. Two kinds of alignments are used: either with or without insertions and …
Voice imitation is one of the potential threats to security systems that use automatic speaker recognition. Since prosodic features have been considered for state-of-the-art recognition systems in recent years, the question arises as to how vulnerable these features are to voice mimicking. In this study, two experiments are conducted for twelve individual …
In this paper, we address the modality integration issue on the example of a smart room environment aiming at enabling person identification by combining acoustic features and 2D face images. First, we introduce the monomodal audio and video identification techniques and then we present the use of combined input speech and face images for person …
Prosody plays an important role in the human recognition process; therefore, prosodic elements are normally used by impersonators aiming to resemble someone else. Since such voice imitation is one of the potential threats to security systems relying on automatic speaker recognition, and prosodic features have been considered for state-of-the-art recognition …
Jacobian Adaptation (JA) has been successfully used in Automatic Speech Recognition (ASR) systems to adapt the acoustic models from the training to the testing noise conditions. In this work we present an improvement of JA for speaker verification, where a specific training noise reference is estimated for each speaker model. The new proposal, which …