Corpus ID: 16413356

Detecting Converted Speech and Natural Speech for anti-Spoofing Attack in Speaker Recognition

@inproceedings{Wu2012DetectingCS,
  title={Detecting Converted Speech and Natural Speech for anti-Spoofing Attack in Speaker Recognition},
  author={Zhizheng Wu and Chng Eng Siong and Haizhou Li},
  booktitle={INTERSPEECH},
  year={2012}
}
Voice conversion techniques present a threat to speaker verification systems. [...] Key Method: The features are tested under three different training situations of the converted speech detector: a) only Gaussian mixture model (GMM) based converted speech data are available; b) only unit-selection based converted speech data are available; c) no converted speech data are available for training the converted speech model. Experiments conducted on the National Institute of Standards and Technology (NIST) 2006 speaker…
Preventing converted speech spoofing attacks in speaker verification
  • M. J. Correia, A. Abad, I. Trancoso
  • Computer Science
  • 2014 37th International Convention on Information and Communication Technology, Electronics and Microelectronics (MIPRO)
  • 2014
TLDR
This paper evaluates the vulnerability of a state-of-the-art SV system against a converted speech spoofing attack, and proposes a new converted speech detector using a compact feature representation and a discriminative modeling approach.
Identification of Reconstructed Speech
TLDR
Extensive experimental results show that EERs lower than 1% can be achieved by the proposed algorithm in most cases, outperforming the reported state-of-the-art identification techniques.
Spectral Features for Synthetic Speech Detection
TLDR
The results show that the countermeasures based on the proposed features outperform other spectral features for both known and unknown attacks.
Anti-spoofing: Speaker verification vs. voice conversion
Voice conversion (VC) techniques, which modify a speaker’s voice to sound like another’s, present a threat to automatic speaker verification (SV) systems. In this paper, we evaluate the vulnerability…
Exploiting magnitude and phase spectral information for converted speech detection
TLDR
This work focuses on the development of a new converted speech detector able to robustly address the problem of vulnerable speaker verification systems, and uses a subset of the core task of the NIST SRE2006 corpus as the natural data.
Detecting synthetic speech using long term magnitude and phase information
TLDR
Experimental results show that long-term information up to 0.3 s is important for synthetic speech detection, and that the high-dimensional log magnitude spectrum features significantly outperform the low-dimensional MFCC features, showing that it is important to retain the detailed spectral information for detecting synthetic speech.
Synthetic speech detection using temporal modulation feature
TLDR
From the synthetic speech detection results, the modulation features provide complementary information to magnitude/phase features, and the best detection performance is obtained by fusing phase modulation features and phase features, yielding an equal error rate.
A cross-vocoder study of speaker independent synthetic speech detection using phase information
TLDR
The feasibility of a universal synthetic speech detector using a statistical classifier for human and synthetic speech is explored using a variety of state-of-the-art minimum-phase vocoders, creating imposter synthetic signals by copy-synthesis.
Classifiers for synthetic speech detection: a comparison
TLDR
Five different classifiers used in speaker recognition to detect synthetic speech are compared, and it is shown that support vector machines with generalized linear discriminant kernel (GLDS-SVM) yield the best performance on the development set, with an EER of 0.12%, whereas a Gaussian mixture model (GMM) trained using the maximum likelihood (ML) criterion is superior on the evaluation set.
Toward a Universal Synthetic Speech Spoofing Detection Using Phase Information
TLDR
This paper presents a synthetic speech detector that can be connected at the front-end or at the back-end of a standard SV system, and that will protect it from spoofing attacks coming from state-of-the-art statistical Text to Speech (TTS) systems.

References

SHOWING 1-10 OF 22 REFERENCES
Detection of synthetic speech for the problem of imposture
TLDR
An HMM-based speech synthesizer is used, which creates synthetic speech for a targeted speaker through adaptation of a background model, and both GMM-UBM and support vector machine (SVM) SV systems are used, reducing the vulnerability of a speaker verification (SV) system to synthetic speech.
Evaluation of the Vulnerability of Speaker Verification to Synthetic Speech
TLDR
While SV systems have impressive accuracy, even with the proposed detector, high-quality synthetic speech can lead to an unacceptably high acceptance rate of synthetic speakers.
Vulnerability of speaker verification systems against voice conversion spoofing attacks: The case of telephone speech
TLDR
Experiments on a subset of the NIST 2006 SRE corpus indicate that the JFA method is the most resilient against conversion attacks, but even it experiences a more than 5-fold increase in the false acceptance rate.
Spectral voice conversion for text-to-speech synthesis
  • A. Kain, Michael W. Macon
  • Computer Science
  • Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98
  • 1998
TLDR
A new voice conversion algorithm that modifies a source speaker's speech to sound as if produced by a target speaker is presented and is found to perform more reliably for small training sets than a previous approach.
A robust speaker verification system against imposture using an HMM-based speech synthesis system
TLDR
Experimental results show that the false acceptance rate for synthetic speech was reduced drastically without a significant increase in the false acceptance and rejection rates for natural speech.
Continuous probabilistic transform for voice conversion
TLDR
A new methodology is designed for representing the relationship between two sets of spectral envelopes, and the proposed transform greatly improves the quality and naturalness of the converted speech signals compared with previously proposed conversion methods.
Is voice transformation a threat to speaker identification?
TLDR
Experimental results showed that current standard voice transformation techniques are able to fool the GMM-based system but not the phonetic speaker identification system, implying that future speaker identification systems should include idiosyncratic knowledge in order to successfully distinguish transformed speech from natural speech and thus be armed against imposter attacks.
Imposture using synthetic speech against speaker verification based on spectrum and pitch
TLDR
Experimental results show that pitch information is not necessarily useful for rejection of synthetic speech, and that techniques must be developed to discriminate synthetic speech from natural speech.
On the security of HMM-based speaker verification systems against imposture using synthetic speech
TLDR
Experimental results show that false acceptance rates for synthetic speech reached over 70% by training the synthesis system using only 1 sentence from each customer, and that the current security of HMM-based speaker verification systems against synthetic speech is inadequate.
Significance of the Modified Group Delay Feature in Speech Recognition
TLDR
The group delay function is modified to overcome the short-time spectral structure of speech owing to zeros that are close to the unit circle in the z-plane and also due to pitch periodicity effects and is called the modified group delay feature (MODGDF).