• Corpus ID: 252917558

Digital Audio Forensics: Blind Human Voice Mimicry Detection

Sahar Al Ajmi, Khizar Hayat, Alaa M. Al Obaidi, Naresh Kumar, Munaf Najmuldeen, Baptiste Magnier
Audio is one of the most widely used means of human communication, but it can also be easily misused to trick people. With the AI revolution, the related technologies are now accessible to almost everyone, making it simple for criminals to commit crimes and forgeries. In this work, we introduce a deep learning method to develop a classifier that blindly classifies an input audio as real or mimicked. The proposed model was trained on a set of important features extracted from…


Digital audio forensics: a first practical evaluation on microphone and environment classification

The results show that, for the test set, classification techniques, and steganalysis features used, microphones can be classified better than environments.

One-Class Learning Towards Synthetic Voice Spoofing Detection

This work proposes an anti-spoofing system that uses one-class learning to detect unknown synthetic voice spoofing attacks (i.e., text-to-speech or voice conversion); its reported equal error rate (EER) outperforms all existing single systems.

Digital multimedia audio forensics: past, present and future

An overview of the current state-of-the-art (SOA) in digital audio forensics is provided and some open research problems and future challenges in this active area of research are highlighted.

Prosody based voice forgery detection using SVM

A novel voice forgery detection system based on prosodic features using Support Vector Machines (SVM) is presented, illustrated on a database collected from professional mimicry artists.

Voice Impersonation Using Generative Adversarial Networks

  • Yang Gao, Rita Singh, B. Raj
  • Computer Science
    2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2018
A novel neural-network based speech quality- and style-mimicry framework for the synthesis of impersonated voices, built upon a fast and accurate generative adversarial network model that reframes the well-known problem of style-transfer for images as the problem for speech signals, while intrinsically addressing the problem of durational variability of speech sounds.

Voice Disguise and Automatic Detection: Review and Perspectives

This study focuses on the question of voice disguise and its detection and describes the main indicators proposed in the literature to distinguish a disguised voice from the original voice, and proposes some perspectives based on disordered and emotional speech.

You Can Hear But You Cannot Steal: Defending Against Voice Impersonation Attacks on Smartphones

  • Si Chen, K. Ren, Aziz Mohaisen
  • Computer Science
    2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS)
  • 2017
This paper proposes the design and implementation of a robust software-only voice impersonation defense system, which is tailored for mobile platforms and can be easily integrated with existing off-the-shelf smart devices.

Automatic speaker recognition as a measurement of voice imitation and conversion

The results obtained in the current experiments show that the identification error rate increases when testing with imitated voices, as well as when using converted voices, especially the cross-gender ones.

Detection of Speaker Characteristics Using Voice Imitation

This article examines whether three impersonators, two professional and one amateur, selected the same features and speaker characteristics when imitating the same target speakers, and whether they achieved similar degrees of success. It also gives an insight into how difficult it is to focus on only one or two features when trying to identify a speaker from their voice.

I-vectors meet imitators: on vulnerability of speaker verification systems against voice mimicry

This work studies the vulnerability of two well-known speaker recognition systems, a traditional Gaussian mixture model – universal background model (GMM-UBM) and a state-of-the-art i-vector classifier with cosine scoring, using data that consists of one professional Finnish imitator impersonating five well-known Finnish public figures.