RawBoost: A Raw Data Boosting and Augmentation Method applied to Automatic Speaker Verification Anti-Spoofing

@article{Tak2021RawBoostAR,
  title={RawBoost: A Raw Data Boosting and Augmentation Method applied to Automatic Speaker Verification Anti-Spoofing},
  author={Hemlata Tak and Madhu R. Kamble and Jose Patino and Massimiliano Todisco and Nicholas W. D. Evans},
  journal={ArXiv},
  year={2021},
  volume={abs/2111.04433}
}
This paper introduces RawBoost, a data boosting and augmentation method for the design of more reliable spoofing detection solutions which operate directly upon raw waveform inputs. While RawBoost requires no additional data sources, e.g. noise recordings or impulse responses, and is data, application and model agnostic, it is designed for telephony scenarios. Based upon the combination of linear and non-linear convolutive noise, impulsive signal-dependent additive noise and stationary signal…
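
As a rough illustration of what "impulsive signal-dependent additive noise" on a raw waveform could look like, the sketch below perturbs a random subset of samples by an amount proportional to each sample's own amplitude. It is not the authors' implementation: the function name and the `fraction`, `max_gain` and `seed` parameters are hypothetical, and the exact formulation used by RawBoost is not given in the abstract above.

```python
import random


def add_impulsive_signal_dependent_noise(samples, fraction=0.1, max_gain=0.5, seed=None):
    """Return a copy of `samples` with noise added at a random subset of positions.

    Assumptions (illustrative only, not taken from the paper):
      - `fraction` is the share of samples that receive an impulse,
      - each impulse is sign * gain * sample, with gain drawn from [0, max_gain].
    """
    rng = random.Random(seed)
    n = len(samples)
    count = int(round(fraction * n))
    # Impulsive: only a sparse, randomly chosen set of positions is disturbed.
    positions = rng.sample(range(n), count)
    noisy = list(samples)
    for p in positions:
        # Signal-dependent: the noise amplitude scales with the sample itself.
        sign = rng.choice((-1.0, 1.0))
        gain = rng.uniform(0.0, max_gain)
        noisy[p] = samples[p] + sign * gain * samples[p]
    return noisy


if __name__ == "__main__":
    waveform = [0.0, 0.5, -0.5, 1.0, -1.0, 0.25, 0.0, 0.75, -0.25, 0.1]
    print(add_impulsive_signal_dependent_noise(waveform, fraction=0.3, seed=0))
```

Because the perturbation is derived from the waveform itself, a sketch like this needs no external noise recordings, which is in line with the abstract's statement that RawBoost requires no additional data sources.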

Automatic speaker verification spoofing and deepfake detection using wav2vec 2.0 and data augmentation
TLDR
This paper reports on efforts to use self-supervised learning in the form of a wav2vec 2.0 front-end with fine tuning to obtain the lowest equal error rates reported in the literature for both the ASVspoof 2021 Logical Access and Deepfake databases.
Investigating Active-learning-based Training Data Selection for Speech Spoofing Countermeasure
TLDR
Compared with a top-line CM that simply used the whole data pool set for training, the AL-based CMs achieved similar performance using less training data, and no single best configuration was found for AL.

References

SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
TLDR
This work presents SpecAugment, a simple data augmentation method for speech recognition that is applied directly to the feature inputs of a neural network (i.e., filter bank coefficients) and achieves state-of-the-art performance on the LibriSpeech 960h and Switchboard 300h tasks, outperforming all prior work.
Data Augmenting Contrastive Learning of Speech Representations in the Time Domain
TLDR
WavAugment is introduced, a time-domain data augmentation library adapted and optimized for the specificities of CPC (raw waveform input, contrastive loss, past versus future structure), and it is found that applying augmentation only to the segments from which the CPC prediction is performed yields better results.
End-to-End anti-spoofing with RawNet2
TLDR
Modifications made to the original RawNet2 architecture are described so that it can be applied to anti-spoofing and these results are reproducible with open source software.
ASVspoof 2021: Automatic Speaker Verification Spoofing and Countermeasures Challenge Evaluation Plan
TLDR
The task is to develop a bona fide/spoofed classifier (spoofing countermeasure) for speech data; the organisers rank and analyse the results and present a summary at an INTERSPEECH 2021 satellite workshop.
ASVspoof 2021: accelerating progress in spoofed and deepfake speech detection
  • J. Yamagishi, Xin Wang, H. Delgado
  • Computer Science
    2021 Edition of the Automatic Speaker Verification and Spoofing Countermeasures Challenge
  • 2021
TLDR
Results for the physical access task show the difficulty in detecting attacks in real, variable physical spaces, and the introduction of channel and compression variability which compound the difficulty.
An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems
TLDR
This paper conducts a cross-dataset study on several state-of-the-art CM systems and observes significant performance degradation compared with their single-dataset performance, and hypothesizes that channel mismatch among these datasets is one important reason.
CRIM's System Description for the ASVSpoof2021 Challenge
TLDR
The results show that using codec augmentation, activation ensemble and HOSP technique can help the system to be more robust against trials with adversarial conditions, and further improvement could be made by performing score-level fusion among different systems.
Data Augmentation with Signal Companding for Detection of Logical Access Attacks
TLDR
It is found that the proposed data augmentation technique based on signal companding outperforms the state-of-the-art spoofing countermeasures, showing an ability to handle the unknown nature of attacks.
End-to-End Spectro-Temporal Graph Attention Networks for Speaker Verification Anti-Spoofing and Speech Deepfake Detection
TLDR
It is shown that better performance can be achieved when the fusion is performed within the model itself and when the representation is learned automatically from raw waveform inputs.
Known-unknown Data Augmentation Strategies for Detection of Logical Access, Physical Access and Speech Deepfake Attacks: ASVspoof 2021
  • Rohan Kumar Das
  • Computer Science
    2021 Edition of the Automatic Speaker Verification and Spoofing Countermeasures Challenge
  • 2021
TLDR
This work considers a few data augmentation methods to have a robust spoofing countermeasure based on the known information from the challenge evaluation protocol and with some unknown approaches that can be useful for each of the three tracks.