From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks

Mohammad Esmaeilpour, Patrick Cardinal and Alessandro Lameiras Koerich
This paper investigates the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial robustness of a victim residual convolutional neural network, namely ResNet-18. The main motivation for focusing on such a front-end classifier rather than on more complex architectures is to balance recognition accuracy against the total number of training parameters. Herein, the authors measure the impact of different settings required for…

RSD-GAN: Regularized Sobolev Defense GAN Against Speech-to-Text Adversarial Attacks

Results from numerous experiments on the victim DeepSpeech, Kaldi, and Lingvo speech transcription systems corroborate the remarkable performance of the defense approach against a comprehensive range of targeted and non-targeted adversarial attacks.

Environmental Sound Classification using Hybrid Ensemble Model

A Robust Approach for Securing Audio Classification Against Adversarial Attacks

A novel approach based on a pre-processed DWT representation of audio signals and an SVM classifier is proposed to secure audio systems against adversarial attacks; it shows competitive performance compared to deep neural networks in terms of both accuracy and robustness against strong adversarial attacks.
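As a hedged illustration of the DWT front end described above (the specific wavelet, decomposition depth, and preprocessing used by the paper are not reproduced here), a single-level Haar transform can be sketched in NumPy:

```python
import numpy as np

def haar_dwt(signal):
    """Single-level Haar DWT: returns approximation and detail coefficients."""
    x = np.asarray(signal, dtype=float)
    if x.size % 2:  # zero-pad odd-length input to an even length
        x = np.append(x, 0.0)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # low-pass (smooth) part
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # high-pass (difference) part
    return approx, detail
```

In a pipeline of this kind, such coefficients (rather than raw waveforms) would be fed to the SVM; treating Haar as the wavelet is an assumption for illustration only.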

Adversarially Training for Audio Classifiers

The ResNet-56 model trained on the 2D representation of the discrete wavelet transform appended with the tonnetz chromagram outperforms other models in recognition accuracy, and adversarial training has a positive impact on this model as well as on other deep architectures against six types of attack algorithms, at the cost of reduced recognition accuracy and a limited adversarial perturbation budget.

Cross-Representation Transferability of Adversarial Attacks: From Spectrograms to Audio Waveforms

This paper shows the susceptibility of spectrogram-based audio classifiers to adversarial attacks and the transferability of such attacks to audio waveforms, and how such attacks produce perturbed spectrograms that are visually imperceptible to humans.
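A minimal sketch of the kind of spectrogram perturbation involved, assuming an FGSM-style step (the actual attack algorithms and budgets studied in the paper may differ):

```python
import numpy as np

def fgsm_spectrogram(spec, loss_grad, eps=0.01):
    # One FGSM-style step: shift every time-frequency bin by eps in the
    # direction of the loss gradient's sign. The per-bin change is bounded
    # by eps, which keeps the perturbed spectrogram visually close to the
    # original while still misleading the classifier.
    return spec + eps * np.sign(loss_grad)
```

Transferring such a perturbed spectrogram back to an audio waveform additionally requires phase reconstruction (e.g. Griffin-Lim), which is omitted in this sketch.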

Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification

It is shown that the improved performance stems from the combination of a deep, high-capacity model and an augmented training set: this combination outperforms both the proposed CNN without augmentation and a “shallow” dictionary learning model with augmentation.
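As a hedged sketch of waveform-level augmentation in the same spirit (the paper's actual deformations, such as time stretching and pitch shifting, are not reproduced here), one label-preserving augmentation is additive noise at a target SNR followed by a random gain:

```python
import numpy as np

def augment(wave, rng, snr_db=20.0):
    # Add white Gaussian noise at the requested signal-to-noise ratio,
    # then apply a random gain; both transformations preserve the class
    # label of the clip while diversifying the training set.
    sig_power = np.mean(np.square(wave))
    noise_power = sig_power / (10.0 ** (snr_db / 10.0))
    noisy = wave + rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return noisy * rng.uniform(0.8, 1.2)
```

The SNR and gain ranges here are illustrative assumptions, not the settings used in the paper.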

Towards Evaluating the Robustness of Neural Networks

It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that succeed on both distilled and undistilled neural networks with 100% probability.

Adversarial Defense via Learning to Generate Diverse Attacks

This work proposes a recursive and stochastic generator that produces much stronger and more diverse perturbations that comprehensively reveal the vulnerability of the target classifier.

Mic2Mic: using cycle-consistent generative adversarial networks to overcome microphone variability in speech systems

This work proposes Mic2Mic, a machine-learned system component that resides in the inference pipeline of audio models and reduces, in real time, the variability in audio data caused by microphone-specific factors.

Adversarial Attacks and Defenses Against Deep Neural Networks: A Survey