Audio Impairment Recognition using a Correlation-Based Feature Representation

Alessandro Ragano, Emmanouil Benetos, Andrew Hines
2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX)
Audio impairment recognition is based on finding noise in audio files and categorising the impairment type. Recently, significant performance improvement has been obtained thanks to the usage of advanced deep learning models. However, feature robustness is still an unresolved issue and it is one of the main reasons why we need powerful deep learning architectures. In the presence of a variety of musical styles, handcrafted features are less efficient in capturing audio degradation… 

Development of a Speech Quality Database Under Uncontrolled Conditions
A methodology used to curate a speech quality database using the archive recordings from the Apollo Space Program is presented, and the results provide the necessary foundation to support the subsequent development of large-scale crowdsourced datasets for audio quality.
Audio Attacks and Defenses against AED Systems - A Practical Study
The robustness of multiple security-critical AED tasks, implemented as CNN classifiers, is tested, along with existing third-party Nest devices manufactured by Google, which run their own black-box deep learning models.


Supervised Classifiers for Audio Impairments with Noisy Labels
It is demonstrated that a CNN can generalize better on training data with a large number of noisy labels and gives remarkably higher test performance.
Timbre analysis of music audio signals with convolutional neural networks
One of the main goals of this work is to design efficient CNN architectures, which reduces the risk of these models over-fitting, since the CNNs' number of parameters is minimized.
Anomalous Sound Detection Using Deep Audio Representation and a BLSTM Network for Audio Surveillance of Roads
A framework based on a multiple-stage deep autoencoder network (DAN) is proposed to extract the deep audio representation (DAR), which fuses complementary information from several input features and thus can be more discriminative and robust than those input features.
Impulsive Disturbances in Audio Archives: Signal Classification for Automatic Restoration
A new algorithm is presented that classifies whether or not each one-second-long frame of an audio recording contains impulsive disturbances, based on supervised learning and appropriate prewhitening of the input signal.
Music Feature Maps with Convolutional Neural Networks for Music Genre Classification
With CNNs trained in such a way that filter dimensions are interpretable in time and frequency, results show that only eight music features are more efficient than 513 frequency bins of a spectrogram and that late score fusion between systems based on both feature types reaches 91% accuracy on the GTZAN database.
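The late score fusion mentioned above can be illustrated with a minimal sketch. It assumes each system outputs one score per class and that the scores are combined by a weighted average. The summary does not state the fusion rule used in the cited paper, so the function name `late_fusion` and the parameter `weight_a` are hypothetical.

```python
def late_fusion(scores_a, scores_b, weight_a=0.5):
    """Fuse per-class scores from two systems by weighted averaging.

    Assumption: a weighted mean is used for illustration only; the
    cited paper's actual fusion rule is not given in this summary.
    Returns the fused scores and the index of the top-scoring class.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must score the same classes")
    fused = [weight_a * a + (1.0 - weight_a) * b
             for a, b in zip(scores_a, scores_b)]
    # Predicted class is the index of the largest fused score.
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


# Hypothetical scores for two classes from two systems.
fused, predicted = late_fusion([0.6, 0.4], [0.2, 0.8])
```

Here the two systems disagree on the top class, and the fused scores favour the second class because the second system is more confident.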
Improved Music Genre Classification with Convolutional Neural Networks
Two ways to improve music genre classification with convolutional neural networks are proposed: combining max- and average-pooling to provide more statistical information to higher-level neural networks, and using shortcut connections to skip one or more layers, a method inspired by residual learning.
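Both ideas in this summary can be sketched in a few lines of plain Python. The sketch assumes pooling is applied over the time axis of each channel and that a shortcut simply adds a layer's input to its output. The helper names `max_avg_pool` and `shortcut` are hypothetical and do not reproduce the cited paper's architecture.

```python
from statistics import fmean


def max_avg_pool(feature_map):
    """Pool each channel over time with both max and mean, then concatenate.

    feature_map: list of channels, each a non-empty list of floats.
    Returns [max_0, ..., max_{C-1}, mean_0, ..., mean_{C-1}].
    """
    maxima = [max(channel) for channel in feature_map]
    means = [fmean(channel) for channel in feature_map]
    return maxima + means


def shortcut(x, layer):
    """Residual-style shortcut: add the layer input to the layer output.

    Assumes `layer` returns a list of the same length as `x`.
    """
    y = layer(x)
    return [xi + yi for xi, yi in zip(x, y)]


# Two channels with three time steps each.
pooled = max_avg_pool([[0.0, 2.0, 4.0], [1.0, 1.0, 1.0]])
# A toy "layer" that doubles its input, wrapped in a shortcut.
skipped = shortcut([1.0, 2.0], lambda v: [2.0 * a for a in v])
```

Concatenating both statistics doubles the size of the pooled vector, which is how the later layers receive more information than they would from either pooling operation alone.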
Anomaly Detection in Raw Audio Using Deep Autoregressive Networks
Ellen Rushe, Brian Mac Namee. Computer Science. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.
This paper proposes extending autoregressive deep learning architectures to anomaly detection in raw audio, compares the performance of this approach to a baseline autoencoder model, and shows superior performance in almost all cases.
A Survey of Audio-Based Music Classification and Annotation
A comprehensive review of audio-based classification in MIR is provided, and the differences in the features and the types of classifiers used for different classification tasks are stressed.
Deep Recurrent Neural Network-Based Autoencoders for Acoustic Novelty Detection
It is shown that RNN-based autoencoders outperform statistical approaches, with an absolute improvement of up to 16.4% in average F-measure over the three databases.