Corpus ID: 52040492

Unsupervised adversarial domain adaptation for acoustic scene classification

@inproceedings{Gharib2018UnsupervisedAD,
  title={Unsupervised adversarial domain adaptation for acoustic scene classification},
  author={Shayan Gharib and Konstantinos Drossos and Emre Çakir and Dmitriy Serdyuk and Tuomas Virtanen},
  booktitle={DCASE},
  year={2018}
}
A general problem in the acoustic scene classification task is the mismatch in conditions between training and testing data, which significantly reduces the classification accuracy of the developed methods. [...] We employ a model pre-trained on data from one set of conditions and, using data from another set of conditions, adapt the model so that its output cannot be used to classify which set of conditions the input data belong to.
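
The adaptation idea in the abstract can be written down as a pair of adversarial objectives. The block below is a sketch only, in the style of the adversarial discriminative setup (ADDA) that appears in this paper's reference list; the exact losses used in the paper may differ. All symbols are assumptions introduced here for illustration: M_S is the model pre-trained on source-condition data X_S, M_T is the adapted model (initialized from M_S) applied to target-condition data X_T, and D is a discriminator that tries to tell the two sets of conditions apart.

```latex
% Sketch under stated assumptions; the symbols M_S, M_T, D, X_S, X_T are illustrative.
% Step 1: the discriminator D learns to separate source outputs from target outputs.
\mathcal{L}_{D} =
  - \mathbb{E}_{x \sim X_S}\left[ \log D\big(M_S(x)\big) \right]
  - \mathbb{E}_{x \sim X_T}\left[ \log \big( 1 - D\big(M_T(x)\big) \big) \right]

% Step 2: the adapted model M_T is updated so that D mistakes its outputs for source outputs.
\mathcal{L}_{M_T} =
  - \mathbb{E}_{x \sim X_T}\left[ \log D\big(M_T(x)\big) \right]
```

Alternating the two minimizations drives the outputs of M_T on target data toward the distribution of the outputs of M_S on source data, which corresponds to the stated goal that the model output cannot be used to classify the set of conditions. No labels from the target conditions appear in either loss, which is what makes the adaptation unsupervised.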

Adversarial Domain Adaptation with Paired Examples for Acoustic Scene Classification on Different Recording Devices
TLDR
This paper investigates several adversarial models for domain adaptation (DA) and their effect on the acoustic scene classification task, and finds that the best-performing domain adaptation is obtained with the cycle GAN, which achieves as much as a 66% relative improvement in accuracy for the target domain device, with only a 6% relative decrease on the source domain.
Unsupervised Adversarial Domain Adaptation Based on The Wasserstein Distance For Acoustic Scene Classification
TLDR
This paper builds upon the theoretical model of the ℋΔℋ-distance and a previous adversarial discriminative deep learning method for unsupervised domain adaptation in ASC, and presents an adversarial-training-based method using the Wasserstein distance.
Unsupervised Domain Adaptation for Acoustic Scene Classification Using Band-Wise Statistics Matching
TLDR
An unsupervised domain adaptation method that consists of aligning the first- and second-order sample statistics of each frequency band of target-domain acoustic scenes to the ones of the source-domain training dataset is proposed to adapt audio samples from unseen devices before they are fed to a pre-trained classifier, thus avoiding any further learning phase.
Ensemble of Discriminators for Domain Adaptation in Multiple Sound Source 2D Localization
TLDR
An ensemble of discriminators that improves the accuracy of a domain adaptation technique for the localization of multiple sound sources by combining discriminators applied at different feature levels of the localization model is introduced.
Towards Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning
TLDR
A novel domain adaptation strategy based on disentanglement learning is proposed to disentangle task-specific and domain-specific characteristics in the analyzed audio recordings and a novel combination of categorical cross-entropy and variance-based losses is suggested.
Unsupervised Multi-Target Domain Adaptation for Acoustic Scene Classification
TLDR
A novel unsupervised multi-target domain adaptation (MTDA) method for ASC is proposed, which can adapt to multiple target domains simultaneously and make use of the underlying relation among multiple domains.
Feature Projection-Based Unsupervised Domain Adaptation for Acoustic Scene Classification
TLDR
This work proposes an unsupervised domain adaptation method for ASC based on the projection of spectro-temporal features extracted from both the source and target domain onto the principal subspace spanned by the eigenvectors of the sample covariance matrix of source-domain training data.
Adversarial Unsupervised Domain Adaptation for Harmonic-Percussive Source Separation
TLDR
This letter proposes an adversarial unsupervised domain adaptation approach suitable for the case where no labelled data (ground-truth source signals) from a target domain is available, and introduces the Tap & Fiddle dataset, a dataset containing recordings of Scandinavian fiddle tunes along with isolated tracks for “foot-tapping” and “violin”.
Capturing Discriminative Information Using a Deep Architecture in Acoustic Scene Classification
TLDR
A max feature map method is adopted that replaces conventional non-linear activation functions in deep neural networks and applies an element-wise comparison between the different filters of a convolution layer’s output to improve the generalization ability.
Acoustic Scene Classification for Mismatched Recording Devices Using Heated-Up Softmax and Spectrum Correction
TLDR
This paper applies scaling of the features to deal with varying frequency response of the recording devices, and a heated-up softmax is embedded to calibrate the predictions of the model to account for the shifted data distribution.

References

SHOWING 1-10 OF 21 REFERENCES
Classifying Short Acoustic Scenes with I-Vectors and CNNs: Challenges and Optimisations for the 2017 DCASE ASC Task
TLDR
The result of the CP-JKU team’s experiments is a classification system that achieves classification accuracies of around 90% on the provided development data, as estimated via the prescribed four-fold cross-validation scheme.
A multi-device dataset for urban acoustic scene classification
TLDR
The acoustic scene classification task of DCASE 2018 Challenge and the TUT Urban Acoustic Scenes 2018 dataset provided for the task are introduced, and the performance of a baseline system in the task is evaluated.
DCASE 2016 Acoustic Scene Classification Using Convolutional Neural Networks
TLDR
This workshop paper presents the use of a convolutional neural network trained to classify short sequences of audio, represented by their log-mel spectrogram, and proposes a training method that can be used when the system validation performance saturates as the training proceeds.
Unsupervised Domain Adaptation by Backpropagation
TLDR
The method performs very well in a series of image classification experiments, achieving adaptation effect in the presence of big domain shifts and outperforming previous state-of-the-art on Office datasets.
Adversarial Discriminative Domain Adaptation
TLDR
It is shown that ADDA is more effective yet considerably simpler than competing domain-adversarial methods, and the promise of the approach is demonstrated by exceeding state-of-the-art unsupervised adaptation results on standard domain adaptation tasks as well as a difficult cross-modality object classification task.
Beyond Sharing Weights for Deep Domain Adaptation
TLDR
This work introduces a two-stream architecture, where one operates in the source domain and the other in the target domain, and demonstrates that this both yields higher accuracy than state-of-the-art methods on several object recognition and detection tasks and consistently outperforms networks with shared weights in both supervised and unsupervised settings.
Acoustic Scene Classification Based on Convolutional Neural Network Using Double Image Features
TLDR
New image features for the acoustic scene classification task of the IEEE AASP Challenge: Detection and Classification of Acoustic Scenes and Events are proposed and it is claimed that the proposed method outperformed several baseline methods.
Domain Adaptation with Adversarial Training and Graph Embeddings
TLDR
A novel model is proposed that performs adversarial learning based domain adaptation to deal with distribution drifts and graph based semi-supervised learning to leverage unlabeled data within a single unified deep learning framework.
Marginalized Denoising Autoencoders for Domain Adaptation
TLDR
The mSDA approach marginalizes noise and thus does not require stochastic gradient descent or other optimization algorithms to learn parameters; in fact, they are computed in closed form, which speeds up SDAs by two orders of magnitude.
Domain Separation Networks
TLDR
The novel architecture results in a model that outperforms the state-of-the-art on a range of unsupervised domain adaptation scenarios and additionally produces visualizations of the private and shared representations enabling interpretation of the domain adaptation process.