• Publications
Mixup-Based Acoustic Scene Classification Using Multi-Channel Convolutional Neural Network
TLDR
This paper explores a multi-channel CNN for acoustic scene classification that extracts features from the different channels in an end-to-end manner, and applies the mixup method, which can provide higher prediction accuracy and robustness than previous models.
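For readers unfamiliar with mixup, the following is a minimal NumPy sketch of the general technique, not the paper's implementation. The function name `mixup_batch`, the `alpha` value, and the tensor layout (batch, channels, mel bins, frames) are assumptions chosen for illustration.

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, rng=None):
    """Blend each example with a randomly chosen partner from the same batch.

    x: features, shape (batch, ...); y: one-hot or multi-hot labels, shape (batch, classes).
    alpha: Beta distribution parameter (hypothetical default, not taken from the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)       # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))     # partner index for every example
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y + (1.0 - lam) * y[perm]
    return x_mix, y_mix, lam

# Usage: 4 clips, 2 channels, 8 mel bins, 8 frames, 3 scene classes (illustrative sizes).
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 2, 8, 8))
labels = np.eye(3)[[0, 1, 2, 0]]
mixed_x, mixed_y, lam = mixup_batch(features, labels, rng=rng)
```

Because features and labels are blended with the same coefficient, the mixed labels remain a valid distribution over classes, which is what lets the network train on the interpolated examples.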
Sample Mixed-Based Data Augmentation for Domestic Audio Tagging
TLDR
A convolutional recurrent neural network with an attention module, taking log-scaled mel spectra as input, serves as the baseline system for audio tagging; with the mixup approach it achieves a state-of-the-art equal error rate (EER) of 0.10 on the DCASE 2016 Task 4 dataset, outperforming the baseline system without data augmentation.
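The equal error rate quoted above is the operating point where the false acceptance rate equals the false rejection rate. A rough sketch of how it can be estimated for a single tag is given below; the function name `equal_error_rate` and the threshold sweep are assumptions for illustration, both classes are assumed to be present, and the challenge's exact scoring procedure may differ.

```python
import numpy as np

def equal_error_rate(labels, scores):
    """Estimate the EER for one tag by sweeping a decision threshold.

    labels: binary ground truth (1 = tag present); scores: predicted scores.
    Assumes at least one positive and one negative example.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    # Candidate thresholds: every observed score, plus +inf (reject everything).
    thresholds = np.concatenate([np.unique(scores), [np.inf]])
    best_gap, eer = np.inf, 1.0
    for t in thresholds:
        pred = scores >= t
        far = np.mean(pred[~labels])   # false acceptance rate on negatives
        frr = np.mean(~pred[labels])   # false rejection rate on positives
        gap = abs(far - frr)
        if gap < best_gap:             # keep the point where the two rates are closest
            best_gap, eer = gap, (far + frr) / 2.0
    return eer

# Usage with toy scores (not data from the paper).
eer_separable = equal_error_rate([0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9])
eer_overlap = equal_error_rate([0, 1, 0, 1], [0.1, 0.2, 0.8, 0.9])
```

With a finite number of scores the two rates rarely cross exactly, so the sketch reports their average at the closest threshold.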
Environmental Sound Classification Based on Multi-temporal Resolution Convolutional Neural Network Combining with Multi-level Features
TLDR
Results demonstrate that the proposed method, which employs multi-temporal resolution and multi-level features, is highly effective in the classification tasks and outperforms previous methods that account only for single-level features.
Large-Scale Whale-Call Classification by Transfer Learning on Multi-Scale Waveforms and Time-Frequency Features
TLDR
An effective data-driven approach based on pre-trained convolutional neural networks, using multi-scale waveforms and time-frequency feature representations, is developed to classify whale calls from a large open-source dataset recorded by sensors carried by whales.
The Classification of Underwater Acoustic Targets Based on Deep Learning Methods
TLDR
The results show that deep learning methods can achieve higher recognition accuracy when classifying underwater targets from their radiated noise.
General audio tagging with ensembling convolutional neural network and statistical features
TLDR
An ensemble learning framework combines statistical features with the outputs of the deep classifiers, with the goal of exploiting complementary information to address the noisy-label problem.
Meta learning based audio tagging
TLDR
This paper describes a solution for the general-purpose audio tagging task, one of the subtasks of the DCASE 2018 challenge, and proposes a meta-learning-based ensemble method that can provide higher prediction accuracy and robustness than a single model.
Weakly supervised CRNN system for sound event detection with large-scale unlabeled in-domain data
TLDR
A state-of-the-art general audio tagging model is first employed to predict weak labels for unlabeled data, and a weakly supervised architecture based on a convolutional recurrent neural network is then developed to predict strong annotations of sound events with the aid of the unlabeled data and its predicted labels.
Large-Scale Whale Call Classification Using Deep Convolutional Neural Network Architectures
TLDR
A comparative performance study of different state-of-the-art CNN architectures on a large-scale whale-call classification task finds that advances in popular CNN architectures significantly improve accuracy on the task.
...