PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition
TLDR
This paper proposes pretrained audio neural networks (PANNs) trained on the large-scale AudioSet dataset, and investigates the performance and computational complexity of PANNs modeled by a variety of convolutional neural networks.
Large-Scale Weakly Supervised Audio Classification Using Gated Convolutional Neural Network
In this paper, we present a gated convolutional neural network and a temporal attention-based localization method for audio classification, which won the 1st place in the large-scale weakly
Polyphonic Sound Event Detection and Localization using a Two-Stage Strategy
TLDR
Experimental results show that the proposed two-stage polyphonic sound event detection and localization method is able to improve the performance of both SED and DOAE, and also performs significantly better than the baseline method.
Cross-task learning for audio tagging, sound event detection and spatial localization: DCASE 2019 baseline systems
TLDR
This paper proposes generic cross-task baseline systems based on convolutional neural networks (CNNs) and finds that the 9-layer CNN with average pooling is a good model for a majority of the DCASE 2019 tasks.
Simultaneous Codeword Optimization (SimCO) for Dictionary Update and Learning
TLDR
This work proposes a novel framework where an arbitrary set of codewords and the corresponding sparse coefficients are simultaneously updated, hence the term simultaneous codeword optimization (SimCO).
Audio Set Classification with Attention Model: A Probabilistic Perspective
This paper investigates Audio Set classification. Audio Set is a large-scale weakly labelled dataset (WLD) of audio clips. In a WLD, only the presence of a label is known, without knowing the
Audio Assisted Robust Visual Tracking With Adaptive Particle Filtering
TLDR
An algorithm is designed that adapts both the number of particles and the noise variance based on the tracking error and the area occupied by the particles in the image, and it is improved by solving a typical problem associated with the particle filter (PF).
Deep Neural Network Baseline for DCASE Challenge 2016
TLDR
The DCASE Challenge 2016 contains tasks for Acoustic Scene Classification (ASC), Acoustic Event Detection (AED), and audio tagging, and DNN baselines indicate that DNNs can be successful in many of these tasks, but may not always perform better than the baselines.
Sound Event Detection and Time–Frequency Segmentation from Weakly Labelled Data
TLDR
A time–frequency (T–F) segmentation framework trained on weakly labelled data is proposed to tackle the sound event detection and separation problem, and predicted onset and offset times can be obtained from the T–F segmentation masks.
Weakly Labelled AudioSet Tagging With Attention Neural Networks
TLDR
This work connects attention neural networks with multiple instance learning (MIL) methods, and proposes decision-level and feature-level attention neural networks for audio tagging, which achieve a state-of-the-art mean average precision.