Corpus ID: 236493636

Proposal-based Few-shot Sound Event Detection for Speech and Environmental Sounds with Perceivers

@article{Wolters2021ProposalbasedFS,
  title={Proposal-based Few-shot Sound Event Detection for Speech and Environmental Sounds with Perceivers},
  author={Piper Wolters and Chris Daw and Brian Hutchinson and Lauren A. Phillips},
  journal={ArXiv},
  year={2021},
  volume={abs/2107.13616}
}
There are many important applications for detecting and localizing specific sound events within long, untrimmed documents including keyword spotting, medical observation, and bioacoustic monitoring for conservation. Deep learning techniques often set the state-of-the-art for these tasks. However, for some types of events, there is insufficient labeled data to train deep learning models. In this paper, we propose novel approaches to few-shot sound event detection utilizing region proposals and…

References

Metric Learning with Background Noise Class for Few-Shot Detection of Rare Sound Events
TLDR
This paper aims to achieve few-shot detection of rare sound events from query sequences that contain not only the target events but also other events and background noise, and proposes metric learning with a background noise class for few-shot detection.
Few-Shot Sound Event Detection
TLDR
This work adapts state-of-the-art metric-based few-shot learning methods to automate the detection of similar-sounding events, requiring only one or a few examples of the target event, and develops a method to automatically construct a partial set of labeled examples to reduce user labeling effort.
Bidirectional GRU for Sound Event Detection
Sound event detection (SED) aims to detect temporal boundaries of sound events from acoustic recordings. Sound events in real-life recordings often overlap with each other (i.e., polyphonic), making…
Learning to Match Transient Sound Events Using Attentional Similarity for Few-shot Sound Recognition
TLDR
The proposed attentional similarity module can be plugged into any metric-based learning method for few-shot learning, allowing the resulting model to especially match related short sound events.
Rare Sound Event Detection Using 1D Convolutional Recurrent Neural Networks
TLDR
The proposed system, using a combination of a 1D convolutional neural network and a recurrent neural network (RNN) with long short-term memory (LSTM) units, achieved 1st place in the challenge with an error rate of 0.13 and an F-score of 93.1.
A Region Based Attention Method for Weakly Supervised Sound Event Detection and Classification
TLDR
A novel region-based attention method is proposed to further boost the representation power of the existing GLU-based CRNN; it extracts region features from multi-scale sliding windows over higher convolutional layers, which are fed into an attention-based recurrent neural network.
Few-Shot Acoustic Event Detection Via Meta Learning
TLDR
This paper formulates the few-shot AED problem and explores different ways of utilizing traditional supervised methods for this setting, as well as a variety of meta-learning approaches, which are conventionally used to solve the few-shot classification problem.
Conformer-based Sound Event Detection with Semi-supervised Learning and Data Augmentation
This paper presents a Conformer-based sound event detection (SED) method, which uses semi-supervised learning and data augmentation. The proposed method employs Conformer, a convolution-augmented…
Revisiting Few-shot Activity Detection with Class Similarity Control
TLDR
This paper presents a conceptually simple, general, yet novel framework for few-shot temporal activity detection based on proposal regression, which detects the start and end times of activities in untrimmed videos.
ZSTAD: Zero-Shot Temporal Activity Detection
TLDR
This work designs an end-to-end deep network based on R-C3D that is optimized with an innovative loss function that considers the embeddings of activity labels and their super-classes while learning the common semantics of seen and unseen activities.