Corpus ID: 67856194

The NIGENS General Sound Events Database

@article{Trowitzsch2019TheNG,
  title={The NIGENS General Sound Events Database},
  author={Ivo Trowitzsch and Jalil Taghia and Youssef Kashef and Klaus Obermayer},
  journal={ArXiv},
  year={2019},
  volume={abs/1902.08314}
}
Computational auditory scene analysis has been gaining interest in recent years. Trailing behind the more mature field of speech recognition, it is general sound event detection in particular that is attracting increasing attention. Having enough suitable data available is crucial for training and testing reasonable models -- until recently, general sound event databases were hard to come by. We release and present a database with 714 wav files containing isolated high-quality sound events of 14…
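For readers who want a quick first look at such a corpus, the sketch below walks a local copy of the dataset and tallies files and total duration per class. The root folder name `NIGENS` and the layout (one subdirectory per sound class, each holding wav files) are assumptions for illustration, not details confirmed by the abstract shown here.

```python
import wave
from collections import Counter
from pathlib import Path

# Assumed layout: NIGENS/<class_name>/<clip>.wav
root = Path("NIGENS")

counts = Counter()
total_seconds = 0.0
for wav_path in sorted(root.glob("*/*.wav")):
    counts[wav_path.parent.name] += 1
    # Read header only to get frame count and sample rate.
    with wave.open(str(wav_path), "rb") as wav:
        total_seconds += wav.getnframes() / wav.getframerate()

for cls, n in sorted(counts.items()):
    print(f"{cls}: {n} files")
print(f"total: {sum(counts.values())} files, {total_seconds / 3600:.1f} h")
```

Using only the standard library keeps the sketch dependency-free; a real pipeline would typically load the audio itself with a package such as soundfile or librosa.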


Robust sound event detection in binaural computational auditory scene analysis
A method for joining sound event detection and source localization is presented, by which coherent auditory objects can be created; algorithms able to model context over longer durations are shown to benefit particularly in demanding scenes and to become more precise in their detection.
A Dataset of Dynamic Reverberant Sound Scenes with Directional Interferers for Sound Event Localization and Detection
To investigate the individual and combined effects of ambient noise, interferers, and reverberation, the baseline is evaluated on versions of the dataset that exclude or include combinations of these factors; the results indicate that by far the most detrimental effects are caused by directional interferers.
A Sequential System for Sound Event Detection and Localization Using CRNN (Technical Report)
This method uses CRNN SELDnet-like single-output models which run on log-mel spectrogram features extracted from the audio files, first predicting sound event classes (Sound Event Detection, SED) and then feeding the SED output to an estimator of the Direction of Arrival (DOA) for those sound events (a minimal sketch of this log-mel feature extraction appears after this list).
What Makes Sound Event Localization and Detection Difficult? Insights from Error Analysis
Experimental results indicate polyphony as the main challenge in SELD, due to the difficulty in detecting all sound events of interest, and the SELD systems tend to make fewer errors for the polyphonic scenario that is dominant in the training set.
Joining Sound Event Detection and Localization Through Spatial Segregation
This article presents an approach that robustly binds localization with the detection of sound events in a binaural robotic system and demonstrates that the proposed approach is an effective method to obtain joint sound event location and type information under a wide range of conditions.
Sound Event Localization and Detection Based on Adaptive Hybrid Convolution and Multi-scale Feature Extractor
A method is proposed based on Adaptive Hybrid Convolution (AHConv) and a multi-scale feature extractor, which capture dependencies along the time and frequency dimensions respectively, together with an adaptive attention block that can integrate information from a very local to an exponentially enlarged receptive field.
Audio-Based Aircraft Detection System for Safe RPAS BVLOS Operations
An audio-based "Detect and Avoid" system is proposed, composed of microphones and an embedded computer, which performs real-time inference using a sound event detection (SED) deep learning model.
A Four-Stage Data Augmentation Approach to ResNet-Conformer Based Acoustic Modeling for Sound Event Localization and Detection
A novel four-stage data augmentation approach to ResNet-Conformer based acoustic modeling for sound event localization and detection (SELD) is presented; the ResNet-Conformer architecture models both global and local context dependencies of an audio sequence and yields further gains over the architectures used in the DCASE 2020 SELD evaluations.
Papafil: A Low Complexity Sound Event Localization and Detection Method with Parametric Particle Filtering and Gradient Boosting
The present technical report describes the architecture of the system submitted to the DCASE 2020 Challenge Task 3: Sound Event Localization and Detection. The proposed method is a low-complexity approach built around parametric particle filtering and gradient boosting.
Deep learning based cough detection camera using enhanced features
To detect and localize coughing sounds remotely, a convolutional neural network (CNN) based deep learning model was developed in this work and integrated with a sound camera for the visualization of the cough sounds.
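As referenced in the CRNN entry above, many of these SELD systems operate on log-mel spectrogram features rather than raw waveforms. The snippet below is a minimal sketch of that feature extraction using librosa; the FFT size, hop length, and mel-band count are assumed typical values, not parameters taken from any of the cited reports.

```python
import numpy as np
import librosa

# Load a clip at its native sample rate, downmixed to mono
# ("clip.wav" is a placeholder file name).
audio, sr = librosa.load("clip.wav", sr=None, mono=True)

# Mel spectrogram with assumed, SED-typical parameters.
mel = librosa.feature.melspectrogram(
    y=audio, sr=sr, n_fft=2048, hop_length=1024, n_mels=64
)

# Log compression stabilizes the dynamic range for a CRNN input.
logmel = librosa.power_to_db(mel, ref=np.max)

print(logmel.shape)  # (n_mels, n_frames)
```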

References

Robust Detection of Environmental Sounds in Binaural Auditory Scenes
It is demonstrated that by superimposing target sounds with strongly varying general environmental sounds during training, sound type classifiers become less affected by the presence of a distractor source, and that the generalization performance of such models depends on how similar the angular source configuration and the signal-to-noise ratio are to the conditions under which the models were trained.
ESC: Dataset for Environmental Sound Classification
A new annotated collection of 2000 short clips comprising 50 classes of various common sound events, and an abundant unified compilation of 250000 unlabeled auditory excerpts extracted from recordings available through the Freesound project are presented.
Detection and Classification of Acoustic Scenes and Events: Outcome of the DCASE 2016 Challenge
The emergence of deep learning as the most popular classification method is observed, replacing the traditional approaches based on Gaussian mixture models and support vector machines.
Joining Sound Event Detection and Localization Through Spatial Segregation
This article presents an approach that robustly binds localization with the detection of sound events in a binaural robotic system and demonstrates that the proposed approach is an effective method to obtain joint sound event location and type information under a wide range of conditions.
Auditory Machine Learning Training and Testing Pipeline: AMLTTP v3.0
The purpose of the Two!Ears Auditory Machine Learning Training and Testing Pipeline (AMLTTP) is to build and evaluate models for auditory sound object annotation and assigning attributes to them by inductive learning from labeled training data.
An audio-visual corpus for speech perception and automatic speech recognition.
An audio-visual corpus is presented that consists of high-quality audio and video recordings of 1000 sentences spoken by each of 34 talkers, to support the use of common material in speech perception and automatic speech recognition studies.
DCASE2017 Challenge Setup: Tasks, Datasets and Baseline System
This paper presents the setup of these tasks: task definition, dataset, experimental setup, and baseline system results on the development dataset.
A Dataset and Taxonomy for Urban Sound Research
A taxonomy of urban sounds and a new dataset, UrbanSound, containing 27 hours of audio with 18.5 hours of annotated sound event occurrences across 10 sound classes are presented.
Freesound technical demo
This demo wants to introduce Freesound to the multimedia community and show its potential as a research resource.
Two!Ears – Integral interactive model of auditory perception and experience
A. Raake, J. Blauert, J. Braasch, G. Brown, P. Danès, T. Dau, B. Gas, S. Argentieri, A. Kohlrausch, D. Kolossa, N. Le Goff, T. May, K. Obermayer, C. Schymura, T. Walther, H. Wierstorf, F. Winter, et al.