Detection and classification of acoustic scenes and events: An IEEE AASP challenge

@article{Giannoulis2013DetectionAC,
  title={Detection and classification of acoustic scenes and events: An IEEE AASP challenge},
  author={Dimitrios Giannoulis and Emmanouil Benetos and Dan Stowell and Mathias Rossignol and Mathieu Lagrange and Mark D. Plumbley},
  journal={2013 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics},
  year={2013},
  pages={1--4}
}
This paper describes a newly-launched public evaluation challenge on acoustic scene classification and detection of sound events within a scene. Systems dealing with such tasks are far from exhibiting human-like performance and robustness. Undermining factors are numerous: the extreme variability of sources of interest possibly interfering, the presence of complex background noise as well as room effects like reverberation. The proposed challenge is an attempt to help the research community… 

A database and challenge for acoustic scene classification and event detection
TLDR
This paper introduces a newly-launched public evaluation challenge dealing with two closely related tasks of the field: acoustic scene classification and event detection.
Acoustic Scene Classification: Classifying environments from the sounds they produce
TLDR
An account of the state of the art in acoustic scene classification (ASC), the task of classifying environments from the sounds they produce, and a range of different algorithms submitted for a data challenge to provide a general and fair benchmark for ASC techniques.
Sound Event Detection for Office Live and Office Synthetic AASP Challenge
We present a sound event detection system based on hidden Markov models. The system is evaluated with development material provided in the AASP Challenge on Detection and Classification of Acoustic
The effect of room acoustics on audio event classification
TLDR
The impact of mismatches between training and testing conditions in terms of acoustical parameters, including the reverberation time (T60) and the direct-to-reverberant ratio (DRR), on audio classification accuracy and class separability is studied.
An evaluation framework for event detection using a morphological model of acoustic scenes
TLDR
A model of environmental acoustic scenes which adopts a morphological approach by abstracting temporal structures of acoustic scenes is introduced, able to explicitly control key morphological aspects of the acoustic scene and isolate their impact on the performance of the system under evaluation.
Detection and Classification of Acoustic Scenes and Events
TLDR
The state of the art in automatically classifying audio scenes, and in automatically detecting and classifying audio events, is reported on.
Model-based processing for acoustic scene analysis
TLDR
This paper intends to illustrate the point that the usual model-based approach employed for sound recognition or detection can be extended to other co-occurrent tasks like source localization, so both tasks can be carried out jointly, using the same formulation and processing.
Reverberation-based feature extraction for acoustic scene classification
TLDR
A strong low-complexity baseline system using a compact feature set is improved with a novel class of audio features that exploit knowledge of sound behaviour within the scene (reverberation), which increases the classification accuracy.
A Morphological Model for Simulating Acoustic Scenes and Its Application to Sound Event Detection
This paper introduces a model for simulating environmental acoustic scenes that abstracts temporal structures from audio recordings. This model allows us to explicitly control key morphological
Joint model-based recognition and localization of overlapped acoustic events using a set of distributed small microphone arrays
TLDR
A model-based approach to jointly carry them out for the case of multiple simultaneous sources is presented and tested and shows the advantage of the proposed approach with respect to some usual techniques, and that the inclusion of estimated priors brings a further performance improvement.

References

A database and challenge for acoustic scene classification and event detection
TLDR
This paper introduces a newly-launched public evaluation challenge dealing with two closely related tasks of the field: acoustic scene classification and event detection.
The PASCAL CHiME speech separation and recognition challenge
The CLEAR 2006 Evaluation
TLDR
The evaluation tasks in CLEAR 2006 included person tracking, face detection and tracking, person identification, head pose estimation, vehicle tracking as well as acoustic scene analysis and an overview of the results.
The 2011 Signal Separation Evaluation Campaign (SiSEC2011): - Audio Source Separation -
TLDR
This paper summarizes the audio part of the 2011 community-based Signal Separation Evaluation Campaign (SiSEC2011), including datasets recorded in noisy or dynamic environments and a subset of the SiSEC2010 datasets.
The bag-of-frames approach to audio pattern recognition: a sufficient model for urban soundscapes but not for polyphonic music.
TLDR
This paper proposes to explicitly examine the difference between urban soundscapes and polyphonic music with respect to their modeling with the BOF approach, and reveals critical differences in the temporal and statistical structure of the typical frame distribution of each type of signal.
Multimodal Technologies for Perception of Humans, First International Evaluation Workshop on Classification of Events, Activities and Relationships, CLEAR 2006, Southampton, UK, April 6-7, 2006, Revised Selected Papers
TLDR
2D Multi-person Tracking: A Comparative Study in AMI Meetings and Head Pose Tracking and Focus of Attention Recognition Algorithms in Meeting Rooms.
TRECVID 2015 - An Overview of the Goals, Tasks, Data, Evaluation Mechanisms and Metrics
The TREC Video Retrieval Evaluation (TRECVID) 2011 was a TREC-style video analysis and retrieval evaluation, the goal of which remains to promote progress in content-based exploitation of digital
Computational Auditory Scene Analysis: Principles, Algorithms, and Applications
TLDR
This paper focuses on the development of model-Based Speech Segregation in CASA systems, which was first introduced in 2000 and has since been upgraded to a full-blown model-based system.
Reproducible research in signal processing
TLDR
If the experiments are performed on a large data set, the algorithm is compared to the state-of-the-art methods, the code and/or data are well documented and available online, the community will all benefit and make it easier to build upon each other's work.
How to Encourage and Publish Reproducible Research
J. Kovacevic, Geology, 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07, 2007
I discuss the "what", "why" and "how" of reproducible research, a concept that emerged recently in computational sciences. It refers to the idea that the ultimate product is not a published paper