Identifying Optimal Features for Multi-channel Acoustic Scene Classification
@article{Copiaco2019IdentifyingOF,
  title   = {Identifying Optimal Features for Multi-channel Acoustic Scene Classification},
  author  = {Abigail Copiaco and Christian Ritz and Nidhal Abdulaziz and Stefano Fasciani},
  journal = {2019 2nd International Conference on Signal Processing and Information Security (ICSPIS)},
  year    = {2019},
  pages   = {1-4}
}
Recent approaches to audio classification are typically developed for single-channel recordings of acoustic events. In contrast, approaches to acoustic classification of multi-channel recordings of domestic audio have not been thoroughly investigated, especially for household-recorded acoustic scenes. In this paper, we consider domestic multi-channel audio classification through the use of a Deep Convolutional Neural Network (DCNN) model. The DCNN is applied to cepstral and spectral-based…
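As a rough illustration of the feature-extraction stage described in the abstract, the sketch below computes per-channel log-mel and MFCC features with librosa and stacks them into tensors that a DCNN could consume; the sampling rate, filter-bank sizes, and file name are assumptions, not the paper's settings.

```python
# Minimal sketch (not the paper's exact pipeline): per-channel log-mel and MFCC
# features from a multi-channel recording, stacked as 3-D tensors for a CNN.
import numpy as np
import librosa

def multichannel_features(path, sr=16000, n_mels=40, n_mfcc=20):
    # mono=False keeps all channels; shape (channels, samples)
    audio, sr = librosa.load(path, sr=sr, mono=False)
    if audio.ndim == 1:                      # guard for mono files
        audio = audio[np.newaxis, :]
    logmels, mfccs = [], []
    for ch in audio:
        mel = librosa.feature.melspectrogram(y=ch, sr=sr, n_mels=n_mels)
        logmels.append(librosa.power_to_db(mel))
        mfccs.append(librosa.feature.mfcc(y=ch, sr=sr, n_mfcc=n_mfcc))
    # shapes: (channels, n_mels, frames) and (channels, n_mfcc, frames)
    return np.stack(logmels), np.stack(mfccs)

# logmel, mfcc = multichannel_features("scene_0001.wav")  # hypothetical file name
```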
4 Citations
A Study of Features and Deep Neural Network Architectures and Hyper-Parameters for Domestic Audio Classification
- Computer Science, Applied Sciences
- 2021
A detailed study of the most apparent and widely-used cepstral and spectral features for multi-channel audio applications and the use of spectro-temporal features is presented, and the development of a compact version of the AlexNet model for computationally-limited platforms is detailed.
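For illustration only, a compact AlexNet-style classifier for spectro-temporal inputs might look like the following PyTorch sketch; the layer sizes and depth are assumptions, not the architecture developed in the cited study.

```python
# Sketch of a compact AlexNet-style classifier (illustrative only; the layer
# sizes are assumptions, not the cited paper's compact AlexNet).
import torch.nn as nn

class CompactAlexNet(nn.Module):
    def __init__(self, n_classes=10, in_channels=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes)
        )

    def forward(self, x):                    # x: (batch, channels, mels, frames)
        return self.classifier(self.features(x))
```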
Identifying Sound Source Node Locations Using Neural Networks Trained with Phasograms
- Computer Science, 2020 IEEE International Symposium on Signal Processing and Information Technology (ISSPIT)
- 2020
This work focuses on the phase component of the STFT coefficients to estimate the sound source location, classifying the closest microphone array (node) by mapping phase-difference information within the time-frequency domain.
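A minimal sketch of the underlying idea as read from this summary (not the authors' code): compute the STFT per channel and take inter-channel phase differences as the input representation.

```python
# Rough sketch: inter-channel phase differences of STFT coefficients,
# relative to channel 0, wrapped to [-pi, pi].
import numpy as np
import librosa

def interchannel_phase_differences(multichannel_audio, n_fft=512, hop=256):
    # multichannel_audio: array of shape (channels, samples)
    stfts = np.stack([librosa.stft(ch, n_fft=n_fft, hop_length=hop)
                      for ch in multichannel_audio])
    phases = np.angle(stfts)                 # (channels, freq, frames)
    diff = phases[1:] - phases[0]            # difference to reference channel
    return np.angle(np.exp(1j * diff))       # wrap to [-pi, pi]
```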
DASEE A Synthetic Database of Domestic Acoustic Scenes and Events in Dementia Patients Environment
- Computer Science, ArXiv
- 2021
This work details an approach to generating an unbiased synthetic domestic audio database consisting of sound scenes and events emulated in both quiet and noisy environments, and presents an 11-class database containing excerpts of clean and noisy signals.
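One generic way to emulate a noisy scene, shown here only as a sketch and not as the DASEE generation recipe, is to mix a clean event with background noise at a target SNR; it assumes the noise clip is at least as long as the clean one.

```python
# Sketch of mixing a clean event with background noise at a target SNR
# (a generic technique, not the cited database's generation procedure).
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    noise = noise[:len(clean)]                       # assumes noise >= clean in length
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2) + 1e-12
    gain = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10.0)))
    return clean + gain * noise
```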
An Application for Dementia Patient Monitoring with Sound Level Assessment Tool
- Computer Science, 2020 3rd International Conference on Signal Processing and Information Security (ICSPIS)
- 2020
This work proposes an application with an intuitive interface that allows acoustic monitoring of the patient without infringing their privacy, and implements a sound level assessment tool in which time-averaged sound levels are compared to the recommended levels for the specific location and time of day.
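A minimal sketch of such a sound-level check: compute the equivalent (time-averaged) level of a block of calibrated samples and compare it to a limit. The limit value below is a placeholder, not a recommendation taken from the cited work.

```python
# Sketch of a sound-level assessment step: equivalent level over a block of
# calibrated samples, compared with a recommended limit (placeholder value).
import numpy as np

def equivalent_level_db(samples, ref=1.0):
    # time-averaged level in dB relative to `ref`
    return 10.0 * np.log10(np.mean(samples ** 2) / ref ** 2 + 1e-12)

def exceeds_recommended(samples, limit_db=35.0):     # limit is an assumed placeholder
    return equivalent_level_db(samples) > limit_db
```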
References
Showing 1-10 of 20 references
A convolutional neural network approach for acoustic scene classification
- Computer Science, 2017 International Joint Conference on Neural Networks (IJCNN)
- 2017
This paper proposes the use of a CNN trained to classify short sequences of audio, represented by their log-mel spectrograms, and introduces a training method that can be used under particular circumstances to make full use of small datasets.
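A sketch of the "short sequences" idea under assumed parameters (the segment length and mel-band count are arbitrary, not the paper's values): slice a log-mel spectrogram into fixed-length segments that the CNN can classify independently.

```python
# Sketch: split a log-mel spectrogram into fixed-length segments for a CNN.
import librosa

def logmel_segments(y, sr, seg_frames=128, n_mels=64):
    mel = librosa.power_to_db(librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels))
    n_seg = mel.shape[1] // seg_frames       # drop any trailing partial segment
    return [mel[:, i * seg_frames:(i + 1) * seg_frames] for i in range(n_seg)]
```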
Acoustic Features for Environmental Sound Analysis
- Computer Science
- 2018
This chapter describes the general processing chain used to convert a sound signal into a feature vector that can be efficiently exploited by a classifier, and relates these features to those used for speech and music processing.
Power-Normalized Cepstral Coefficients (PNCC) for Robust Speech Recognition
- Computer Science, IEEE/ACM Transactions on Audio, Speech, and Language Processing
- 2016
Experimental results demonstrate that PNCC processing provides substantial improvements in recognition accuracy compared to MFCC and PLP processing for speech in the presence of various types of additive noise and in reverberant environments, with only slightly greater computational cost than conventional MFCC processing.
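The step that most visibly distinguishes PNCC-style processing from MFCC is the power-law nonlinearity applied to the filter-bank energies in place of the logarithm; the sketch below shows only that step and omits power-bias subtraction and temporal masking.

```python
# Sketch of the PNCC-style power-law nonlinearity (~ p**(1/15)) versus the
# logarithmic compression used in MFCC; full PNCC includes further stages
# (power-bias subtraction, temporal masking) that are omitted here.
import numpy as np

def powerlaw_compress(filterbank_power, exponent=1.0 / 15.0):
    return np.power(np.maximum(filterbank_power, 1e-12), exponent)

def log_compress(filterbank_power):
    return np.log(np.maximum(filterbank_power, 1e-12))
```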
Sound Event Detection in Multichannel Audio Using Spatial and Harmonic Features
- Computer Science, Physics, DCASE
- 2016
The proposed SED system is compared against the state-of-the-art mono-channel method on the development subset of the TUT Sound Events Detection 2016 database, and the use of spatial and harmonic features is shown to improve SED performance.
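One widely used spatial feature for multichannel audio is the generalized cross-correlation with phase transform (GCC-PHAT) between channel pairs; the sketch below is a generic implementation and may differ from the cited system's exact feature set.

```python
# Sketch of GCC-PHAT between two channels (a common spatial feature; not
# necessarily the cited system's exact spatial/harmonic feature extraction).
import numpy as np

def gcc_phat(x1, x2, n_fft=1024):
    # operates on frames of up to n_fft samples (longer inputs are truncated)
    X1 = np.fft.rfft(x1, n=n_fft)
    X2 = np.fft.rfft(x2, n=n_fft)
    cross = X1 * np.conj(X2)
    cross /= np.abs(cross) + 1e-12           # PHAT weighting
    cc = np.fft.irfft(cross, n=n_fft)
    return np.fft.fftshift(cc)               # correlation as a function of lag
```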
Deep Convolutional Neural Network with Scalogram for Audio Scene Modeling
- Computer Science, INTERSPEECH
- 2018
An approach is proposed for learning audio scene patterns from scalograms extracted from the raw signal with simple wavelet transforms; experiments showed that the multi-scale features led to a clear increase in accuracy.
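A minimal sketch of computing a scalogram from a raw waveform with PyWavelets; the wavelet choice and scale range are assumptions, not the paper's setup.

```python
# Sketch: scalogram (CWT magnitude) of a raw waveform via PyWavelets.
import numpy as np
import pywt

def scalogram(y, scales=np.arange(1, 128), wavelet="morl"):
    coeffs, _ = pywt.cwt(y, scales, wavelet)
    return np.abs(coeffs)                    # shape (n_scales, n_samples)
```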
Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition
- Computer Science, Speech Commun.
- 2012
Audio source separation with time-frequency velocities
- Physics, 2014 IEEE International Workshop on Machine Learning for Signal Processing (MLSP)
- 2014
A new approach is introduced that relies on the time dynamics of rigid audio models based on harmonic templates, providing piecewise-constant velocity approximations for blind source separation from single-channel audio signals.
Feature extraction for robust speech recognition using a power-law nonlinearity and power-bias subtraction
- Engineering, INTERSPEECH
- 2009
Experimental results demonstrate that PNCC processing provides substantial improvements in recognition accuracy compared to MFCC and PLP processing for various types of additive noise.
The SINS Database for Detection of Daily Activities in a Home Environment Using an Acoustic Sensor Network
- Computer Science, DCASE
- 2017
A database recorded in one living home over a period of one week is introduced, containing activities performed in a spontaneous manner, captured with an acoustic sensor network and recorded as a continuous stream.
Domestic Activities Classification Based on CNN Using Shuffling and Mixing Data Augmentation (Technical Report)
- Computer Science
- 2018
This technical report describes the proposed design and implementation of the system used for the DCASE 2018 Challenge submission, and proposes data augmentation techniques based on shuffling and on mixing two sounds of the same class to mitigate the unbalanced training dataset.
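A sketch of the within-class mixing idea described in the report summary; the random-weight range and length handling are assumptions, not the report's exact recipe.

```python
# Sketch: mix two waveforms of the same class with a random weight as a simple
# data-augmentation step (weight range is an assumed choice).
import numpy as np

def mix_same_class(x1, x2, rng=np.random.default_rng()):
    n = min(len(x1), len(x2))                # trim to the shorter example
    w = rng.uniform(0.3, 0.7)                # random mixing weight (assumed range)
    return w * x1[:n] + (1.0 - w) * x2[:n]
```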