Diversity-Robust Acoustic Feature Signatures Based on Multiscale Fractal Dimension for Similarity Search of Environmental Sounds

@article{Sunouchi2021DiversityRobustAF,
  title={Diversity-Robust Acoustic Feature Signatures Based on Multiscale Fractal Dimension for Similarity Search of Environmental Sounds},
  author={Motohiro Sunouchi and Masaharu Yoshioka},
  journal={IEICE Trans. Inf. Syst.},
  year={2021},
  volume={104-D},
  pages={1734--1748}
}
This paper proposes new acoustic feature signatures based on the multiscale fractal dimension (MFD), which are robust against the diversity of environmental sounds, for content-based similarity search. The diversity of sound sources and acoustic compositions is a typical feature of environmental sounds. Several acoustic features have been proposed for environmental sounds. Among them are the widely used Mel-Frequency Cepstral Coefficients (MFCCs), which describe frequency-domain features…
Proposal of the Aesthetic Experience-Oriented Evaluation Framework for Field-recording Sound Retrieval System: Experiments using Acoustic Feature Signatures Based on Multiscale Fractal Dimension
TLDR
This study proposes an aesthetic experience-oriented evaluation framework for a field-recording sound retrieval system, using sound clips extracted from Freesound, and discusses the features of the framework by analyzing the performance of a similarity search system for field-recording sound material that uses acoustic feature signatures based on the multiscale fractal dimension.

References

Noise-Robust environmental sound classification method based on combination of ICA and MP features
TLDR
An environmental sound classification method that is robust to noise in sounds recorded by mobile devices is proposed; an evaluation of its performance confirmed that it provides about 8% better classification than MFCC feature extraction.
Environmental Sound Recognition With Time–Frequency Audio Features
TLDR
An empirical feature analysis for audio environment characterization is performed, and the use of a matching pursuit algorithm is proposed to obtain effective time-frequency features that yield higher recognition accuracy for environmental sounds.
Multiscale Fractal Analysis of Musical Instrument Signals With Application to Recognition
TLDR
The multiscale fractal dimension (MFD) profile is proposed as a short-time descriptor, useful to quantify the multiscale complexity and fragmentation of the different states of the music waveform, and can discriminate several aspects among different music instruments.
Musical instruments signal analysis and recognition using fractal features
TLDR
The multi-scale fractal dimension profile is proposed as a descriptor useful to quantify the multiscale complexity of the music waveform and experimentally found that this descriptor can discriminate several aspects among different music instruments.
NMF-based environmental sound source separation using time-variant gain features
Environmental sound recognition: A survey
TLDR
This paper offers a qualitative and elucidatory survey of recent developments in environmental sound recognition (ESR), in three parts: i) basic environmental sound processing schemes, ii) stationary ESR techniques, and iii) non-stationary ESR techniques.
Fractal dimensions of speech sounds: computation and application to automatic speech recognition.
TLDR
The geometry of speech turbulence as reflected in the fragmentation of the time signal is quantified by using fractal models and an efficient algorithm for estimating the short-time fractal dimension of speech signals based on multiscale morphological filtering is described.
Representing environmental sounds using the separable scattering transform
TLDR
This paper proposes a novel representation of environmental sounds based on the scattering transform, which has the properties of stability to time-warping deformations and invariance to time-shift, useful for classification tasks.
Fast query by example of environmental sounds via robust and efficient cluster-based indexing
TLDR
This work explores several cluster-based indexing approaches, namely non-negative matrix factorization (NMF) and spectral clustering, to efficiently organize and quickly retrieve archived audio using the QBE paradigm. Initial results indicate significant improvements over both exhaustive search schemes and traditional K-means clustering, and excellent overall performance in the example-based retrieval of environmental sounds.
On feature selection in environmental sound recognition
TLDR
Given a broad set of content-based audio features, principal component analysis is employed for the composition of an optimal feature set for environmental sounds and retrieval results show that statistical data analysis gives useful hints for feature selection.