Characterisation of Acoustic Scenes Using a Temporally-constrained Shift-invariant Model


In this paper, we propose a method for modeling and classifying acoustic scenes using temporally-constrained shift-invariant probabilistic latent component analysis (SIPLCA). SIPLCA can be used to extract time-frequency patches from spectrograms in an unsupervised manner. Component-wise hidden Markov models are incorporated into the SIPLCA formulation to enforce temporal constraints on the activation of each acoustic component. The time-frequency patches are converted to cepstral coefficients in order to provide a compact representation of acoustic events within a scene. Experiments are conducted using a corpus of train station recordings, classified into six scene classes. Results show that the proposed model captures salient events within a scene and outperforms the non-negative matrix factorization algorithm on the same task. In addition, it is demonstrated that the use of temporal constraints can lead to improved performance.
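
As a rough illustration of the cepstral step mentioned above, the following minimal sketch converts a time-frequency patch into cepstral coefficients by taking a discrete cosine transform (DCT-II) of the log-magnitude spectrum of each frame. It is a generic cepstrum computation under stated assumptions, not the procedure used in the paper: the function name `patch_to_cepstra`, the number of coefficients `n_coeffs`, the flooring constant `eps`, and the (frequency x time) patch layout are all hypothetical choices made for this example.

```python
import numpy as np

def patch_to_cepstra(patch, n_coeffs=13, eps=1e-10):
    """Map a non-negative (n_freq, n_frames) patch to (n_coeffs, n_frames) cepstra.

    Hypothetical helper: names and defaults are assumptions, not from the paper.
    """
    patch = np.asarray(patch, dtype=float)
    n_freq = patch.shape[0]
    # Log-compress the patch; eps avoids log(0) for empty bins.
    log_spec = np.log(patch + eps)
    # Unnormalised DCT-II basis: row k is the k-th cosine over the frequency bins.
    k = np.arange(n_coeffs)[:, None]
    n = np.arange(n_freq)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_freq))
    # Project every frame's log spectrum onto the first n_coeffs cosines.
    return basis @ log_spec

# Usage: a random non-negative patch with 40 frequency bins and 5 frames.
rng = np.random.default_rng(0)
cepstra = patch_to_cepstra(rng.random((40, 5)))
print(cepstra.shape)
```

Keeping only the first few coefficients discards fine spectral detail and retains the coarse spectral envelope, which is what makes the representation compact.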
