Characterisation of Acoustic Scenes Using a Temporally-constrained Shift-invariant Model
In this paper, we propose a method for modeling and classifying acoustic scenes using temporally-constrained shift-invariant probabilistic latent component analysis (SIPLCA). SIPLCA can be used for extracting time-frequency patches from spectrograms in an unsupervised manner. Component-wise hidden Markov models are incorporated to the SIPLCA formulation for enforcing temporal constraints on the activation of each acoustic component. The time-frequency patches are converted to cepstral coefficients in order to provide a compact representation of acoustic events within a scene. Experiments are made using a corpus of train station recordings, classified into 6 scene classes. Results show that the proposed model is able to model salient events within a scene and outperforms the non-negative matrix factorization algorithm for the same task. In addition, it is demonstrated that the use of temporal constraints can lead to improved performance.