- Published 2008

Bayesian statistical methods provide a formalism for arriving at solutions to various problems faced in audio processing. In real environments, acoustical conditions and sound sources are highly variable, yet audio signals often possess significant statistical structure. There is a great deal of prior knowledge available about why this statistical structure is present. This includes knowledge of the physical mechanisms by which sounds are generated, the cognitive processes by which sounds are perceived and, in the context of music, the abstract mechanisms by which high-level sound structure is compiled. Bayesian hierarchical techniques provide a natural means of unifying these bodies of prior knowledge, allowing the formulation of highly structured models for observed audio data and latent processes at various levels of abstraction. They also permit the inclusion of desirable modelling components such as change-point structures and model-order specifications. The resulting models exhibit complex statistical structure and, in practice, highly adaptive and powerful computational techniques are needed to perform inference. In this chapter, we review some of the statistical models and associated inference methods developed recently for audio and music processing. Our treatment is biased towards musical signals, yet the modelling strategies and inference techniques are generic and can be applied in a broader context to nonstationary time series analysis. We review application areas for audio processing, describe models appropriate for these scenarios and discuss the computational problems posed by inference in these models. We describe models in both the time domain and transform domains, the latter typically offering greater computational tractability and modelling flexibility at the expense of some accuracy in the models.
Inference in the models is performed using Monte Carlo methods as well as variational approaches originating in statistical physics. We hope to show that this field, which is still in its infancy compared to topics such as computer vision and speech recognition, has great potential for advancement in coming years, with the advent of powerful Bayesian inference methodologies and accompanying increases in computational power.

@inproceedings{Cemgil2008BayesianSM,
title={Bayesian Statistical Methods for Audio and Music Processing},
author={A. Taylan Cemgil and Simon J. Godsill and Paul H. Peeling and Nick Whiteley},
year={2008}
}