Improved multiple birdsong tracking with distribution derivative method and Markov renewal process clustering

Dan Stowell, Saso Musevic, Jordi Bonada, Mark D. Plumbley
2013 IEEE International Conference on Acoustics, Speech and Signal Processing

Segregating an audio mixture containing multiple simultaneous bird sounds is a challenging task. However, birdsong often contains rapid pitch modulations, and these modulations carry information which may be of use in automatic recognition. In this paper we demonstrate that an improved spectrogram representation, based on the distribution derivative method, leads to improved performance of a segregation algorithm which uses a Markov renewal process model to track vocalisation patterns… 

Acoustic detection of multiple birds in environmental audio by Matching Pursuit

We describe a submission to the ICML 2013 Bird Challenge, in which we explore the use of sparse representations as an advance on the standard technique of cross-correlation template matching in

Modeling Lemur vocalizations from a signal processing perspective

This thesis creates acoustic representations of black-and-white ruffed lemur vocalizations based on signal processing techniques, and introduces a hidden Markov model framework, an approach that has been very successful in the context of speech synthesis.

Segregating event streams and noise with a Markov renewal process model

An approach to inferring structure in timestamped data produced by a mixture of an unknown and varying number of similar Markov renewal processes, plus independent clutter noise, is developed.

Feature design for multilabel bird song classification in noise (NIPS4B challenge)

Our submission to the challenge therefore focusses on feature design for two goals: noise robustness, and the representation of temporal structure. We first analyse each sound file into basic

Bird Identification Based On the Sound Using Neural Networks

A neural network model, combined with machine learning classifiers, identifies the bird call in an audio file by directing the model to pay close attention to the spectrogram images extracted from audio clips of birds.

Call recognition and individual identification of fish vocalizations based on automatic speech recognition: An example with the Lusitanian toadfish.

A pattern recognition methodology based on hidden Markov models is presented, inspired by successful results obtained for the most widely known and complex acoustic communication signal, human speech. It proved to be a powerful tool for assessing signal durations in large data sets.

Dynamic time warping and sparse representation classification for birdsong phrase classification using limited training data.

An automated birdsong phrase classification algorithm for limited data is developed and achieves the highest classification accuracies of 94% and 89% on manually segmented and automatically segmented phrases, respectively, from unseen Cassin's Vireo individuals, using five training samples per class.

Automatic detection, classification, and quantification of sciaenid fish calls in an estuarine soundscape in the Southeast United States

A feature-based signal detector for sciaenid fish calls was designed and tested. The efficacy of this detector against manually reviewed data was found to be significant, and precision varied from 61% to 100%.

Denoising without access to clean data using a partitioned autoencoder

This work introduces a method to train an autoencoder using only noisy data, having examples with and without the signal class of interest, and illustrates the method by denoising birdsong audio (available abundantly in uncontrolled noisy datasets) using a convolutional autoencoder.


It is shown that accurate estimation can be obtained when the sets of atoms for which the inner product is large are disjoint, and a technique for designing once-differentiable approximations to windows is presented.



Framewise heterodyne chirp analysis of birdsong

A single-scale chirp analysis (computationally accelerated using FFT) which can be treated as an ordinary time-series is developed and it is shown that it can lead to improved species classification from birdsong.

Acoustic classification of multiple simultaneous bird species: a multi-instance multi-label approach.

This work formulates the problem of classifying the set of species present in an audio recording using the multi-instance multi-label (MIML) framework for machine learning, and proposes a MIML bag generator for audio, i.e., an algorithm which transforms an input audio signal into a bag-of-instances representation suitable for use with MIML classifiers.

Variational Inference in Non-negative Factorial Hidden Markov Models for Efficient Audio Source Separation

This work presents a Bayesian variant of the N-FHMM suited to an efficient variational inference algorithm, whose complexity is linear in the number of sound sources.

Birdsong and C4DM: A survey of UK birdsong and machine recognition for music researchers

This technical report reviews the state of the art in machine recognition of UK birdsong and describes a prototype Vamp plugin developed for birdsong clustering and segmentation.

Improving the readability of time-frequency and time-scale representations by the reassignment method

The reassignment method, first applied by Kodera, Gendrin, and de Villedary (1976) to the spectrogram, is generalized to any bilinear time-frequency or time-scale distribution.

A comparative study in birds: call-type-independent species and individual recognition using four machine-learning methods and two acoustic features

The results of this experiment indicated that support vector machines (SVM) performed best generally, regardless of which acoustic feature was used, linear predictive coefficients (LPCs) increased the recognition accuracies of hidden Markov models (HMM) greatly, and the most appropriate classifiers for LPCs and Mel-frequency cepstral coefficients (MFCCs) were HMM and SVM respectively.


This paper reviews the derivative method and explores its capacity for estimating time-varying sinusoids of complicated parameter variations. The method is reformulated on a generalized signal

A continental-scale tool for acoustic identification of European bats

iBatsID is the first freely available and easily accessible continental-scale bat call classifier, providing the basis for standardized, continental acoustic bat monitoring in Europe, and can provide key information to managers and conservation planners on distribution changes and changes in bat species activity through time.

Neighbours' talk: interspecific choruses among songbirds

It is proposed that evolution has selected individuals that are more skilled at learning heterospecific songs, where such choruses are the expression of the existence of a neighbourhood of different species that has evolved a communal signal that is similar to that used by mates in a pair.