Dynamic time warping and sparse representation classification for birdsong phrase classification using limited training data.

@article{Tan2015DynamicTW,
  title={Dynamic time warping and sparse representation classification for birdsong phrase classification using limited training data},
  author={Lee Ngee Tan and Abeer Alwan and George Kossan and Martin L. Cody and Charles E. Taylor},
  journal={The Journal of the Acoustical Society of America},
  year={2015},
  volume={137},
  number={3},
  pages={1069--1080}
}
  • L. Tan, A. Alwan, C. Taylor
  • Published 17 March 2015
  • Computer Science, Environmental Science
  • The Journal of the Acoustical Society of America
Annotation of phrases in birdsongs can be helpful to behavioral and population studies. To reduce the need for manual annotation, an automated birdsong phrase classification algorithm for limited data is developed. Limited data occur because of limited recordings or the existence of rare phrases. In this paper, classification of up to 81 phrase classes of Cassin's Vireo is performed using one to five training samples per class. The algorithm involves dynamic time warping (DTW) and two passes of… 
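As a rough illustration of the DTW component mentioned in the abstract, the following is a minimal sketch of nearest-template classification with a textbook DTW cost. It is not the authors' algorithm: the abstract is truncated before the remaining steps are described, and the function names `dtw_distance` and `classify`, the Euclidean frame distance, and the toy one-dimensional features are all assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of feature vectors.

    `a` and `b` are lists of equal-dimension float lists (one per frame).
    Hypothetical helper for illustration, not taken from the paper.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance (assumed)
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of `a` aligned to a repeated frame of `b`
                cost[i][j - 1],      # frame of `b` aligned to a repeated frame of `a`
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def classify(query, templates):
    """Return the label of the template with the smallest DTW cost.

    `templates` maps a phrase label to a list of training sequences,
    mirroring the "one to five training samples per class" setting.
    """
    best = min(
        (dtw_distance(query, seq), label)
        for label, seqs in templates.items()
        for seq in seqs
    )
    return best[1]
```

With only a handful of templates per class, this kind of exhaustive comparison stays cheap, which is one reason template matching is attractive when training data are limited.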

A robust automatic birdsong phrase classification: A template-based approach.

The innovation of this work is that the proposed algorithm is robust to both limited training data and background noise, and outperforms DTW and HMMs in most training and testing conditions, usually by a wide margin when the background noise level is high.

A novel deep transfer learning models for recognition of birds sounds in different environment

The current work verifies that deep transfer learning models like ResNet50, DenseNet201, InceptionV3, Xception and EfficientNet can effectively extract and recognize the audio signals from different bird species with significant prediction accuracy.

Bird Song Recognition Based on Deep Transfer Learning with XGBoost

Calculating delta and delta-delta for log-Mel spectrogram to form 3D time-frequency spectrogram based on log-Mels not only preserves species differentiation but also reduces the influence of bird song-independent factors in the actual scene.
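The delta and delta-delta stacking described in this summary can be sketched as follows. This is only an illustration under stated assumptions: the log-Mel spectrogram is taken as already computed and stored as a list of frames, the regression-style delta with a window of 2 and edge clamping is a common convention rather than something the summary specifies, and the names `delta` and `stack_channels` are hypothetical.

```python
def delta(frames, width=2):
    """Regression-based time derivative of a feature sequence.

    `frames` is a list of frames, each a list of floats (e.g. log-Mel bands).
    Frames beyond the edges are clamped to the first/last frame (assumed).
    """
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                nxt = frames[min(t + n, num_frames - 1)][k]
                prv = frames[max(t - n, 0)][k]
                acc += n * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_channels(log_mel):
    """Return [static, delta, delta-delta] as three same-shaped channels."""
    d1 = delta(log_mel)
    d2 = delta(d1)
    return [log_mel, d1, d2]
```

The three channels share one time-frequency shape, so they can be fed to an image-style network in the same way as an RGB image.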

Bird sounds classification by large scale acoustic features and extreme learning machine

This study presents a novel framework for bird sounds classification from audio recordings: the p-centre is used to detect the 'syllables' of bird songs, which are the units for the recognition task; then the openSMILE toolkit is used to extract large-scale acoustic features from these chunked units of analysis (the 'syllables').

Robust Automatic Recognition of Birdsongs and Human Speech: a Template-Based Approach

This dissertation focuses on an automatic birdsong-phrase recognition system that is robust to limited training data, class variability, and noise and proposes a new pitch-based spectral enhancement algorithm based on voiced frames for speech analysis and noise-robust speech processing.

Classification of bird song syllables using Wigner-Ville ambiguity function cross-terms

A novel feature extraction method based on the cross-term doppler- and lag profiles for low-dimensional signal representation gives a better classification than established methods used in bird song analysis.

Set A datapoints Set B datapoints Class 1 ( C 1 ) Archetypes

A new classification framework that combines the characteristics of matrix factorization with the discriminative capabilities of kernel methods is introduced. Introducing AA and DAA into the IMK framework leads to a noticeable improvement in classification accuracy.

Noise-Robust Hidden Markov Models for Limited Training Data for Within-Species Bird Phrase Classification

This paper presents two novel approaches to training continuous and discrete HMMs with extremely limited data by learning the global Gaussian Mixture Models for all training phrases available.

A sparse representation-based classifier for in-set bird phrase verification and classification with limited training data

When evaluated against the nearest subspace (NS) and support vector machine (SVM) classifiers using the same framework, the SR classifier has the highest classification accuracy, due to its good performance in both the verification and classification tasks.

Evaluation of a Sparse Representation-Based Classifier For Bird Phrase Classification Under Limited Data Conditions

The SR classifier outperforms the NS and SVM classifiers, with a maximum absolute improvement of 3.4% observed when there are only four tokens per phrase in the training set.

A robust automatic bird phrase classifier using dynamic time-warping with prominent region identification

A novel approach to birdsong phrase classification using template-based techniques is suitable even for limited training data and noisy environments; in the presence of additive noise, its performance does not degrade significantly compared to others.

Automated recognition of bird song elements from continuous recordings using dynamic time warping and hidden Markov models: a comparative study.

The performance of two techniques is compared for automated recognition of bird song units from continuous recordings, and one gives excellent to satisfactory performance and the other requires careful selection of templates.

Semi-Automatic Classification of Birdsong Elements Using a Linear Support Vector Machine

A linear-kernel support vector machine was employed to minimize the number of human-labeled samples needed for reliable element classification in birdsong, and to enable the classifier to handle high-dimensional acoustic features while avoiding the over-fitting problem.

Performance tradeoffs in dynamic time warping algorithms for isolated word recognition

The results suggest a new approach to dynamic time warping for isolated words in which both the reference and test patterns are linearly warped to a fixed length, and then a simplified dynamic time warping algorithm is used to handle the nonlinear component of the time alignment.

Semi-automatic classification of bird vocalizations using spectral peak tracks.

An experiment using computer software to perform peak tracking of spectral analysis data demonstrates the usefulness of the sum-of-sinusoids model for rapid automatic recognition of isolated bird syllables.

Embedding time warping in exemplar-based sparse representations of speech

A novel sparse representation model which allows time warping using a third matrix which linearly combines consecutive frames in order to shrink or expand the approximation is introduced.

Automatic Recognition of Bird Songs Using Time-Frequency Texture

  • Sha-Sha Chen, Ying Li
  • Computer Science
    2013 5th International Conference on Computational Intelligence and Communication Networks
  • 2013
A new approach for identifying birds automatically from their sounds first converts the bird songs into spectrograms and then extracts texture features from this visual time-frequency representation; it outperforms the well-known MFCC features.

Continuous Birdsong Recognition Using Gaussian Mixture Modeling of Image Shape Features

A new feature descriptor that uses image shape features is proposed to identify bird species based on the recognition of fixed-duration birdsong segments, where the corresponding spectrograms are viewed as gray-level images; it performs better than traditional descriptors such as LPCC, MFCC, and TDMFCC.