Corpus ID: 24946551

Large-Scale Bird Sound Classification using Convolutional Neural Networks

Stefan Kahl, Thomas Wilhelm-Stein, Hussein Hussein, Holger Klinck, Danny Kowerko, Marc Ritter, Maximilian Eibl

Identifying bird species in audio recordings is a challenging field of research. In this paper, we summarize a method for large-scale bird sound classification in the context of the LifeCLEF 2017 bird identification task. We used a variety of convolutional neural networks to generate features extracted from visual representations of field recordings. The BirdCLEF 2017 training dataset consists of 36,496 audio recordings containing 1500 different bird species. Our approach achieved a mean average…

Audio-based Bird Species Identification with Deep Convolutional Neural Networks

The proposed approach is evaluated in the BirdCLEF 2018 campaign, provides the best system in all subtasks, and surpasses the previous state of the art by 15.8 % when identifying foreground species and by 20.2 % when also considering background species.

Bird Sound Classification Using Convolutional Neural Networks

The Inception model achieved a classification mean average precision (c-mAP) of 0.16 and ranked second among the five teams that successfully submitted predictions in the BirdCLEF 2019 competition.

Detection of Bird Species Through Sounds

This paper converts audio snippets into spectrograms and uses a convolutional neural network to classify these images. The model starts from a baseline CNN architecture and is then improved by tuning hyperparameters (learning rate, number of epochs, and number of hidden layers) and by using dropout for regularization.

A Baseline for Large-Scale Bird Species Identification in Field Recordings

This work discusses an attempt at large-scale bird species identification using the 2018 BirdCLEF baseline system and its implications for future research in acoustic event classification.

Acoustic bird detection with deep convolutional neural networks

This paper presents deep learning techniques for acoustic bird detection and provides the best system for the task, surpassing the previous state of the art with an area under the curve (AUC) above 95 % on the public challenge leaderboard.

Bird call recognition using deep convolutional neural network, ResNet-50

This paper uses ResNet-50, a deep convolutional neural network architecture, for automated bird call recognition in acoustic recordings, achieving 60%-72% recognition accuracy on a publicly available dataset of calls from 46 different bird species.

Cross-domain Deep Feature Combination for Bird Species Classification with Audio-visual Data

This paper proposes CNN-based multimodal learning models with three types of fusion strategy (early, middle, late) to address the problem of combining training data across domains, and shows that transfer learning can significantly increase classification performance.

Construction and Improvements of Bird Songs' Classification System

This work trained a convolutional neural network on both spectrograms extracted from recordings and additionally provided metadata describing the soundscape situation, and applied bird event detection to reduce false alarms.

Bird Sound Recognition Using a Convolutional Neural Network

Results suggest that choosing a color map in line with the images the network has been pre-trained with provides a measurable advantage, and the presented system is viable only for a low number of classes.



Recognizing Bird Species in Audio Recordings using Deep Convolutional Neural Networks

This paper summarizes a method for purely audio-based bird species recognition through the application of convolutional neural networks, evaluated in the context of the LifeCLEF 2016 bird identification task, an open challenge conducted on a dataset representing 999 bird species from South America.

LifeCLEF Bird Identification Task 2017

An overview of the systems developed by the five participating research groups, the methodology of the evaluation of their performance, and an analysis and discussion of the results obtained are reported.

Bird Song Classification in Field Recordings: Winning Solution for NIPS4B 2013 Competition *

The goal of the NIPS4B competition is to identify which of the 87 sound classes of birds and amphibians are present in 1000 continuous wildlife recordings, using only the provided audio files and machine learning algorithms for automatic pattern recognition.

AENet: Learning Deep Audio Features for Video Analysis

A convolutional neural network operating on a large temporal input enables an end-to-end audio event detection system. Transfer learning shows that the model learned generic audio features, similar to the way CNNs learn generic features on vision tasks.

Going deeper with convolutions

We propose a deep convolutional neural network architecture codenamed Inception that achieves the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition

Learning Spatiotemporal Features with 3D Convolutional Networks

The learned features, namely C3D (Convolutional 3D), with a simple linear classifier outperform state-of-the-art methods on 4 different benchmarks and are comparable with current best methods on the other 2 benchmarks.

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.

Deep Residual Learning for Image Recognition

This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.