librosa: Audio and Music Signal Analysis in Python

Brian McFee, Colin Raffel, Dawen Liang, Daniel P. W. Ellis, Matt McVicar, Eric Battenberg, Oriol Nieto
This document describes version 0.4.0 of librosa: a Python package for audio and music signal processing. At a high level, librosa provides implementations of a variety of common functions used throughout the field of music information retrieval. In this document, a brief overview of the library's functionality is provided, along with explanations of the design goals, software development practices, and notational conventions.

Digital Audio Processing Tools for Music Corpus Studies
This chapter summarises the types of information that can be extracted from audio as well as currently available audio tools for music corpus studies and includes both a primer on signal processing and background theory on audio feature extraction.
Open-Source Practices for Music Signal Processing Research: Recommendations for Transparent, Sustainable, and Reproducible Audio Research
Because of an increased abundance of methods, the proliferation of software toolkits, the explosion of machine learning, and a focus shift toward more realistic problem settings, modern research systems are substantially more complex than their predecessors.
Omnizart: A General Toolbox for Automatic Music Transcription
Omnizart is the first transcription toolkit to offer models covering a wide class of instruments, ranging from solo instruments and instrument ensembles to percussion and vocals, as well as models for chord recognition and beat/downbeat tracking, two music information retrieval tasks closely related to AMT.
madmom: A New Python Audio and Music Signal Processing Library
Madmom is an open-source audio processing and music information retrieval (MIR) library written in Python that features a concise, NumPy-compatible, object oriented design with simple calling conventions and sensible default values for all parameters that facilitates fast prototyping of MIR applications.
PyTSMod: a Python Implementation of Time-scale Modification Algorithms
PyTSMod is an open-source Python library that implements several different classical TSM algorithms, and it is expected that PyTSMod can help MIR and audio researchers easily use the TSM algorithms in the Python-based environment.
Sync Toolbox: A Python Package for Efficient, Robust, and Accurate Music Synchronization
A Python package called Sync Toolbox is introduced that provides open-source reference implementations for full-fledged music synchronization pipelines and yields state-of-the-art alignment results for a wide range of Western music.
Automatic Music Transcription: Generating MIDI From Audio
This work constructs a bidirectional LSTM network with convolutional and pooling layers in an attempt to symbolically classify audio representations of classical piano music; the authors believe the methods are basically sound.
PySOX: Leveraging the Audio Signal Processing Power of SOX in Python
SoX is a popular command line tool for sound processing. Among many other processes, it allows users to perform a repeated process (e.g. file conversion) over a large batch of audio files and apply a
Interactive Learning of Signal Processing Through Music
It is shown how music may serve as a vehicle to support education in signal processing, and how software tools originally developed for music analysis provide students multiple entry points to delve deeper into classical signal processing techniques, while bridging the gap between education and cutting-edge research.
Visual Display and Retrieval of Music Information
Computational methods for the visual display and analysis of music information are described, along with use cases that showcase the described software, descriptors and visualizations.


Signal Processing Methods for Music Transcription
This book serves as an ideal starting point for newcomers and an excellent reference source for people already working in the field and could be used as a textbook for advanced courses in music signal processing.
Sonic visualiser: an open source application for viewing, analysing, and annotating music audio files
Sonic Visualiser is a friendly and flexible end-user desktop application for analysis, visualisation, and annotation of music audio files that has a user interface that resembles familiar audio editing applications, a set of useful standard visualisation facilities, and support for a plugin format for additional automated analysis methods.
Detecting harmonic change in musical audio
Initial experiments show that the algorithm can successfully detect harmonic changes such as chord boundaries in polyphonic audio recordings.
Beat Tracking with Dynamic Programming
This paper describes the beat-tracking system, which operates by first estimating a global tempo, then using dynamic programming to find the best sequence of beat times through the whole piece that both places beats on moments of high ‘onset strength’, as well as maintaining a spacing between beats that agrees with the global tempo.
Essentia: An Audio Analysis Library for Music Information Retrieval
Paper presented at the 14th International Society for Music Information Retrieval Conference, held in Curitiba (Brazil) from 4 to 8 November 2013.
Harmonic/Percussive Separation Using Median Filtering
In this paper, we present a fast, simple and effective method to separate the harmonic and percussive parts of a monaural audio signal. The technique involves the use of median filtering on a
This paper proposes a computationally efficient method for computing the constant-Q transform (CQT) of a time-domain signal. CQT refers to a time-frequency representation where the frequency bins are
Music type classification by spectral contrast feature
The octave-based spectral contrast feature is proposed to represent the spectral characteristics of a music clip; it represents the relative spectral distribution rather than the average spectral envelope.
Unsupervised Detection of Music Boundaries by Time Series Structure Features
This paper proposes an unsupervised method for boundary detection, combining three basic principles: novelty, homogeneity, and repetition, which is applicable to a wide range of time series beyond the music and audio domains.
Selective Sampling for Beat Tracking Evaluation
In this paper, we propose a method that can identify challenging music samples for beat tracking without ground truth. Our method, motivated by the machine learning method “selective sampling,” is