Open-Unmix - A Reference Implementation for Music Source Separation

Fabian-Robert Stöter, Stefan Uhlich, Antoine Liutkus, Yuki Mitsufuji
Journal of Open Source Software
Music source separation is the task of decomposing music into its constitutive components, e.g., yielding separated stems for the vocals, bass, and drums. Such a separation has many applications ranging from rearranging/repurposing the stems (remixing, repanning, upmixing) to full extraction (karaoke, sample creation, audio restoration). Music separation has a long history of scientific activity, as it is known to be a very challenging problem. In recent years, deep learning-based systems for…
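Deep learning separators of this kind typically operate on the mixture's time-frequency representation: a model predicts a mask per source, the mask is applied to the mixture spectrogram, and the result is inverted back to audio. The sketch below illustrates that pipeline with an oracle ratio mask standing in for the network's output; the signals and parameters are illustrative assumptions, not Open-Unmix's actual configuration.

```python
# Sketch of spectrogram-mask source separation. An "oracle" ratio mask
# (computed from the known sources) stands in for what a trained model
# such as Open-Unmix would predict from the mixture alone.
import numpy as np
from scipy.signal import stft, istft

rate = 8000
t = np.arange(rate) / rate                 # 1 second of audio
vocals = np.sin(2 * np.pi * 440 * t)       # stand-in "vocals" stem
bass = 0.5 * np.sin(2 * np.pi * 110 * t)   # stand-in "bass" stem
mix = vocals + bass

# STFTs of the mixture and of each (normally unknown) source
_, _, MIX = stft(mix, fs=rate, nperseg=512)
_, _, V = stft(vocals, fs=rate, nperseg=512)
_, _, B = stft(bass, fs=rate, nperseg=512)

# Ratio mask: fraction of time-frequency energy belonging to the vocals
eps = 1e-10
mask = np.abs(V) ** 2 / (np.abs(V) ** 2 + np.abs(B) ** 2 + eps)

# Apply the mask to the complex mixture STFT and invert to audio
_, vocals_est = istft(mask * MIX, fs=rate, nperseg=512)
vocals_est = vocals_est[: len(vocals)]

# Signal-to-distortion ratio of the estimate, in dB
err = vocals - vocals_est
sdr = 10 * np.log10(np.sum(vocals ** 2) / np.sum(err ** 2))
print(f"vocals SDR: {sdr:.1f} dB")
```

Because the two stand-in stems occupy well-separated frequency bins, the ratio mask recovers the "vocals" almost perfectly; real music overlaps far more in time-frequency, which is what makes the learning problem hard.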


Music Source Separation Using Generative Adversarial Network and U-Net
  • M. Satya, S. Suyanto
  • 2020 8th International Conference on Information and Communication Technology (ICoICT), 2020
A new model based on a Generative Adversarial Network (GAN) is proposed to separate the music sources to rebuild the sound sources that exist in the music.
User-Guided One-Shot Deep Model Adaptation for Music Source Separation
This work proposes to exploit a temporal segmentation provided by the user, that indicates when each instrument is active, in order to fine-tune a pre-trained deep model for source separation and adapt it to one specific mixture.
Music Demixing Challenge 2021
The Music Demixing Challenge was held on a crowd-based machine learning competition platform where the task is to separate stereo songs into four instrument stems; its dataset provides a wider range of music genres and involved a greater number of mixing engineers.
Fast accuracy estimation of deep learning based multi-class musical source separation
This work proposes a fast method to evaluate the separability of instruments in any dataset without training and tuning a DNN, an excellent proxy to estimate the separation performances of state-of-the-art deep learning approaches such as TasNet or Open-Unmix.
Research on DNN Methods in Music Source Separation Tools with emphasis to Spleeter
A study of DNN methods in music source separation (MSS) tools, with emphasis on Spleeter by Deezer: a set of pre-trained deep learning models for music source separation, written in Python using the TensorFlow machine learning library.
A Deep-Learning Based Framework for Source Separation, Analysis, and Synthesis of Choral Ensembles
This paper uses some of the publicly available choral singing datasets to train and evaluate state-of-the-art source separation algorithms from the speech and music domains for the case of choir singing, and evaluates existing monophonic F0 estimators on the separated unison stems.
Music Demixing Challenge at ISMIR 2021
The Music Demixing (MDX) Challenge is designed on a crowd-based machine learning competition platform where the task is to separate stereo songs into four instrument stems (Vocals, Drums, Bass, Other).
Enhanced Audio Source Separation and Musical Component Analysis
The proposed system aims to develop a universal platform-independent software for accurate domain-specific implementation of music source separation for acute subsets of stereo audio using the Bidirectional Long Short Term Memory (BLSTM) architecture of Recurrent Neural Networks.
Adding Context Information to Deep Neural Network based Audio Source Separation
A novel self-attention mechanism is proposed, which is able to filter out unwanted interferences and distortions by utilizing the repetitive nature of music.
A method that repurposes deep models trained for music generation and music tagging for audio source separation, without any retraining is presented, pointing to the vast and heretofore untapped potential of large pretrained music models for audio-to-audio tasks like source separation.


Improving music source separation based on deep neural networks through data augmentation and network blending
This paper describes two different deep neural network architectures for the separation of music into individual instrument tracks, a feed-forward and a recurrent one, and shows that each of them yields state-of-the-art results on the SiSEC DSD100 dataset.
OpenBliSSART: Design and evaluation of a research toolkit for Blind Source Separation in Audio Recognition Tasks
The toolkit openBliSSART (open-source Blind Source Separation for Audio Recognition Tasks) is described and evaluated, which provides the first open-source implementation of a widely applicable algorithmic framework based on non-negative matrix factorization (NMF), including several preprocessing, factorization, and signal reconstruction algorithms for monaural signals.
Wave-U-Net: A Multi-Scale Neural Network for End-to-End Audio Source Separation
The Wave-U-Net is proposed, an adaptation of the U-Net to the one-dimensional time domain, which repeatedly resamples feature maps to compute and combine features at different time scales and indicates that its architecture yields a performance comparable to a state-of-the-art spectrogram-based U-Net architecture, given the same data.
The 2018 Signal Separation Evaluation Campaign
This year's edition of SiSEC was focused on audio and pursued the effort towards scaling up and making it easier to prototype audio separation software in an era of machine-learning based systems, including a new music separation database: MUSDB18.
Open-Source Practices for Music Signal Processing Research: Recommendations for Transparent, Sustainable, and Reproducible Audio Research
Because of an increased abundance of methods, the proliferation of software toolkits, the explosion of machine learning, and a focus shift toward more realistic problem settings, modern research systems are substantially more complex than their predecessors.
Monoaural Audio Source Separation Using Deep Convolutional Neural Networks
A low-latency monaural source separation framework using a Convolutional Neural Network; the performance of the neural network is evaluated on a database comprising musical mixtures of three instruments as well as other instruments which vary from song to song.
MUSDB18-HQ - an uncompressed version of MUSDB18
MUSDB18-HQ is the uncompressed version of the MUSDB18 dataset. It consists of a total of 150 full-track songs of different styles and includes both the stereo mixtures and the original sources.
A General Flexible Framework for the Handling of Prior Information in Audio Source Separation
This paper introduces a general audio source separation framework based on a library of structured source models that enable the incorporation of prior knowledge about each source via user-specifiable constraints.
The Flexible Audio Source Separation Toolbox Version 2.0
The new version of the FASST toolbox, written in C++, is introduced, which provides a number of advantages compared to the first Matlab version: portability, faster computation, a simplified user interface, and support for more scripting languages.
The 2011 Signal Separation Evaluation Campaign (SiSEC2011): - Audio Source Separation -
This paper summarizes the audio part of the 2011 community-based Signal Separation Evaluation Campaign (SiSEC2011), including datasets recorded in noisy or dynamic environments and a subset of the SiSEC2010 datasets.