Corpus ID: 248563064

Investigation of Singing Voice Separation for Singing Voice Detection in Polyphonic Music

Yifu Sun, Xulong Zhang, Yi Yu, Xi Chen, W. Li
This paper investigates singing voice separation as a preprocessing step for singing voice detection in polyphonic music. Exploiting the temporal continuity of the singing voice, the proposed approach uses a filter to eliminate outliers and outperforms state-of-the-art works on two public datasets.


MetaSID: Singer Identification with Domain Adaptation for Metaverse

This paper proposes the use of the domain adaptation method to address the live-performance effect in singer identification. Experimental results on the public Artist20 dataset show that CRNN-MMD improves over the baseline CRNN by 0.14, and CRNN-RevGrad outperforms the baseline by 0.21.

TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-speaker Low-Resource TTS

The Target Domain Adaptation Speech Synthesis Network (TDASS) is proposed, a high-quality TTS model built on the Tacotron2 backbone that introduces a self-interested classifier to reduce the influence of non-target speakers.

SUSing: SU-net for Singing Voice Synthesis

The proposed SU-net for singing voice synthesis, named SUSing, treats synthesis as a translation task between lyrics with music score and the spectrum, and uses the stripe pooling method in place of global pooling to learn the vertical frequency relationship in the spectrum and the changes of frequency in the time domain.

MDCNN-SID: Multi-scale Dilated Convolution Network for Singer Identification

This paper proposes an end-to-end architecture that learns singer embeddings directly in the waveform domain and achieves performance comparable to related works on the benchmark dataset of Artist20.

Singing voice detection with deep recurrent neural networks

A new method for singing voice detection based on a Bidirectional Long Short-Term Memory (BLSTM) Recurrent Neural Network (RNN) that is able to take a past and future temporal context into account to decide on the presence/absence of singing voice.

Timbre and Melody Features for the Recognition of Vocal Activity and Instrumental Solos in Polyphonic Music

This paper proposes the task of detecting instrumental solos in polyphonic music recordings, and the usage of a set of four audio features for vocal and instrumental activity detection, using a support vector machine hidden Markov model.

An Overview of Lead and Accompaniment Separation in Music

This article provides a comprehensive review of this research topic, organizing the different approaches according to whether they are model-based or data-centered, and presents the results of the largest evaluation to date of lead and accompaniment separation systems.

Exploring Data Augmentation for Improved Singing Voice Detection with Neural Networks

A range of label-preserving audio transformations are applied, and pitch shifting is found to be the most helpful augmentation method for music data, reaching the state of the art on two public datasets.
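As a toy illustration of the kind of pitch-shifting transform described above, the sketch below shifts a tone by resampling. This is a minimal, hedged version: the helper name is hypothetical, and resampling alone also changes clip duration, whereas practical augmentation pipelines (e.g. a phase-vocoder time-stretch followed by resampling) preserve length.

```python
import numpy as np

def naive_pitch_shift(x, semitones):
    """Hypothetical helper: shift pitch by resampling with linear
    interpolation. Note this also shortens/lengthens the clip; real
    pipelines pair it with a time-stretch to preserve duration."""
    rate = 2.0 ** (semitones / 12.0)     # frequency scaling factor
    n_out = int(len(x) / rate)           # new length after resampling
    src = np.arange(n_out) * rate        # fractional source positions
    return np.interp(src, np.arange(len(x)), x)

# A 440 Hz tone shifted up one octave (+12 semitones) becomes 880 Hz.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
shifted = naive_pitch_shift(tone, 12)
```

Because the transform leaves the vocal/non-vocal label unchanged, each training clip can be reused at several pitch offsets.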

Comparing audio descriptors for singing voice detection in music audio files

An effective statistical classification system with a reduced set of descriptors for singing voice detection in music audio files is presented, and it is concluded that MFCCs are the most appropriate.

Singer Identification Using Deep Timbre Feature Learning with KNN-NET

The proposed KNN-Net for SID is a deep neural network model with the goal of learning local timbre feature representation from the mixture of singer voice and background music, which outperforms the state-of-the-art methods.

Transfer Learning for Improving Singing-voice Detection in Polyphonic Instrumental Music

In this study, clean speech clips with voice activity endpoints and separate instrumental music clips are artificially mixed to simulate polyphonic vocals for training a vocal/non-vocal detector; the proposed transfer-learning-based data augmentation method improves singing-voice detection (S-VD) performance.

Joint Detection and Classification of Singing Voice Melody Using Convolutional Recurrent Neural Networks

A joint detection and classification network that conducts the singing voice detection and the pitch estimation simultaneously and outperforms state-of-the-art algorithms on the datasets is presented.

On sparse and low-rank matrix decomposition for singing voice separation

To better account for the particular properties of music, two new algorithms are proposed to improve the decomposition, including the incorporation of harmonicity priors and a back-end drum removal procedure.
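For context, decomposition-based separation of this kind typically builds on Robust PCA: the mixture magnitude spectrogram M is split into a low-rank part L (the repetitive accompaniment) and a sparse part S (the singing voice). The sketch below is a generic inexact augmented-Lagrangian implementation with common default parameters, not the cited paper's exact algorithm (which additionally incorporates harmonicity priors and drum removal):

```python
import numpy as np

def rpca(M, lam=None, mu=None, n_iter=300):
    """Decompose M into low-rank L + sparse S via an inexact
    augmented-Lagrangian scheme (generic sketch, standard defaults)."""
    m, n = M.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    mu = mu if mu is not None else 0.25 * m * n / np.abs(M).sum()
    L = np.zeros_like(M)
    S = np.zeros_like(M)
    Y = np.zeros_like(M)
    for _ in range(n_iter):
        # L-update: singular value thresholding of the residual
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0)) @ Vt
        # S-update: elementwise soft thresholding
        R = M - L + Y / mu
        S = np.sign(R) * np.maximum(np.abs(R) - lam / mu, 0)
        # dual ascent on the constraint M = L + S
        Y = Y + mu * (M - L - S)
    return L, S

# Synthetic check: a rank-1 "accompaniment" plus sparse "voice" spikes.
rng = np.random.default_rng(0)
L_true = np.outer(rng.standard_normal(40), rng.standard_normal(30))
S_true = np.where(rng.random((40, 30)) < 0.05, 5.0, 0.0)
L_est, S_est = rpca(L_true + S_true)
```

In audio use, M would be the mixture magnitude spectrogram, and the vocal estimate would be resynthesized from S using the mixture phase.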

Singing Voice Separation with Deep U-Net Convolutional Networks

This work proposes a novel application of the U-Net architecture, initially developed for medical imaging, to the task of source separation, given its proven capacity for recreating the fine, low-level detail required for high-quality audio reproduction.
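Spectrogram-domain separators of this kind typically turn the network's magnitude estimates into audio through soft (Wiener-style) masking of the mixture's complex STFT, reusing the mixture phase. The sketch below illustrates that final masking step in isolation; all names are hypothetical and not the paper's API:

```python
import numpy as np

def soft_mask_voice(mix_stft, voice_mag, acc_mag, eps=1e-8):
    """Wiener-style ratio mask: each time-frequency bin goes to the
    voice in proportion to its estimated magnitude share; the mixture
    phase is kept unchanged."""
    mask = voice_mag / (voice_mag + acc_mag + eps)
    return mask * mix_stft

# Toy 1x2 "spectrogram": bin 0 is voice-dominated, bin 1 accompaniment-only.
mix = np.array([[4.0 + 0.0j, 0.0 + 2.0j]])
voice_est = soft_mask_voice(mix,
                            np.array([[3.0, 0.0]]),   # estimated voice magnitudes
                            np.array([[1.0, 2.0]]))   # estimated accompaniment magnitudes
```

Inverting the masked STFT (e.g. by overlap-add ISTFT) then yields the separated vocal waveform.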