• Corpus ID: 17816355

Singer Identification in Polyphonic Music Using Vocal Separation and Pattern Recognition Methods

Annamaria Mesaros, Tuomas Virtanen, Anssi Klapuri
This paper evaluates methods for singer identification in polyphonic music, based on pattern classification together with an algorithm for vocal separation. Classification strategies include discriminant functions, a Gaussian mixture model (GMM)-based maximum likelihood classifier, and nearest neighbour classifiers using the Kullback-Leibler divergence between the GMMs. A novel method of estimating the symmetric Kullback-Leibler distance between two GMMs is proposed. Two different approaches to…
Automatic singer identification based on auditory features
The proposed system is shown to improve the performance of automatic singer identification in Music Information Retrieval (MIR), and it introduces a Gaussian Mixture Model (GMM) to model the singers' voices.
Voice singer detection in polyphonic music
  • H. Ezzaidi, M. Bahoura
  • Computer Science
    2009 16th IEEE International Conference on Electronics, Circuits and Systems (ICECS 2009)
  • 2009
A set of classification techniques based on features extracted from auditory models, which are commonly used in the speech and speaker recognition domains, is investigated, and it is observed that certain approaches are more appropriate for tracking the singer, while others are more appropriate for detecting the transition from music to the singer and vice versa.
On the Importance of Audio-Source Separation for Singer Identification in Polyphonic Music
This work extracts the singing vocals from polyphonic songs using a Wave-U-Net based audio-source separation approach and establishes the role of the considered method in singer identification.
Singer Identification Based on Artificial Neural Network
A unique feature extraction algorithm for singer identification is proposed; the highest training and testing accuracies achieved with this algorithm are 98.3% and 88.6%, respectively.
Singing voice identification and lyrics transcription for music information retrieval invited paper
  • A. Mesaros
  • Computer Science
    2013 7th Conference on Speech Technology and Human-Computer Dialogue (SpeD)
  • 2013
The results show that classification of singing voices can be done robustly in polyphonic music when using source separation, and a system for automatic alignment of lyrics and audio is presented, with sufficient performance for facilitating applications such as automatic karaoke annotation or song browsing.
Statistical and Neural Classifiers: Application for Singer and Music Discrimination in Polyphonic Music Context
Three classification techniques, the Linde-Buzo-Gray algorithm (LBG), Gaussian Mixture Models (GMM) and the feed-forward Multi-Layer Perceptron (MLP), are presented and compared; the best results are obtained with the GMM.
Popular singer identification based on cepstrum transformation
This study proposes a background music removal approach for singer identification (SID) that exploits the underlying relationships between solo voices and their accompanied versions in the cepstrum, and shows that such a background music removal approach improves the SID accuracy noticeably.
Sparse Modeling for Artist Identification: Exploiting Phase Information and Vocal Separation
It is argued that the phase information, which is usually overlooked in the literature, is also informative in modeling the voice timbre of a singer, given the necessary processing techniques.
Singer and music discrimination based threshold in polyphonic music
The problem of identifying sections of singer voice and instrument signals is addressed in this paper and simple and efficient threshold-based distance measurements for discrimination are used.
Vocal timbre analysis using latent Dirichlet allocation and cross-gender vocal timbre similarity
A method is presented for estimating cross-gender vocal timbre similarity by generating pitch-shifted (frequency-warped) signals of every singing voice; experimental results for a cross-gender singer retrieval task showed that the method discovered interesting similar pitch-shifted singers.


Singer Identification Based on Accompaniment Sound Reduction and Reliable Frame Selection
A method is described for automatic singer identification from polyphonic musical audio signals that include sounds of various instruments; it reduced the influence of accompaniment sounds and achieved an accuracy of 95%, while the accuracy of a conventional method was 53%.
System and Method for Automatic Singer Identification
A system for automatic singer identification is developed which recognizes the singer of a song by analyzing the music signal and assigning a song to the model having the best match.
Singing Voice Separation from Monaural Recordings
A system to separate singing voice from music accompaniment in monaural recordings is proposed, and quantitative results show that the system performs well in singing voice separation.
Transcription of the Singing Melody in Polyphonic Music
The method is based on multiple-F0 estimation followed by acoustic and musicological modeling, which produces a sequence of notes and rests as a transcription of the singing melody.
Song-Level Features and Support Vector Machines for Music Classification
This paper describes a new system, tested on the task of artist identification, that uses support vector machines to classify songs based on features calculated over their entire lengths, showing that this classifier outperforms similar classifiers that use only SVMs or song-level features.
One microphone singing voice separation using source-adapted models
A new adaptation method, consisting of a filter adaptation technique via maximum likelihood linear regression (MLLR), is presented with an associated filter-adapted training phase to improve separation quality.
Speech utterance clustering based on the maximization of within-cluster homogeneity of speaker voice characteristics.
  • W. Tsai, H. Wang
  • Computer Science
    The Journal of the Acoustical Society of America
  • 2006
The proposed clustering method begins by specifying a certain number of clusters, corresponding to one of the possible speaker population sizes, and then maximizes the level of overall within-cluster homogeneity of the speakers' voice characteristics.
Segregation of speakers for speech recognition and speaker identification
  • H. Gish, M. Siu, J. R. Rohlicek
  • Physics
    [Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing
  • 1991
A method for segregating speech from speakers engaged in dialogs employs a distance measure between speech segments used in conjunction with a clustering algorithm to perform the segregation.
Approximating the Kullback Leibler Divergence Between Gaussian Mixture Models
  • J. Hershey, P. Olsen
  • Computer Science
    2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07
  • 2007
Two new methods, the variational approximation and the variational upper bound, are introduced and compared to existing methods; the benefits of each are considered, and the performance of each is evaluated through numerical experiments.
Singer Identification in Popular Music using Warped Linear Prediction
The research presented in this paper attempts to automatically establish the identity of a singer using acoustic features extracted from songs in a database of popular music.