Corpus ID: 6573918

Evaluation Framework for Automatic Singing Transcription

@inproceedings{Molina2014EvaluationFF,
  title={Evaluation Framework for Automatic Singing Transcription},
  author={Emilio Molina and Ana M. Barbancho and Lorenzo J. Tard{\'o}n and Isabel Barbancho},
  booktitle={ISMIR},
  year={2014}
}
In this paper, we analyse the evaluation strategies used in previous works on automatic singing transcription, and we present a novel, comprehensive and freely available evaluation framework for automatic singing transcription. This framework consists of a cross-annotated dataset and a set of extended evaluation measures, which are integrated in a Matlab toolbox. The presented evaluation measures are based on standard MIREX note-tracking measures, but they provide extra information about the… 
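For context, the standard MIREX note-tracking measures that the framework builds on count an estimated note as correct when its onset falls within ±50 ms of a reference onset and its pitch within ±0.5 semitone (offsets are scored by a separate criterion). The following is a minimal Python sketch of that onset/pitch matching idea, not the toolbox's actual implementation: it uses greedy matching for brevity, whereas the official evaluators compute an optimal one-to-one matching, and the function and parameter names here are illustrative.

```python
def match_notes(ref, est, onset_tol=0.05, pitch_tol=0.5):
    """Greedy one-to-one note matching.

    ref, est: lists of (onset_seconds, pitch_in_semitones) tuples.
    A pair matches if onsets differ by <= onset_tol seconds (50 ms)
    and pitches differ by <= pitch_tol semitones (a quarter tone
    either side). Returns (precision, recall, F-measure).
    """
    used = set()   # indices of estimated notes already matched
    matched = 0
    for r_onset, r_pitch in ref:
        for j, (e_onset, e_pitch) in enumerate(est):
            if j in used:
                continue
            if (abs(e_onset - r_onset) <= onset_tol
                    and abs(e_pitch - r_pitch) <= pitch_tol):
                used.add(j)
                matched += 1
                break
    precision = matched / len(est) if est else 0.0
    recall = matched / len(ref) if ref else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure
```

For example, with three reference notes and three estimates of which one has an onset 0.5 s off, two notes match, giving precision, recall, and F-measure of 2/3 each.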
VOCANO: A note transcription framework for singing voice in polyphonic music
TLDR: VOCANO, an open-source VOCAl NOte transcription framework built on robust neural networks with multi-task and semi-supervised learning, is presented; it outperforms the state of the art on public benchmarks over a wide variety of evaluation metrics.
Computer-aided Melody Note Transcription Using the Tony Software: Accuracy and Efficiency
TLDR: Tony, a software tool for the interactive annotation of melodies from monophonic audio recordings, is presented, and it is shown that Tony's built-in automatic note transcription method compares favourably with existing tools.
vocadito: A dataset of solo vocals with f0, note, and lyric annotations
TLDR: This work presents a small dataset entitled vocadito, consisting of 40 short excerpts of monophonic singing, sung in 7 different languages by singers with varying levels of training and recorded on a variety of devices.
Automatic Solfège Assessment
TLDR: Experimental results indicate that the classification scheme, implemented using a Bayesian classifier, is suitable for use as an assessment tool that provides useful feedback to the student.
Omnizart: A General Toolbox for Automatic Music Transcription
TLDR: Omnizart is the first transcription toolkit offering models that cover a wide range of instruments, from solo instruments and ensembles to percussion and vocals, as well as models for chord recognition and beat/downbeat tracking, two music information retrieval tasks highly related to AMT.
Heo and Lee: Robust Singing Transcription System Using Local Homogeneity in the
Automatic music transcription from audio has long been one of the most intriguing problems and a challenge in the field of music information retrieval, because it requires a series of low-level tasks
Toward Expressive Singing Voice Correction: On Perceptual Validity of Evaluation Metrics for Vocal Melody Extraction
TLDR: A streamlined system to automate expressive SVC for both pitch and rhythmic errors is presented, and the perceptual validity of the standard metrics is investigated through the lens of SVC, suggesting that high pitch accuracy under the metrics does not signify good perceptual scores.
Hierarchical Classification Networks for Singing Voice Segmentation and Transcription
TLDR: The proposed method, implemented with residual networks, provides improved performance over prior art in onset and offset detection, and by integrating with a pitch detection framework it also outperforms previous singing voice transcription methods.
Score-Informed Syllable Segmentation for A Cappella Singing Voice with Convolutional Neural Networks
TLDR: This paper introduces a new score-informed method for segmenting jingju a cappella singing phrases into syllables that outperforms the state of the art in syllable segmentation for jingju a cappella singing.
Score-informed syllable segmentation for Jingju a cappella singing voice with Mel-frequency intensity profiles
This paper introduces a new unsupervised and score-informed method for the segmentation of singing voice into syllables. The main idea of the proposed method is to detect the syllable onset on a

References

Showing 1–10 of 18 references
Towards Computer-Assisted Flamenco Transcription: An Experimental Comparison of Automatic Transcription Algorithms as Applied to A Cappella Singing
TLDR: A transcription system based on fundamental frequency and energy estimation, incorporating an iterative strategy for note segmentation and labeling, is proposed; it outperforms a state-of-the-art approach designed for other singing styles.
Explicit Transition Modelling for Automatic Singing Transcription
TLDR: A system for the automatic transcription of solo human singing into note sequences is presented; hidden Markov models represent both individual notes and the transitions between them in order to capture the variability of the estimated pitch within a statistical framework.
An Auditory Model Based Transcriber of Singing Sequences
TLDR: A new system for the automatic transcription of singing sequences into a sequence of pitch and duration pairs is presented, and it is shown that the accuracy of the newly proposed transcription system is not very sensitive to the choice of the free parameters, at least as long as they remain in the vicinity of the values one could forecast on the basis of their meaning.
Recent improvements of an auditory model based front-end for the transcription of vocal queries
TLDR: Experiments have shown that the new system can transcribe vocal queries with an accuracy ranging from 76% (whistling) to 85% (humming), and that it clearly outperforms other state-of-the-art systems on all three query types.
Fundamental frequency alignment vs. note-based melodic similarity for singing voice assessment
TLDR: The results show that the proposed system is suitable for automatic singing voice rating and that DTW-based measures are especially simple and effective for intonation and rhythm assessment.
Modelling of note events for singing transcription
TLDR: The method produces symbolic notations from acoustic inputs based on two probabilistic models, a note event model and a musicological model, which together form a melody transcription system with a modular architecture that can be extended with desired front-end feature extractors and musicological rules.
Sung Note Segmentation for a Query-by-Humming System
TLDR: New acoustic features based on the signal energy distribution, as obtained from the singing perception and production points of view, are investigated; a specific mid-band energy combined with a biphasic detection function achieves high correct-detection and low false-alarm rates on sonorant consonant syllables.
Signal processing for melody transcription
TLDR: The signal processing operations involved in the melody transcription system MT are described, and two prototyped applications are discussed: a sight-singing tutor and a scheme for acoustically indexing a melody database.
An Audio Front End for Query-by-Humming Systems
TLDR: A front end dedicated to the symbolic translation of voice into a sequence of pitch and duration pairs is developed, which is crucial for the effectiveness of searching for music by melodic similarity.
Probabilistic models for the transcription of single-voice melodies
TLDR: A method is proposed for the automatic transcription of single-voice melodies from an acoustic waveform into a symbolic musical notation (a MIDI file), handling imperfections in the performed/estimated pitch values with a hidden Markov model.