SiPTH: Singing Transcription Based on Hysteresis Defined on the Pitch-Time Curve

@article{Molina2015SiPTHST,
  title={SiPTH: Singing Transcription Based on Hysteresis Defined on the Pitch-Time Curve},
  author={Emilio Molina and Lorenzo J. Tard{\'o}n and Ana M. Barbancho and Isabel Barbancho},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  year={2015},
  volume={23},
  pages={252-263}
}
In this paper, we present a method for monophonic singing transcription based on hysteresis defined on the pitch-time curve. The method is designed to perform note segmentation even when the pitch evolution within a single note is unstable, as in the case of untrained singers. The selected approach estimates the regions in which the chroma is stable; these regions are then classified as voiced or unvoiced by a decision tree classifier using two descriptors based on aperiodicity and…
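The hysteresis idea in the abstract can be illustrated with a minimal sketch. The snippet below is an assumption-laden illustration, not the authors' implementation: the threshold values, the smoothing of the note pitch reference, and the handling of unvoiced frames (NaN markers instead of the paper's decision-tree classifier with aperiodicity-based descriptors) are all hypothetical choices. A note boundary is only opened when the pitch deviates from the current note reference by more than an upper threshold, and a frame only reinforces the current note when its deviation stays below a lower threshold, which gives the segmentation some tolerance to unstable intra-note pitch.

```python
import numpy as np

def segment_notes(f0_cents, hop_s=0.01, high_th=80.0, low_th=50.0, min_dur_s=0.06):
    """Hysteresis-style note segmentation on a pitch-time curve (illustrative sketch).

    f0_cents  : per-frame pitch in cents (np.nan marks unvoiced frames).
    high_th   : deviation (cents) above which a note boundary is triggered.
    low_th    : deviation (cents) below which a frame counts as stable again.
    min_dur_s : stable time required before the note reference is updated.
    """
    min_frames = int(round(min_dur_s / hop_s))
    notes, start, ref, stable = [], None, None, 0
    for i, p in enumerate(f0_cents):
        if np.isnan(p):                       # unvoiced frame: close any open note
            if start is not None:
                notes.append((start * hop_s, i * hop_s, ref))
            start, ref, stable = None, None, 0
            continue
        if start is None:                     # open a new candidate note
            start, ref, stable = i, p, 0
            continue
        dev = abs(p - ref)
        if dev > high_th:                     # upper threshold crossed: note boundary
            notes.append((start * hop_s, i * hop_s, ref))
            start, ref, stable = i, p, 0
        elif dev < low_th:                    # lower band: accumulate stability evidence
            stable += 1
            if stable >= min_frames:          # slowly refine the note pitch reference
                ref = 0.9 * ref + 0.1 * p
        # deviations between low_th and high_th neither split nor update the note
    if start is not None:
        notes.append((start * hop_s, len(f0_cents) * hop_s, ref))
    return notes                              # list of (onset_s, offset_s, pitch_cents)
```

The two-threshold band is the essential point: a single threshold would either oversegment vibrato-heavy notes or miss genuine transitions, whereas the hysteresis gap lets small fluctuations pass while still reacting to real pitch jumps.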
Probabilistic transcription of sung melody using a pitch dynamic model
TLDR
This paper presents a probabilistic transcription method for monophonic sung melodies that explicitly accounts for local pitch fluctuations, with model parameters that support intuitions about singing behavior; the method is compared with state-of-the-art approaches.
Automatic Transcription of Flamenco Singing From Polyphonic Music Recordings
  • N. Kroher, E. Gómez
  • Computer Science
    IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • 2016
TLDR
This study first extracts the predominant melody and applies a novel contour filtering process to eliminate segments of the pitch contour which originate from the guitar accompaniment, and formulates a set of onset detection functions based on volume and pitch characteristics to segment the resulting vocal pitch contours into discrete note events.
Discrimination Between Ascending/Descending Pitch Arpeggios
TLDR
The main goal of this paper is to determine the pitch direction in which the arpeggiated chord was played, and the results show that the pitch direction can be reliably detected using the proposed methods.
Automatic Solfège Assessment
TLDR
Experimental results indicate that the classification scheme, implemented using a Bayesian classifier, is suitable for use as an assessment tool that provides useful feedback to the student.
Heo and Lee: Robust Singing Transcription System Using Local Homogeneity in the …
Automatic music transcription from audio has long been one of the most intriguing problems and a challenge in the field of music information retrieval, because it requires a series of low-level tasks…
Hierarchical Classification Networks for Singing Voice Segmentation and Transcription
TLDR
The proposed method, implemented with residual networks, provides improved performance over prior art in onset and offset detection, and, when integrated with a pitch detection framework, it also outperforms previous singing voice transcription methods.
Detecting Stable Regions in Frequency Trajectories for Tonal Analysis of Traditional Georgian Vocal Music
TLDR
Two approaches for detecting stable regions in frequency trajectories are described: the first algorithm uses morphological operations inspired by image processing, and the second one is based on suitably defined binary time–frequency masks.
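As a rough, hypothetical illustration of the morphological idea (the threshold and structuring-element lengths below are assumptions, not values from the paper), a binary stability mask can be derived from frame-to-frame pitch differences and then cleaned with binary closing and opening, in the spirit of image-processing morphology:

```python
import numpy as np
from scipy.ndimage import binary_closing, binary_opening

def stable_regions(f0_cents, max_step_cents=20.0, close_len=3, open_len=5):
    """Detect stable regions in a frequency trajectory (illustrative sketch).

    A frame is marked stable when the absolute frame-to-frame pitch change is
    small; closing bridges short gaps in the mask and opening discards
    spuriously short stable segments.
    """
    diffs = np.abs(np.diff(f0_cents, prepend=f0_cents[0]))
    mask = diffs < max_step_cents                           # raw binary stability mask
    mask = binary_closing(mask, structure=np.ones(close_len))
    mask = binary_opening(mask, structure=np.ones(open_len))
    # turn the cleaned mask into (start_frame, end_frame) index pairs
    edges = np.flatnonzero(np.diff(mask.astype(int), prepend=0, append=0))
    return list(zip(edges[::2], edges[1::2]))
```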
Voiced/Unvoiced Classification and Pitch Estimation Based on Amplitude Compression Filter
A method of voiced/unvoiced classification and pitch estimation based on the Pitch Estimation Filter with Amplitude Compression (PEFAC) is proposed in this paper. The method first attenuates strong noise…
vocadito: A dataset of solo vocals with f0, note, and lyric annotations
TLDR
This work presents a small dataset entitled vocadito, consisting of 40 short excerpts of monophonic singing, sung in 7 different languages by singers with varying levels of training, and recorded on a variety of devices.
Automatic Singing Transcription Based on Encoder-decoder Recurrent Neural Networks with a Weakly-supervised Attention Mechanism
TLDR
A neural singing transcription method is described that estimates a sequence of musical notes directly from the audio signal of a singing voice in an end-to-end manner without time-aligned training data, and a novel loss function on the attention weights of time-aligned notes is proposed for semi-supervised alignment training.

References

Showing 1-10 of 47 references
Polyphonic music transcription using note onset and offset detection
  • Emmanouil Benetos, S. Dixon
  • Computer Science
    2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2011
TLDR
An approach for polyphonic music transcription based on joint multiple-F0 estimation and note onset/offset detection is proposed, with note offset detection performed using HMMs trained on MIDI data.
An Auditory Model Based Transcriber of Singing Sequences
TLDR
A new system for the automatic transcription of singing sequences into a sequence of pitch and duration pairs is presented, and it is shown that the accuracy of the newly proposed transcription system is not very sensitive to the choice of the free parameters, at least as long as they remain in the vicinity of the values one could forecast on the basis of their meaning.
Fundamental frequency alignment vs. note-based melodic similarity for singing voice assessment
TLDR
The results show that the proposed system is suitable for automatic singing voice rating and that DTW-based measures are especially simple and effective for intonation and rhythm assessment.
Melody Extraction From Polyphonic Music Signals Using Pitch Contour Characteristics
  • J. Salamon, E. Gómez
  • Computer Science
    IEEE Transactions on Audio, Speech, and Language Processing
  • 2012
TLDR
A comparative evaluation of the proposed approach shows that it outperforms current state-of-the-art melody extraction systems in terms of overall accuracy.
Towards Computer-Assisted Flamenco Transcription: An Experimental Comparison of Automatic Transcription Algorithms as Applied to A Cappella Singing
TLDR
A transcription system based on fundamental frequency and energy estimation is proposed that incorporates an iterative strategy for note segmentation and labeling; it outperforms a state-of-the-art approach designed for other singing styles.
An Audio Front End for Query-by-Humming Systems
TLDR
A front end dedicated to the symbolic translation of voice into a sequence of pitch and duration pairs is developed; such a front end is crucial for the effectiveness of searching for music by melodic similarity.
Modelling of note events for singing transcription
TLDR
The method produces symbolic notation from acoustic input using two probabilistic models, a note event model and a musicological model, which together form a melody transcription system with a modular architecture that can be extended with desired front-end feature extractors and musicological rules.
A pitch determination and voiced/unvoiced decision algorithm for noisy speech
TLDR
The voiced/unvoiced and unvoiced/voiced classification performance and the pitch estimation errors of the proposed PDA and the reference system are reported in detail on three speech databases.
Vocal pitch detection for musical transcription
In this paper, an improved method for the detection of vocal pitch for musical transcription is considered, and the detected pitch is used as a technique for the extraction of musical data from the given…
Explicit Transition Modelling for Automatic Singing Transcription
TLDR
A system for the automatic transcription of solo human singing into note sequences and Hidden Markov models are used to represent both individual notes and the transitions between them in order to capture the variability of the estimated pitch within a statistical framework.