• Corpus ID: 17345450

VOCALOID - commercial singing synthesizer based on sample concatenation

Hideki Kenmochi, Hayato Ohshita
The song submitted here to the “Synthesis of Singing Challenge” was synthesized with the latest version of the singing synthesizer “Vocaloid”, which is now commercially available. This paper presents an overview of Vocaloid, its product lineup, a description of each component, and the synthesis technique it uses.

Index Terms: singing synthesis


Applying voice conversion to concatenative singing-voice synthesis

This work addresses the application of voice conversion to the singing voice by applying the GMM-based approach to VOCALOID, a concatenative singing synthesizer, to perform singer timbre conversion, achieving a satisfactory conversion effect on the synthesized utterances.


A singing synthesis system that automatically estimates parameters for singing synthesis from a user's singing voice with the help of song lyrics, and has functions to help modify the user’s singing by correcting off-pitch phrases or changing vibrato.

Transferring Vocal Expression of F0 Contour Using Singing Voice Synthesizer

Experiments demonstrated that the proposed system can transfer vocal expressions while retaining the singer's individuality on two singing voice synthesizers: Vocaloid and CeVIO.

A singing style modeling system for singing voice synthesizers

In this system, singing expression parameters consisting of melody and dynamics, which are derived from F0 and power, are modeled by context-dependent Hidden Markov Models (HMMs), and the parameters generated from the trained models may be applied to many singing voice synthesizers.

Resurrecting past singers : Non-Parallel Singing-Voice Conversion

The proposed nonparallel framework resulted in a performance close to the one following the traditional approach based on GMM and paired data, and a successful perception of the past singer’s timbre on the singing-voice utterances performed by VOCALOID.

Rap-style Singing Voice Synthesis

An HMM-based singing voice synthesis system is used to automatically synthesize realistic rap-style singing, and the glissando phenomenon that is characteristic of the style can be found in the synthesis results.

Fundamental Frequency Modulation in Singing Voice Synthesis

A probabilistic function that provides natural sounding low frequency f0 modulation to synthesized singing voices is presented and the perceptual relevance is evaluated with subjective listening tests.

Singing Voice Synthesis: Singer-Dependent Vibrato Modeling and Coherent Processing of Spectral Envelope

A modeling technique for singers’ vibratos is proposed, followed by joint processing of vibrato and spectral envelope so that these attributes are consistent, and the synthetic singing outputs are found to be similar in quality to human singing.

Singing Voice Synthesis: Singer-dependent Vibrato Modeling and Coherent Processing of Spectral Envelope

Pleasant singing voice is often ornamented by vibrato. This pitch fluctuation acts as a distinctive feature for singing and promotes voice quality. Nevertheless, independent pitch processing in

Speech-to-Singing Voice Conversion: The Challenges and Strategies for Improving Vocal Conversion Processes

Speech-to-singing (STS) conversion is the task of converting the read lyrics of a song, spoken in a natural manner, to proper singing, while retaining the linguistic content and the speaker's identity.



Singing Voice Synthesis Combining Excitation plus Resonance and Sinusoidal plus Residual Models

With this approach a complete singing voice synthesizer is developed that generates a vocal melody out of the score and the phonetic transcription of a song.

Sample-based singing voice synthesizer by spectral concatenation

The singing synthesis system we present generates a performance of an artificial singer out of the musical score and the phonetic transcription of a song using a frame-based frequency domain

Spectral Approach to the Modeling of the Singing Voice

Paper presented at the 111th Audio Engineering Society Convention, held from November 30 to December 3, 2001, in New York, United States.

Bark and ERB bilinear transforms

A closed-form weighted-equation-error method is derived that computes the optimal mapping coefficient as a function of sampling rate, and the solution is shown to be generally indistinguishable from the optimal least-squares solution.