Shigenobu Seto

The Totally Speaker Driven Text-to-Speech System produces high-quality, natural speech resembling the acoustic and prosodic characteristics of the original speech corpus. In this system's F0 contour control, the F0 contour of a whole sentence is produced by concatenating segmental F0 contours generated by modifying vectors that are representatives of …
In this paper the authors propose a fundamental frequency (F0) control model that uses a representative vector, and then a method for training the control rules for the model parameters from a speech database. The representative vector is a vector that represents the typical F0 contour for accent phrases, and a set of representative vectors is referred to …
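A minimal sketch of the idea described in the two abstracts above: a representative vector is picked per accent phrase, modified, and the resulting segments are concatenated into a sentence-level F0 contour. Everything concrete here is an assumption made for illustration and is not taken from the papers: the function names `stretch` and `sentence_f0`, the choice of linear time-stretching plus an additive offset as the "modification", and the toy codebook values.

```python
def stretch(vec, n_frames):
    """Linearly resample a representative vector to n_frames points.

    Hypothetical stand-in for the time-axis modification; the papers'
    actual modification rules are not given in the abstracts.
    """
    if n_frames == 1:
        return [vec[0]]
    out = []
    for k in range(n_frames):
        # Fractional position of output frame k inside the source vector.
        pos = k * (len(vec) - 1) / (n_frames - 1)
        lo = int(pos)
        hi = min(lo + 1, len(vec) - 1)
        frac = pos - lo
        out.append(vec[lo] * (1 - frac) + vec[hi] * frac)
    return out


def sentence_f0(codebook, phrases):
    """Build a whole-sentence contour from per-phrase selections.

    codebook: list of representative vectors (assumed log-F0 shapes).
    phrases:  list of (vector_index, n_frames, offset) tuples, one per
              accent phrase; the offset is an assumed additive shift.
    """
    contour = []
    for index, n_frames, offset in phrases:
        segment = stretch(codebook[index], n_frames)
        contour.extend(value + offset for value in segment)
    return contour


# Toy codebook with two made-up contour shapes.
codebook = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
contour = sentence_f0(codebook, [(0, 5, 4.0), (1, 3, 5.0)])
```

In a trained system the per-phrase index, duration, and offset would come from control rules learned from a speech database rather than being written by hand as they are here.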
Linguistic feature analysis of input text plays an important role in achieving natural prosodic control in text-to-speech (TTS) systems. In a conventional scheme, when input texts have been analyzed incorrectly, experts manually refine suspicious if-then rules and change the tree structure to obtain correct analysis results. However, altering the …
The Toshiba English Text-to-Speech Synthesizer uses several new techniques to produce synthesized speech that is more natural-sounding and intelligible than that created by conventional synthesizers. The closed-loop training method creates synthesis units that most closely resemble the training data and are the least susceptible to prosodic distortion noise …