Hiroya Hashimoto

Frame-by-frame representation is not appropriate for prosodic features, which are tightly related to speech units spanning a wide time range, such as words and phrases. This causes an inherent problem in fundamental frequency (F0) contour generation by HMM-based speech synthesis. Our formerly developed method, which modifies generated F0 contours in the ...
Text typed into a speech synthesizer is generally converted into its corresponding phoneme sequence, to which a prosody prediction module attaches various kinds of prosodic symbols. By using this module effectively, we build a prosodic reading tutor for Japanese, called Suzuki-kun, which is provided as one function of OJAD (Online Japanese Accent ...
This paper introduces the first free online framework for teaching and learning Japanese prosody, including word accent and phrase intonation. This framework is called OJAD (Online Japanese Accent Dictionary) [1], and it provides three functions. 1) Visual, auditory, systematic, and comprehensive illustration of patterns of accent change (accent sandhi) ...
The generation process model of fundamental frequency (F0) contours is known to represent global F0 movements while keeping a clear relation with the linguistic information of utterances. While HMM-based speech synthesis can generate good-quality speech, problems arising from frame-by-frame processing have been pointed out. These problems ...
A QT/QTc study was conducted in compliance with the ICH E14 guideline to evaluate the effects of a new xanthine oxidase inhibitor, topiroxostat, in healthy Japanese subjects. Forty-eight healthy Japanese subjects (24 males, 24 females) received a single oral dose of topiroxostat (60 or 180 mg), moxifloxacin (400 mg), or placebo in a single-center, double-blind, ...
The generation process model of fundamental frequency (F0) contours can represent F0 movements of speech well while keeping a clear relation with the linguistic information of utterances. Using the model is therefore expected to improve HMM-based speech synthesis. One of the major problems preventing the use of the model is that the performance of automatic ...
A method was developed to adapt prosody to a new speaker/style in speech synthesis. It is based on predicting differences between target and original speakers/styles and applying them to the original one. Differences in fundamental frequency (F0) contours are represented in the framework of the generation process model; differences in the command ...
The generation process model of fundamental frequency contours is ideal for representing global features of prosody. It is a command-response model, where the commands have clear relations with the linguistic and para-/non-linguistic information conveyed by the utterance. Therefore, by handling fundamental frequency contours in the framework of the generation process ...
Statistical parametric speech synthesis technologies, such as HMM-based and DNN-based ones, have gained special attention from researchers because of their ability to generate speech in various voice qualities and styles. In these methods, all acoustic parameters (except durational ones) are handled in a frame-by-frame manner, which is not appropriate for ...
This paper develops an online, freely available framework to aid the teaching and learning of prosodic control in Tokyo Japanese: how to produce appropriate word accent and phrase intonation. This framework is called OJAD (Online Japanese Accent Dictionary) [1], and it provides three features. 1) Visual, auditory, systematic, and comprehensive illustration ...