Acoustic Vowel Analysis in a Mexican Spanish HMM-based Speech Synthesis

@article{CotoJimnez2014AcousticVA,
  title={Acoustic Vowel Analysis in a Mexican Spanish HMM-based Speech Synthesis},
  author={Marvin Coto-Jim{\'e}nez and Fabiola Mart{\'i}nez Licona and John Goddard Close},
  journal={Res. Comput. Sci.},
  year={2014},
  volume={86},
  pages={53--62}
}
The synthetic voice produced from an HMM-based system is often reported as sounding muffled when it is compared to natural speech. There are several reasons for this effect: some precise and fine characteristics of the natural speech are removed, minimized or hidden in the modeling phase of the HMM system; the resulting speech-parameter trajectories become oversmoothed versions of the speech waveforms. In order to obtain more natural synthetic voices, different training conditions must be tried…
3 Citations
Acoustic Vowel Analysis in a Mexican Spanish HMM-based Speech Synthesis
The synthetic voice produced from an HMM-based system is often reported as sounding muffled when it is compared to natural speech. There are several reasons for this effect: some precise and fine…
Speech Synthesis Based on Hidden Markov Models and Deep Learning
TLDR
The results indicate that the spectral characteristics of HMM voices can be improved using this approach, but additional research should be conducted to improve other parameters of the voice signal, such as energy and fundamental frequency, to obtain more natural-sounding voices.
Análisis acústico de vocales de niños costarricenses
This article presents an acoustic analysis of vowels pronounced by Costa Rican children aged 6 to 12. These analyses aim to achieve a better…

References

Evaluation of objective measures for intelligibility prediction of HMM-based synthetic speech in noise
TLDR
Three intelligibility measures (the Dau measure, the glimpse proportion, and the Speech Intelligibility Index) gave less accurate predictions of intelligibility for synthetic speech than have previously been found for natural speech; this was particularly true of the SII measure.
Prediction of Perceived Sound Quality of Synthetic Speech
This paper investigates the performance of objective speech and audio quality measures for the prediction of perceived sound quality of synthetic speech. A number of existing quality measures have…
Speech parameter generation algorithms for HMM-based speech synthesis
This paper derives a speech parameter generation algorithm for HMM-based speech synthesis, in which the speech parameter sequence is generated from HMMs whose observation vector consists of a…
Instrumental Assessment of Prosodic Quality for Text-to-Speech Signals
TLDR
The results highlight a strong potential for instrumental estimation techniques of TTS quality, with the F0 slope within voiced segments proving particularly useful when integrated in a nonlinear fashion, whereas measures of durational variation perform comparatively weakly.
The HMM-based speech synthesis system (HTS) version 2.0
TLDR
This paper describes HTS version 2.0 in detail, as well as future release plans, which include a number of new features that are useful for both speech synthesis researchers and developers.
Comparison of approaches for instrumentally predicting the quality of text-to-speech systems
TLDR
The results show that auditory quality judgments can in many cases be predicted with a sufficiently high accuracy and reliability, but that there are considerable differences, mainly between male and female speech samples.
An objective measure for estimating MOS of synthesized speech
This paper proposes an average concatenative cost function as the objective measure for naturalness of synthesized speech. All its seven component-costs can be derived directly from the input text…
Speech Synthesis Based on Hidden Markov Models
This paper gives a general overview of hidden Markov model (HMM)-based speech synthesis, which has recently been demonstrated to be very effective in synthesizing speech. The main advantage of this…
Statistical Parametric Speech Synthesis
  • H. Zen, K. Tokuda, A. Black
  • Computer Science
  • 2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07
  • 2007
TLDR
This paper gives a general overview of techniques in statistical parametric speech synthesis, and contrasts these techniques with the more conventional unit selection technology that has dominated speech synthesis over the last ten years.
Analysis of fundamental frequency, jitter, shimmer and vocal intensity in children with phonological disorders.
TLDR
F0 for vowel /e/ was smaller, on average, in the Phonological Disorder Group and it was 126 Hz in the Control Group, and there was a difference between the two groups regarding the mean intensity of vowels /a/, /e/, /i/ and /i/, which was smaller in the Phonological Disorder Group.