Ranniery Maia

This paper describes a trainable excitation approach to eliminate the unnaturalness of HMM-based speech synthesizers. During the waveform generation part, mixed excitation is constructed by state-dependent filtering of pulse trains and white noise sequences. In the training part, filters and pulse trains are jointly optimized through a procedure which …
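The mixed-excitation idea in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the FIR filter form, the fixed F0 per segment, and the restart of the pulse train at each state boundary are all assumptions made for illustration, and the joint training of filters and pulse trains is not shown.

```python
import numpy as np


def pulse_train(n_samples, f0_hz, fs_hz):
    """Unit impulses spaced one (rounded) pitch period apart."""
    period = int(round(fs_hz / f0_hz))
    train = np.zeros(n_samples)
    train[::period] = 1.0
    return train


def mixed_excitation(n_samples, f0_hz, fs_hz, voiced_fir, noise_fir, seed=0):
    """Sum of a filtered pulse train and filtered white noise.

    voiced_fir and noise_fir are hypothetical FIR coefficient arrays;
    in a state-dependent setup each HMM state would supply its own pair.
    """
    rng = np.random.default_rng(seed)
    pulses = pulse_train(n_samples, f0_hz, fs_hz)
    noise = rng.standard_normal(n_samples)
    # Truncate the convolution tails so the output keeps n_samples samples.
    voiced = np.convolve(pulses, voiced_fir)[:n_samples]
    unvoiced = np.convolve(noise, noise_fir)[:n_samples]
    return voiced + unvoiced


def state_dependent_excitation(states, f0_hz, fs_hz):
    """Concatenate per-state segments.

    states is a list of (n_samples, voiced_fir, noise_fir) tuples.
    Simplification: the pulse train restarts at every state boundary.
    """
    segments = [
        mixed_excitation(n, f0_hz, fs_hz, hv, hn, seed=i)
        for i, (n, hv, hn) in enumerate(states)
    ]
    return np.concatenate(segments)
```

For example, `state_dependent_excitation([(480, np.array([1.0, 0.5]), np.array([0.1])), (320, np.array([0.2]), np.array([1.0]))], 100.0, 16000.0)` returns 800 samples in which the first state is dominated by the pulse component and the second by the noise component.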
This paper describes the development of a Brazilian Portuguese text-to-speech system which applies a technique wherein speech is directly synthesized from hidden Markov models. In order to build the synthesizer, a speech database was recorded and phonetically segmented. Furthermore, contextual information about syllables, words, phrases, and utterances …
In this paper we introduce a new cepstral coefficient extraction method based on an intelligibility measure for speech in noise, the Glimpse Proportion measure. This new method aims to increase the intelligibility of speech in noise by modifying the clean speech, and has applications in scenarios such as public announcement and car navigation systems. We …
It has recently been shown that deep neural networks (DNN) can improve the quality of statistical parametric speech synthesis (SPSS) when using a source-filter vocoder. Our own previous work has furthermore shown that a dynamic sinusoidal model (DSM) is also highly suited to DNN-based SPSS, whereby sinusoids may either be used themselves as a "direct …
Statistical parametric synthesizers usually rely on a simplified model of speech production where a minimum-phase filter is driven by a zero or random phase excitation signal. However, this procedure does not take into account the natural mixed-phase characteristics of the speech signal. This paper addresses this issue by proposing the use of the complex …
This paper introduces a novel excitation approach for speech synthesizers in which the final waveform is generated through parameters directly obtained from Hidden Markov Models (HMMs). Despite the attractiveness of the HMM-based speech synthesis technique, namely utilization of small corpora and flexibility concerning the achievement of different voice …
This paper describes speech intelligibility enhancement for Hidden Markov Model (HMM) generated synthetic speech in noise. We present a method for modifying the Mel cepstral coefficients generated by statistical parametric models that have been trained on plain speech. We update these coefficients such that the glimpse proportion – an objective measure of …
This paper presents a fixed- and low-dimensional, perceptually based dynamic sinusoidal model of speech referred to as PDM (Perceptual Dynamic Model). To decrease and fix the number of sinusoidal components typically used in the standard sinusoidal model, we propose to use only one dynamic sinusoidal component per critical band. For each band, the sinusoid …
This paper applies a dynamic sinusoidal synthesis model to statistical parametric speech synthesis (HTS). For this, we utilise regularised cepstral coefficients to represent both the static amplitude and dynamic slope of selected sinusoids for statistical modelling. During synthesis, a dynamic sinusoidal model is used to reconstruct speech. A preference …