Neural network based generation of fundamental frequency contours
@article{Scordilis1989NeuralNB, title={Neural network based generation of fundamental frequency contours}, author={Michael S. Scordilis and John N. Gowdy}, journal={International Conference on Acoustics, Speech, and Signal Processing}, year={1989}, pages={219-222 vol.1} }
Although a number of algorithms exist for the generation of the fundamental frequency contour in automatic text-to-speech conversion systems, the absence of a general theory of intonation still prevents the correct derivation of this important feature in unrestricted text applications. A parallel distributed approach is presented in which two neural networks were designed to learn the F0 values for each phoneme and the F0 fluctuations within each phoneme for words that correspond to a small…
53 Citations
Fundamental Frequency Modeling for Neural-Network-Based Statistical Parametric Speech Synthesis
- 2018
Computer Science
This thesis treats F0 modeling as a sequential conversion problem where the input linguistic feature sequence is converted by a neural F0 model into an F0 contour frame by frame.
Neural network-based F0 text-to-speech synthesiser for Mandarin
- 1994
Computer Science
A neural-network-based approach to synthesising F0 information for Mandarin text-to-speech is discussed, using neural networks to model the relationship between linguistic features extracted from input text and parameters representing the pitch contour of syllables.
Investigation of phonemic context in speech using self-organizing feature maps
- 1989
Computer Science
International Conference on Acoustics, Speech, and Signal Processing
The authors have shown for their database that the sequence of responding units is consistent and similar for isolated utterances of the same word and distinct for different words, and propose an algorithm for sequence smoothing.
GENERALIZATION IN NEURAL SPEECH SYNTHESIS
- 1998
Computer Science
This paper presents the initial results of an investigation to determine the amount of training data required to reach optimal generalization in neural speech synthesizers, through an empirical exploration of the number of training patterns on test set error.
Vowel synthesis using feed-forward neural networks
- 1994
Computer Science
Interestingly, neural networks with no hidden layer proved to be as capable of learning the mapping as those with a hidden layer, and a relationship predicting the result of a modified rhyme is derived.
Language-independent, neural network-based, text-to-phones conversion
- 2009
Computer Science
Neurocomputing
Prosody generation with a neural network: weighing the importance of input parameters
- 1997
Computer Science
1997 IEEE International Conference on Acoustics, Speech, and Signal Processing
The approach presented here tries to quantify the contribution of each input parameter by comparing the mean errors of networks trained with only one parameter each and by looking at the performance of a group of networks where each lacks one parameter.
A dynamical system model for generating fundamental frequency for speech synthesis
- 1999
Physics
IEEE Trans. Speech Audio Process.
A new approach to generating two important cues to prosodic patterns, fundamental frequency (F0) and energy contours, from symbolic prosodic labels and text, using a dynamical system model.
A Language-Independent Neural Network-Based Speech Synthesizer
- 2007
Computer Science
An artificial speech synthesizer based on neural networks is being developed for application to deeply embedded systems for language-independent speech commands on hands-free interfaces, and initial experimental results show the expected properties of language independence and in-system learning.
Neural network control for a cascade/parallel formant synthesizer
- 1990
Computer Science
International Conference on Acoustics, Speech, and Signal Processing
Neural network control of a cascade/parallel formant text-to-speech synthesizer model is investigated and results for the generation of the fundamental frequency contour using feedforward and sequential networks are shown.
4 References
Review of text-to-speech conversion for English.
- 1987
Physics
The Journal of the Acoustical Society of America
This review traces the early work on the development of speech synthesizers, discovery of minimal acoustic cues for phonetic contrasts, evolution of phonemic rule programs, incorporation of prosodic rules, and formulation of techniques for text analysis.
NETtalk: a parallel network that learns to read aloud
- 1988
Computer Science
NETtalk is an alternative approach that is based on an automated learning procedure for a parallel network of deterministic processing units that achieves good performance and generalizes to novel words.
Phonological Aspects of Speech Recognition
- 1980
Trends in Speech Recognition