Merlin: An Open Source Neural Network Speech Synthesis System
Merlin, a toolkit for neural network-based speech synthesis, takes linguistic features as input and employs neural networks to predict acoustic features, which are then passed to a vocoder to produce the speech waveform.
Detection of phonological features in continuous speech using neural networks
  • S. King, P. Taylor
  • Computer Science, Linguistics
    Comput. Speech Lang.
  • 1 October 2000
This paper reports experiments on three phonological feature systems: the Sound Pattern of English (SPE) system; a multi-valued (MV) feature system, which uses traditional phonetic categories such as manner, place, etc.; and Government Phonology, which uses a set of structured primes.
Deep neural networks employing Multi-Task Learning and stacked bottleneck features for speech synthesis
It is shown that the hidden representation used within a DNN can be improved through the use of Multi-Task Learning, and that stacking multiple frames of hidden layer activations (stacked bottleneck features) also leads to improvements.
Announcing the Electromagnetic Articulography (Day 1) Subset of the mngu0 Articulatory Corpus
This paper both outlines the general goals motivating the distribution of the data and the creation of the mngu0 web forum, and provides a description of the EMA data contained in this initial release.
A study of speaker adaptation for DNN-based speech synthesis
An experimental analysis of speaker adaptation for DNN-based speech synthesis at different levels, systematically analysing the performance of each individual adaptation technique and of their combinations.
Speech production knowledge in automatic speech recognition.
A survey of a growing body of work in which representations of speech production are used to improve automatic speech recognition is provided.
Festival 2 - build your own general purpose unit selection speech synthesiser
This paper describes version 2 of the Festival speech synthesis system. Festival 2 provides a development environment for concatenative speech synthesis, and now includes a general purpose unit