Merlin: An Open Source Neural Network Speech Synthesis System

  • Zhizheng Wu, O. Watts, Simon King
  • Published in SSW 2016
  • Computer Science
  • We introduce the Merlin speech synthesis toolkit for neural network-based speech synthesis. The system takes linguistic features as input, and employs neural networks to predict acoustic features, which are then passed to a vocoder to produce the speech waveform. Various neural network architectures are implemented, including a standard feedforward neural network, mixture density neural network, recurrent neural network (RNN), long short-term memory (LSTM) recurrent neural network, amongst…
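The pipeline described in the abstract (linguistic features in, acoustic features predicted by a neural network, a vocoder producing the waveform) can be illustrated with a minimal sketch. This is not Merlin's API: every name below (`make_layer`, `dense`, `predict_acoustic`, `toy_vocoder`), the feature dimensions, and the random untrained weights are assumptions chosen only for illustration, and the "vocoder" is a placeholder that shows the data flow rather than producing real speech.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Build one fully connected layer with small random weights (untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation=None):
    """Apply a fully connected layer to one feature vector."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]
    return [activation(o) for o in out] if activation else out


def predict_acoustic(linguistic_frames, hidden, output):
    """Feedforward network: map each linguistic frame to one acoustic frame."""
    return [dense(dense(frame, hidden, math.tanh), output) for frame in linguistic_frames]


def toy_vocoder(acoustic_frames, samples_per_frame=4):
    """Placeholder vocoder: emits a fixed number of samples per acoustic frame."""
    waveform = []
    for frame in acoustic_frames:
        level = sum(frame) / len(frame)
        waveform.extend([level] * samples_per_frame)
    return waveform


rng = random.Random(0)
LINGUISTIC_DIM, HIDDEN_DIM, ACOUSTIC_DIM = 6, 8, 3  # illustrative sizes only

hidden = make_layer(LINGUISTIC_DIM, HIDDEN_DIM, rng)
output = make_layer(HIDDEN_DIM, ACOUSTIC_DIM, rng)

# Five frames of made-up linguistic features stand in for a front end's output.
frames = [[rng.random() for _ in range(LINGUISTIC_DIM)] for _ in range(5)]

acoustic = predict_acoustic(frames, hidden, output)
waveform = toy_vocoder(acoustic)
```

In this sketch the frame-by-frame mapping is the only modelling step; the recurrent and mixture density architectures named in the abstract would replace `predict_acoustic` while leaving the surrounding data flow unchanged.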
    242 Citations
    Char2Wav: End-to-End Speech Synthesis
    • 281
    Research on Dungan speech synthesis based on Deep Neural Network
    • L. Chen, Hongwu Yang, H. Wang
    • Computer Science
    • 2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP)
    • 2018
    Word level prosody prediction using large audiobook dataset
    Lombard speech synthesis using long short-term memory recurrent neural networks
    • 8
    Rapid development of new TTS voices by neural network adaptation
    • 3
    Deep Recurrent Neural Networks in Speech Synthesis Using a Continuous Vocoder
    • 8
    Neural speech synthesis for resource-scarce languages
    • 1
    Development of Assamese Text-to-speech System using Deep Neural Network
    • 2
    A Modularized Neural Network with Language-Specific Output Layers for Cross-Lingual Voice Conversion
    • 4


    Investigating gated recurrent neural networks for speech synthesis
    • 23
    Deep neural network-guided unit selection synthesis
    • 45
    Deep neural networks employing Multi-Task Learning and stacked bottleneck features for speech synthesis
    • 229
    Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis
    • H. Zen, H. Sak
    • Computer Science
    • 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
    • 2015
    • 251
    Investigating gated recurrent networks for speech synthesis
    • Zhizheng Wu, Simon King
    • Computer Science
    • 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
    • 2016
    • 107
    Speech synthesis with artificial neural networks
    • 15
    Statistical parametric speech synthesis: from HMM to LSTM-RNN
    • 42
    • Highly Influential
    A high quality text-to-speech system composed of multiple neural networks
    • 20
    TTS synthesis with bidirectional LSTM based recurrent neural networks
    • 385
    Modelling acoustic feature dependencies with artificial neural networks: Trajectory-RNADE
    • 38