Controllable Neural Prosody Synthesis

Max Morrison, Zeyu Jin, Justin Salamon, Nicholas J. Bryan, Gautham J. Mysore
Speech synthesis has recently seen significant improvements in fidelity, driven by the advent of neural vocoders and neural prosody generators. However, these systems lack intuitive user controls over prosody, making them unable to rectify prosody errors (e.g., misplaced emphases and contextually inappropriate emotions) or generate prosodies with diverse speaker excitement levels and emotions. We address these limitations with a user-controllable, context-aware neural prosody generator. Given a… 

