Analysis of voice fundamental frequency contours for declarative sentences of Japanese
Analysis of natural utterances of various declarative sentences of Japanese revealed that the model can generate close approximations to observed F0 contours from a set of discrete commands and a small number of parameters.
WFST-Based Grapheme-to-Phoneme Conversion: Open Source tools for Alignment, Model-Building and Decoding
This paper introduces a new open source, WFST-based toolkit for Grapheme-to-Phoneme conversion. The toolkit is efficient, accurate and currently supports a range of features including EM sequence
Phonetisaurus: Exploring grapheme-to-phoneme conversion with joint n-gram models in the WFST framework
Phonetisaurus is introduced, a fully-functional, flexible, open-source, BSD-licensed G2P conversion toolkit, which leverages the OpenFst library and achieves new state-of-the-art performance via ensemble methods combining RnnLMs and n-gram based models.
Failure transitions for joint n-gram models and G2P conversion
Two novel algorithms are proposed, which extend the work from [1] and enable the use of failure-transitions with joint n-gram models via the WFST framework, and are available as part of the open-source, BSD-licensed Phonetisaurus G2P toolkit.
Improving WFST-based G2P Conversion with Alignment Constraints and RNNLM N-best Rescoring
This work introduces a modified WFST-based multiple-to-multiple EM-driven alignment algorithm for Grapheme-to-Phoneme (G2P) conversion, and preliminary experimental results applying a Recurrent Neural
Single-Mixture Audio Source Separation by Subspace Decomposition of Hilbert Spectrum
A novel technique is developed to separate audio sources from a single mixture by decomposing the Hilbert spectrum of the mixed signal into independent source subspaces and then applying the inverse transformation.