HMM-based TTS can produce a highly intelligible voice of decent quality. However, the synthesized speech sometimes exhibits perceptibly annoying glitches due to F0 extraction errors in the training data and voiced/unvoiced swapping errors in F0 generation. In the conventional MSD-based F0 modeling [10], the dual but incompatible two probabilistic …
Sinus histiocytosis with massive lymphadenopathy (SHML) or Rosai-Dorfman disease (RDD) is an uncommon but well-defined benign self-limited clinicopathological entity. It mainly involves lymph nodes. Extranodal involvement is seen in up to 43% of cases, with the most common location in the head and neck region. Primary RDD occurring in the bone is rare with …
Learning a second language is hard, especially when the learner's brain must be retrained to identify sounds not present in his or her native language. It also requires regular practice, but many learners struggle to find the time and motivation. Our solution is to break down the challenge of mastering a foreign sound system into minute-long episodes of …
Multiple-input multiple-output (MIMO) systems, which use antenna arrays at both the transmitter and the receiver, are attracting growing attention and effort in wireless communication research because of their potential to considerably increase capacity in mobile cellular communications. However, in the real propagation environment of cellular communications, it is …
Two categories of Confidence Measure (CM) approaches for Mandarin command word recognition, i.e., Likelihood Ratio Testing (LRT) based CM and Word Posterior Probability (WPP) based CM, are investigated in this paper. Both Equal Error Rate (EER) and Confidence Error Rate (CER) performances of these approaches are evaluated on two databases: A Mandarin …
Recently, speaker-code-based adaptation has been successfully extended to recurrent neural networks using bidirectional Long Short-Term Memory (BLSTM-RNN) [1]. Experiments on the small-scale TIMIT task have demonstrated that speaker-code-based adaptation is also valid for BLSTM-RNN. In this paper, we evaluate this method on a large-scale task and …
Recently, several fast speaker adaptation methods have been proposed for the hybrid DNN-HMM models based on the so-called discriminative speaker codes (SC) [1-3] and applied to unsupervised speaker adaptation in speech recognition [4]. It has been demonstrated that the SC based methods are quite effective in adapting DNNs even when only a very small amount …
We propose to train a Hidden Markov Model (HMM) by allocating Gaussian kernels non-uniformly across states so as to optimize a selected discriminative training criterion. The optimal kernel allocation problem is first formulated based upon a non-discriminative, Maximum Likelihood (ML) criterion and then generalized to incorporate discriminative ones. An …
Stress-induced viscous flow is characteristic of atomic movements during plastic deformation of metallic glasses in the absence of a substantial temperature increase, which suggests that the stress state plays an important role in mechanically induced crystallization in a metallic glass. However, this role is poorly understood. Here, we report on the stress-induced …
Bidirectional long short-term memory (BLSTM) recurrent neural networks are powerful acoustic models in terms of recognition accuracy. When BLSTM acoustic models are used in decoding, the speech decoder needs to wait until the end of a whole sentence is reached, such that forward-propagation in the backward direction can then be performed. The nature of …