Yonghong Yan

How to construct models for speech/nonspeech discrimination is a crucial issue for voice activity detectors (VADs). Semi-supervised learning is the most popular approach to model construction in conventional VADs. In this correspondence, we propose an unsupervised learning framework to construct statistical models for VAD. This framework is realized by a ...
A set of freely available, universal speech tools is needed to accelerate progress in speech technology. The CSLU Toolkit represents an effort to make the core technology and fundamental infrastructure accessible, affordable and easy to use. The CSLU Toolkit has been under development for five years. This paper describes recent improvements, additions ...
Three perceptual experiments were conducted to test the relative importance of vowels vs. consonants to recognition of fluent speech. Sentences were selected from the TIMIT corpus to obtain approximately equal numbers of vowels and consonants within each sentence and equal durations across the set of sentences. In experiments 1 and 2, subjects listened to (a) ...
In this paper, a novel frame-based algorithm called recursive alignment (RA) for query-by-humming (QBH) applications is presented. Compared with other approaches, RA optimizes melody alignment in a top-down fashion, which is more capable of capturing long-distance information in human singing. Three RA variations which run much faster at the expense of ...
Neural network training targets for speech recognition are estimated using a novel method. Rather than use zero and one, continuous targets are generated using forward-backward probabilities. Each training pattern has more than one class active. Experiments showed that the new method effectively decreased the error rate by 15% in a continuous digits ...
Acoustic features of vocal tract function are widely used in the study of pathological voice detection. Classification of normal and pathological voices by acoustic parameters is a useful way to diagnose voice diseases. In this respect, mel-frequency cepstral coefficients have proved effective with traditional classifiers such as Gaussian Mixture Model ...
Despite a good understanding of the process that initiates and promotes host inflammation induced by acute injury, little is known about the host immune cells responsible for the inhibition of inflammatory response to thermal injury. The aim of this study was to investigate the potential effect of naturally existing CD11c(low)CD45RB(high) dendritic cells ...
This paper presents our approach to dialog state tracking for the Dialog State Tracking Challenge task. In our approach we use discriminative general structured conditional random fields, instead of traditional generative directed graphical models, to incorporate arbitrary overlapping features. Our approach outperforms the simple 1-best tracking approach.
The Recurrent Neural Network Language Model (RNNLM) has recently been shown to outperform N-gram Language Models (LMs) as well as many other competing advanced LM techniques. However, the training and testing of RNNLMs are very time-consuming, so in real-time recognition systems, an RNNLM is usually used for re-scoring an n-best list of limited size. In this paper, ...
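
The n-best re-scoring setup mentioned in the last abstract can be illustrated with a minimal sketch. It is not the method of that paper: an add-one smoothed bigram model stands in for the RNNLM so the example stays self-contained, and the names `train_bigram_lm`, `rescore_nbest`, and `lm_weight`, along with the toy hypotheses and first-pass scores, are assumptions made purely for illustration.

```python
import math
from collections import Counter


def train_bigram_lm(sentences):
    """Build an add-one smoothed bigram LM (a hypothetical stand-in for an RNNLM).

    Returns a function mapping a list of words to a log-probability.
    """
    unigrams, bigrams = Counter(), Counter()
    vocab = {"</s>"}
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        vocab.update(words)
        for prev, cur in zip(tokens, tokens[1:]):
            unigrams[prev] += 1
            bigrams[(prev, cur)] += 1
    vocab_size = len(vocab)

    def logprob(words):
        tokens = ["<s>"] + words + ["</s>"]
        return sum(
            math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
            for prev, cur in zip(tokens, tokens[1:])
        )

    return logprob


def rescore_nbest(nbest, lm_logprob, lm_weight=1.0):
    """Re-rank (words, first_pass_score) pairs by first-pass score plus weighted LM score."""
    rescored = [
        (words, score + lm_weight * lm_logprob(words)) for words, score in nbest
    ]
    return sorted(rescored, key=lambda item: item[1], reverse=True)


# Toy data: the first-pass scores are invented log-scores, not real decoder output.
lm = train_bigram_lm([["recognize", "speech"], ["recognize", "speech", "now"]])
nbest = [
    (["wreck", "a", "nice", "beach"], -10.0),
    (["recognize", "speech"], -10.5),
]
best_words, best_score = rescore_nbest(nbest, lm)[0]
print(best_words, best_score)
```

Only the short list of hypotheses is passed through the second language model, which is why an expensive model can be afforded at this stage: its cost grows with the size of the n-best list rather than with the full search space of the first pass.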