Guangpu Huang

This paper presents a deep neural network (DNN) approach to sentence boundary detection in broadcast news. We extract prosodic and lexical features at each inter-word position in the transcripts and learn a sequential classifier to label these positions as either boundary or non-boundary. This work is realized by a hybrid DNN-CRF (conditional random field) …
This paper shows that integrating statistical and connectionist methods can greatly enhance human-computer interaction through speech. The research approach is inspired by recent advances in high-performance automatic speech recognition (ASR) systems and by neurocognitive research on natural language understanding (NLU). A modest hybrid …
This paper presents a method to improve a language model for a resource-limited language by using statistical machine translation from a related language to generate data for the target language. In this work, the machine translation model is trained on a corpus of parallel Mandarin-Cantonese subtitles and used to translate a large set of Mandarin …
In this paper, we examined the feasibility of articulatory phonetic inversion (API) conditioned on auditory qualities for improved speech recognition. We also introduced an efficient data-driven heuristic learning algorithm to capture the articulatory-phonetic features (APFs) of English speech. We then reported the performance of the combined auditory …
We describe a neural-based articulatory phonetic inversion model to improve the recognition of acoustically varying vowels and syllable-initial plosives. The model uses a set of continuous-valued articulatory phonetic features (APFs) to explore the interactions between the motor control of articulators and the acoustic phonetic events. We …
This paper studies the dual aspects of speech recognition and synthesis using consonant-vowel speech patterns. It examines the feasibility of bi-directional phonetic modeling, i.e., articulatory encoding and auditory decoding, in a neural-based computational model supplemented with an interactive learning algorithm. Simulation results demonstrate that …
This paper presents a deep neural network-conditional random field (DNN-CRF) system with multi-view features for sentence unit detection on English broadcast news. We proposed a set of multi-view features extracted from the acoustic, articulatory, and linguistic domains, and used them together in the DNN-CRF model to predict the sentence boundaries. We …