Minhui Pang

Diversified speech synthesis requires speech corpora to be rebuilt frequently. This paper discusses our efforts to construct a speech corpus automatically from broadcast speech databases for a trainable Text-To-Speech (TTS) system. We present a new framework for automatic speech corpus construction from broadcast speech databases.…
This paper presents a method for automatically constructing a text corpus of limited size for a speech synthesis system. The goal is to collect phonetically rich sentences that give high coverage of phonetic contextual units while keeping the text small. We present a new greedy algorithm for selecting text from the mother text. The mother text is auto-loaded by…
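The greedy selection described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details are not given here. It assumes a plain coverage-gain criterion, and it uses character bigrams as a stand-in for phonetic contextual units. The names `units` and `greedy_select` are hypothetical.

```python
def units(sentence):
    """Stand-in for phonetic contextual units: character bigrams (an assumption)."""
    s = sentence.lower().replace(" ", "")
    return {s[i:i + 2] for i in range(len(s) - 1)}


def greedy_select(sentences, max_sentences):
    """Repeatedly pick the sentence that adds the most uncovered units.

    Stops when the sentence budget is reached or no sentence adds anything new.
    Returns the selected sentences and the set of covered units.
    """
    covered = set()
    selected = []
    remaining = list(sentences)
    while remaining and len(selected) < max_sentences:
        # Sentence with the largest number of not-yet-covered units.
        best = max(remaining, key=lambda s: len(units(s) - covered))
        gain = units(best) - covered
        if not gain:
            break  # nothing left to gain from the mother text
        selected.append(best)
        covered |= gain
        remaining.remove(best)
    return selected, covered


if __name__ == "__main__":
    mother_text = ["ab", "abcd", "cd", "xyz"]  # toy mother text
    chosen, covered = greedy_select(mother_text, max_sentences=10)
    print(chosen, sorted(covered))
```

A real system would replace `units` with a function that returns units such as diphones or triphones from a phonetic transcription. It might also weight rare units more heavily, but that is a design choice outside what the abstract states.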
Audio classification is an important preprocessing step for audio data. However, training the models requires large amounts of manually labeled data. To address this problem, we evaluate a semi-supervised machine learning algorithm called co-training for content-based audio classification. The audio is divided into three classes: pure speech, pure music and speech…
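As a rough illustration of the co-training idea named in this abstract, the sketch below trains one classifier per feature view and lets each label the unlabeled example it is most confident about. Everything specific is an assumption: the two 1-D views, the nearest-centroid classifier standing in for whatever models the paper uses, and the function names. The toy data uses only two labels because the third class name is cut off in the abstract.

```python
def centroid_fit(xs, ys):
    """Nearest-centroid model on 1-D features: mean feature value per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def centroid_predict(model, x):
    """Return (label, confidence); confidence is the gap between the two nearest centroids."""
    dists = sorted((abs(x - c), y) for y, c in model.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin


def co_train(labeled, unlabeled, rounds=5, per_round=1):
    """labeled: list of ((view0, view1), label); unlabeled: list of (view0, view1)."""
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not pool:
                break
            # Train on the current labeled set using this view only.
            model = centroid_fit([x[view] for x, _ in labeled],
                                 [y for _, y in labeled])
            # Move the most confidently classified pool items to the labeled set,
            # so the other view's classifier can learn from them next.
            ranked = sorted(pool,
                            key=lambda x: centroid_predict(model, x[view])[1],
                            reverse=True)
            for x in ranked[:per_round]:
                label, _ = centroid_predict(model, x[view])
                labeled.append((x, label))
                pool.remove(x)
    models = [centroid_fit([x[v] for x, _ in labeled], [y for _, y in labeled])
              for v in (0, 1)]
    return models, labeled


if __name__ == "__main__":
    seed = [((0.0, 0.1), "speech"), ((1.0, 0.9), "music")]  # toy labeled clips
    pool = [(0.1, 0.0), (0.9, 1.0), (0.2, 0.2), (0.8, 0.7)]  # toy unlabeled clips
    models, grown = co_train(seed, pool)
    print(len(grown), centroid_predict(models[0], 0.05)[0])
```

In practice each view would be a distinct set of audio features, and the confidence measure would come from the chosen classifier rather than a centroid margin.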
Speech sentences are the input to automatic phonetic segmentation or transcription. This paper discusses our efforts on automatic speech sentence segmentation from multi-paragraph speech databases, with the aim of building Text-To-Speech (TTS) speech corpora automatically. We present a) a system of automatic speech sentence segmentation from broadcasting audio based…