Chai Wutiwiwatchai

Large speech and text corpora are crucial to the development of a state-of-the-art speech recognition system. This paper reports on the construction and evaluation of the first Thai broadcast news speech and text corpora. Specifications and conventions used in the transcription process are described in the paper. The speech corpus contains about 17 hours of…
This paper proposes a technique for automatic syllable-pattern induction in statistical Thai text-to-phone transcription. The general process for building a statistical text-to-phone transcription system is to first define a set of rules describing syllable patterns, which is used for syllabification. Given an input text, the syllabification process generates all…
This paper presents applications of five well-known learning methods to Thai phrase break prediction. Phrase break prediction is particularly important for our Thai text-to-speech synthesizer (TTS), where the input Thai text has no word or sentence boundaries. The learning methods include a POS sequence model, CART, RIPPER, SLIPPER and a neural network. Features…
This paper proposes an efficient acoustic model adaptation method based on the use of simulated data in maximum likelihood linear regression (MLLR) adaptation for robust speech recognition. Online MLLR adaptation is an unsupervised process which requires input speech with phone labels transcribed automatically. Instead of using only the input signal in…
This is a non-technical paper describing how and why we organized BEST 2009, the first contest in the series “Benchmark for Enhancing the Standard of Thai language processing”, which is expected to help accelerate the progress of Natural Language Processing technology in Thailand by assembling 3 essential components: common standards,…
This paper presents new evidence about user perception of VoIP quality that is inconsistent with the general understanding of three codecs known as G.729, G.711 and G.722. The study focuses on VoIP quality evaluation by Thai users speaking Thai, which is a tonal language. This study was conducted using conversation-opinion tests. The…
This article tackles the problem of transcribing English words using the Thai phonological system. The problem arises in Thai, where modern writing often includes words in English orthography, and transcribing them using English phonology gives unnatural results. The proposed model is fully data-driven, starting with automatic grapheme-phoneme alignment, modeling transduction…
Traditional language models rely on lexical units that are defined as entities separated from each other by word boundary markers. Since there are no such boundaries in Thai, alternative definitions of lexical units have to be pursued. The problem is to find the optimal set of lexical units that constitutes the vocabulary of the language model and yields the…
This paper presents work on phonetically balanced (PB) and phonetically distributed (PD) sentence sets, which form part of the text prompts for speech recording in a Large Vocabulary Continuous Speech Recognition (LVCSR) corpus for the Thai language. First, a protocol of Thai phonetic transcription and some essential rules of phonetic correction after…