This is a non-technical paper describing how and why we organized BEST 2009, the first contest in the series “Benchmark for Enhancing the Standard of Thai language processing”, which is expected to help accelerate the progress of natural language processing technology in Thailand by assembling three essential components: common standards, …
This paper presents work on phonetically balanced (PB) and phonetically distributed (PD) sentence sets, which form part of the text prompts for speech recording in a Large Vocabulary Continuous Speech Recognition (LVCSR) corpus for the Thai language. First, a protocol of Thai phonetic transcription and some essential rules of phonetic correction after …
This paper proposes a new environmental noise classification method using principal component analysis (PCA) for robust speech recognition. Once the type of noise is identified, speech recognition performance can be enhanced by selecting the acoustic model specific to the identified noise. The proposed model applies PCA to a set of noise features, and results from PCA …
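As a rough illustration of the idea in the abstract above, the following sketch projects noise feature vectors onto their principal components and then picks a noise type. The abstract is truncated before it says how the PCA results are used, so the nearest-centroid decision rule, the function names, and the noise labels here are all assumptions, not the paper's method.

```python
import numpy as np


def fit_pca(features, n_components):
    """Return the feature mean and the top principal directions (one per row)."""
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(features, mean, components):
    """Map feature vectors (one per row) into the PCA subspace."""
    return (features - mean) @ components.T


def fit_centroids(projected, labels):
    """Average the projected training vectors of each noise type.

    Nearest-centroid classification is an assumed stand-in for whatever
    decision rule the paper applies to the PCA results.
    """
    return {
        label: projected[labels == label].mean(axis=0)
        for label in sorted(set(labels.tolist()))
    }


def classify(feature_vector, mean, components, centroids):
    """Return the noise type whose centroid is closest in the PCA subspace."""
    z = project(feature_vector[np.newaxis, :], mean, components)[0]
    return min(centroids, key=lambda label: np.linalg.norm(z - centroids[label]))
```

The returned label could then serve as a key for looking up a noise-specific acoustic model, for example in a dictionary that maps each noise type to a model.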
Traditional language models rely on lexical units that are defined as entities separated from each other by word boundary markers. Since there are no such boundaries in Thai, alternative definitions of lexical units have to be pursued. The problem is to find the optimal set of lexical units that constitutes the vocabulary of the language model and yields the best …
This article explains the history of Thai language development for computers, examining such factors as the language, script, and writing system, among others. The article also analyzes characteristics of Thai characters and I/O methods, and addresses key issues involved in Thai text processing. Finally, the article reports on language processing research …
This paper presents new evidence about user perception of VoIP quality that is inconsistent with the general understanding of three codecs known as G.729, G.711 and G.722. The study focuses on VoIP quality evaluation by Thai users speaking Thai, which is a tonal language. The study was conducted using conversation-opinion tests. The …
Large speech and text corpora are crucial to the development of a state-of-the-art speech recognition system. This paper reports on the construction and evaluation of the first Thai broadcast news speech and text corpora. Specifications and conventions used in the transcription process are described in the paper. The speech corpus contains about 17 hours of …
This paper presents a neural network based text-dependent speaker identification system for the Thai language. Linear Prediction Coefficients (LPC) are extracted from the speech signal and formed into feature vectors. These features are fed into a multilayer perceptron (MLP) neural network trained with the backpropagation learning algorithm for training and identification. …
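To make the feature extraction step in the abstract above concrete, here is a minimal sketch of computing LPC feature vectors frame by frame. It assumes the autocorrelation method with the Levinson-Durbin recursion; the abstract does not state which LPC variant, order, or frame size the paper uses, so the function names and parameters are illustrative only. The MLP classifier itself is not shown.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Predictor coefficients a[1..order] with s[n] ~ sum_k a[k] * s[n-k].

    Uses the Levinson-Durbin recursion on the frame's autocorrelation
    (an assumed choice; other LPC estimation methods exist).
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # silent or degenerate frame: keep remaining coefficients at zero
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this order
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:]


def lpc_feature_vectors(signal, frame_len, hop, order):
    """Slice the signal into overlapping frames and compute LPC for each."""
    return [
        lpc(signal[start:start + frame_len], order)
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]
```

Each returned vector has `order` coefficients and could be used as one input pattern for a classifier such as an MLP.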