Choochart Haruechaiyasak

In this paper, we analyze and compare various approaches for Thai word segmentation. The word segmentation approaches can be classified into two distinct types: dictionary-based (DCB) and machine-learning-based (MLB). The DCB approach relies on a set of terms for parsing and segmenting input texts, whereas the MLB approach relies on a model trained from a …
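As an illustration of the dictionary-based idea, the following is a minimal sketch of greedy longest matching, one common way to segment text against a term set. The abstract does not say which matching strategy the paper evaluates, so the function name `segment_longest_match`, the fallback for unknown characters, and the toy dictionary are all assumptions made for this example.

```python
def segment_longest_match(text, dictionary):
    """Greedy left-to-right segmentation (hypothetical helper, not from the paper).

    At each position, take the longest dictionary term that starts there.
    If no term matches, emit a single character so the scan always advances.
    """
    max_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest possible candidate first, then shrink.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                match = candidate
                break
        if match is None:
            match = text[i]  # unknown character: pass it through as its own token
        tokens.append(match)
        i += len(match)
    return tokens


# Toy dictionary (assumed for illustration only).
thai_terms = {"ฉัน", "กิน", "ข้าว"}
print(segment_longest_match("ฉันกินข้าว", thai_terms))
```

Greedy matching is simple but can pick the wrong split when terms overlap, which is one reason a trained model may be preferred.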
In this paper, a method of automatically classifying Web documents into a set of categories using the fuzzy association concept is proposed. Using the same word or vocabulary to describe different entities creates ambiguity, especially in the Web environment where the user population is large. To solve this problem, fuzzy association is used to capture the …
Traditional approaches for studying consumer behavior, such as marketing surveys and focus groups, require a large amount of time and resources. Moreover, some products, such as smartphones, have a short product life cycle. As an alternative solution, we propose a system, the Micro-blog Sentiment Analysis System (MSAS), based on sentiment analysis to …
Terrorism has led to many problems in Thai society, including not only property damage but also civilian casualties. Predicting terrorist activities in advance can help prepare for and manage the risk of sabotage. This paper proposes a framework focusing on event classification in the terrorism domain using fuzzy inference systems (FISs). Each FIS is a …
Fuzzy ontology is based on the concept that each index object is related to every other object in the ontology, with a degree of membership assigned to that relationship based on fuzzy set theory. This paper proposes use cases based on the related process of terrorism event extraction using fuzzy ontology, especially the terrorism fuzzy ontology …
We propose a feature called category browsing to enhance the full-text search function of a Thai-language news article search engine. Category browsing allows users to browse and filter search results based on predefined categories. To implement the category browsing feature, we applied and compared several text categorization algorithms …
Thai is considered an unsegmented language, in which words are written continuously without the use of word delimiters. To index Thai texts via the inverted index, a word segmentation algorithm is usually required to tokenize a text into a series of terms. Recent work on word segmentation has reported Conditional Random Fields (CRFs) as the best …
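To make the indexing step concrete, here is a minimal sketch of building an inverted index from text that is first tokenized into terms. The `tokenize` argument stands in for whatever word segmentation algorithm is used; the function name `build_inverted_index` and the sample documents are assumptions for illustration and are not taken from the paper.

```python
from collections import defaultdict


def build_inverted_index(docs, tokenize):
    """Map each term to the set of document ids containing it.

    docs: dict of {doc_id: text}
    tokenize: callable that splits a text into terms; for Thai this would be
              a word segmentation function (assumed to be supplied by the caller).
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


# Whitespace splitting is used here only as a stand-in tokenizer.
sample_docs = {1: "word segmentation", 2: "word index"}
index = build_inverted_index(sample_docs, str.split)
print(sorted(index["word"]))
```

The quality of the index depends directly on the tokenizer: a segmentation error produces terms that a query will never match.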
In this paper, we propose Thai-language-specific Web crawling as a method of selectively seeking out Web pages written in Thai. The strategy is to follow a URL with the highest probability of leading to Thai Web pages. The probability score is calculated from the example set of Web pages using a simple Naive Bayes approach. In addition, we also use a heuristic …
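The crawl strategy described above can be sketched as a Naive Bayes scorer feeding a priority queue. The abstract does not state which features the classifier uses, so this sketch assumes tokens taken from the URL string itself; the class `UrlNaiveBayes`, the helper `crawl_order`, the add-one smoothing, and the example URLs are all hypothetical.

```python
import heapq
import math
import re
from collections import Counter


def url_tokens(url):
    # Split a URL into lowercase alphanumeric tokens (assumed feature set).
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


class UrlNaiveBayes:
    """Two-class Naive Bayes over URL tokens with add-one smoothing."""

    def __init__(self, examples):
        # examples: iterable of (url, is_thai) pairs
        self.counts = {True: Counter(), False: Counter()}
        self.docs = {True: 0, False: 0}
        for url, label in examples:
            self.docs[label] += 1
            self.counts[label].update(url_tokens(url))
        self.vocab = set(self.counts[True]) | set(self.counts[False])

    def _log_joint(self, url, label):
        total = sum(self.counts[label].values())
        n_docs = self.docs[True] + self.docs[False]
        log_p = math.log((self.docs[label] + 1) / (n_docs + 2))
        for t in url_tokens(url):
            # The extra +1 in the denominator reserves mass for unseen tokens.
            log_p += math.log((self.counts[label][t] + 1) / (total + len(self.vocab) + 1))
        return log_p

    def prob_thai(self, url):
        a = self._log_joint(url, True)
        b = self._log_joint(url, False)
        m = max(a, b)  # subtract the max before exponentiating for stability
        ea, eb = math.exp(a - m), math.exp(b - m)
        return ea / (ea + eb)


def crawl_order(urls, model):
    # Max-priority queue via negated scores: most promising URL first.
    heap = [(-model.prob_thai(u), u) for u in urls]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


training = [
    ("http://news.example.co.th/politics", True),
    ("http://shop.example.co.th/items", True),
    ("http://blog.example.com/post", False),
    ("http://news.example.com/world", False),
]
model = UrlNaiveBayes(training)
print(crawl_order(["http://y.example.com/", "http://x.example.co.th/"], model))
```

A real crawler would also update the frontier as new links are discovered; this sketch only orders a fixed list of candidates.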