Tung-Hui Chiang

Learn More
In a Chinese sentence, there are no word delimiters, like blanks, between the “words”. Therefore, it is important to identify the word boundaries before processing Chinese text. Traditional approaches tend to use dictionary lookup, morphological rules and heuristics to identify the word boundaries. Such approaches may not be applied to a large system due to(More)
In this paper, a discrimination and robusmess oriented adaptive learning procedure is proposed to deal with the task of syntactic ambiguity resolution. Owing to the problem of insufficient training data and approximation error introduced by the language model, traditional statistical approaches, which resolve ambiguities by indirectly and implicitly using(More)
Statistical approaches to natural language processing generally obtain the parameters by using the maximum likelihood estimation (MLE ) method. The MLE approaches, however, may fail to achieve good performance in difficult tasks, because the discrimination and robustness issues are not taken into consideration in the estimation processes. Motivated by that(More)
In some dialogue systems, the design of dialogue strategy is bound tightly to the domain by straightforwardly hard-coding the response actions into the system. Such a paradigm is quite easy to build up a prototype system, but makes it difficult to port the system across different domains. This paper presents a domain-transparent design of dialogue(More)
Statistical NLP models usually only consider coarse information and very restricted context to make the estimation of parameters feasible. To reduce the modeling error introduced by a simplified probabilistic model, the Classitication and Regression Tree (CART) method was adopted in this paper to select more discriminative features for automatic model(More)
A Corpus-Based Statistics-Oriented (CBSO) methodology, which is an attempt to avoid the drawbacks of traditional rule-based approaches and purely statistical approaches, is introduced in this paper. Rule-based approaches, with rules induced by human experts, had been the dominant paradigm in the natural language processing community. Such approaches,(More)