In this paper, an objective quantitative quality measure is proposed to evaluate the performance of machine translation systems. The proposed method is to compare the raw translation output of an MT system with the final revised version for the customers, and then compute the editing efforts required to convert the raw translation to the final version. In(More)
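The abstract above is truncated, so the exact measure the authors use is not shown here. As a rough illustration of the general idea (counting the edits needed to turn a raw translation into its revised version), the sketch below uses a word-level Levenshtein distance. The function name `edit_operations`, the whitespace tokenization, and the unit cost per operation are all assumptions made for this example, not details taken from the paper.

```python
def edit_operations(raw: str, revised: str) -> int:
    """Minimum number of word insertions, deletions, and substitutions
    needed to turn `raw` into `revised` (word-level Levenshtein distance).

    Assumption: whitespace tokenization and a cost of 1 per operation.
    """
    src, tgt = raw.split(), revised.split()
    # prev[j] = cost of converting the words seen so far in src
    # into the first j words of tgt.
    prev = list(range(len(tgt) + 1))
    for i, s in enumerate(src, 1):
        curr = [i]  # converting i source words into an empty target: i deletions
        for j, t in enumerate(tgt, 1):
            substitution = prev[j - 1] + (0 if s == t else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    raw_output = "the cat sat on mat"
    final_version = "the cat sat on the mat"
    print(edit_operations(raw_output, final_version))
```

A lower count would suggest that less post-editing was needed. Dividing by the length of the revised text is one possible way to compare sentences of different lengths, though that normalization is likewise an assumption.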
The character-based tagging approach is a dominant technique for Chinese word segmentation, and both discriminative and generative models can be adopted in that framework. However, generative and discriminative character-based approaches are significantly different and complement each other. A simple joint model combining the character-based generative(More)
Since statistical machine translation (SMT) and translation memory (TM) complement each other in matched and unmatched regions, integrated models are proposed in this paper to incorporate TM information into phrase-based SMT. Unlike previous multi-stage pipeline approaches, which directly merge TM results into the final output, the proposed models refer to(More)
An automatic treebank conversion method is proposed in this paper to convert a treebank into another treebank. A new treebank associated with a different grammar can be generated automatically from the old one such that the information in the original treebank can be transformed to the new one and be shared among different research communities. The simple(More)
An unsupervised iterative approach for extracting a new lexicon (or unknown words) from a Chinese text corpus is proposed in this paper. Instead of using a non-iterative segmentation-merging-filtering-and-disambiguation approach, the proposed method iteratively integrates the contextual constraints (among word candidates) and a joint character association(More)
In natural language processing, ambiguity resolution is a central issue, and can be regarded as a preference assignment problem. In this paper, a Generalized Probabilistic Semantic Model (GPSM) is proposed for preference computation. An effective semantic tagging procedure is proposed for tagging semantic features. A semantic score function is derived based(More)
This paper provides a simple definition of robustness for filters and smoothers, and shows that a certain class of nonlinear filters and smoothers is robust according to the given definition. An extensive Monte Carlo study, employing Monte Carlo swindle techniques and two-situation deficiency plots, is used to establish a good choice of robustifying(More)