Vinh Van Nguyen

Though phrase-based SMT has achieved high translation quality, it still lacks the generalization ability to capture word order differences between languages. In this paper, we describe a general method for tree-to-string phrase-based SMT. We study how syntactic transformation is incorporated into phrase-based SMT and how effective it is. We design syntactic…
In this paper, we present a Conditional Random Fields (CRFs) framework for the Clause Splitting problem. We adapt the CRFs model to this problem in order to use a very large set of arbitrary, overlapping, and non-independent features. In addition, we propose the use of rich linguistic information along with a new bottom-up dynamic algorithm for decoding to…
Vietnamese accentless texts exist in parallel with official Vietnamese documents and play an important role in instant messaging, mobile SMS, and online searching. Correctly understanding these texts is not simple because of the lexical ambiguity caused by the many ways diacritics can be added to a given accentless sequence. There have been some methods for…
Reordering is a major challenge in machine translation (MT) between two languages with significant differences in word order. In this paper, we present an approach to learning reordering rules as a pre-processing step, based on a dependency parser, in phrase-based statistical machine translation (SMT) from Vietnamese to English. Dependency parser and…
Reordering is essential for phrase-based statistical machine translation (SMT). In this paper, we present a new method of reordering in phrase-based SMT. Inspired by the preprocessing reordering approach of Xia and McCord (2004), we use shallow parsing and transformation rules to reorder the source sentence. The…
The paper presents a new method for reordering in phrase-based statistical machine translation (PBMT). Our method is based on previous chunk-level reordering methods for PBMT. First, we parse the source language sentence into a chunk tree, according to the method developed by [16]. Second, we apply a series of transformation rules which are learnt…