Vinh Van Nguyen

Although phrase-based SMT has achieved high translation quality, it still lacks the generalization ability to capture word order differences between languages. In this paper, we describe a general method for tree-to-string phrase-based SMT. We study how syntactic transformation is incorporated into phrase-based SMT and how effective it is. We design syntactic...
In this paper, we present a Conditional Random Fields (CRFs) framework for the clause splitting problem. We adapt the CRF model to this problem in order to use very large sets of arbitrary, overlapping, and non-independent features. In addition, we propose the use of rich linguistic information along with a new bottom-up dynamic algorithm for decoding to...
Accentless Vietnamese texts exist in parallel with official Vietnamese documents and play an important role in instant messaging, mobile SMS, and online search. Understanding these texts correctly is not simple because of the lexical ambiguity that arises from the many ways diacritics can be added to a given accentless sequence. There have been some methods for...
Reordering is of essential importance for phrase-based statistical machine translation (SMT). In this paper, we present a new method of reordering in phrase-based SMT. Inspired by Xia and McCord (2004), we take a preprocessing approach to reordering. We use shallow parsing and transformation rules to reorder the source sentence. The...