Zhengxian Gong

Statistical machine translation systems are usually trained on a large number of bilingual sentence pairs and translate one sentence at a time, ignoring document-level information. In this paper, we propose a cache-based approach to document-level translation. Since caches mainly depend on relevant data to supervise subsequent decisions, it is critical to ...
Topic modeling is a popular framework for analyzing large text collections. Previous work on incorporating topic modeling into statistical machine translation mainly depends on one major topic of the test document. In contrast, the proposed approaches cover not only the major topic but also sub-topics. The basic idea of this paper is ...
Tense is a small element of a sentence; however, a tense error can produce odd grammar and lead to misunderstanding. Recently, tense has drawn attention in many natural language processing applications. However, most current Statistical Machine Translation (SMT) systems mainly depend on the translation model and the language model. They never consider and make ...
The information service in the grid provides the ability to discover and monitor resources, which is fundamental to the grid infrastructure. This paper proposes a framework for a tree-based grid information service (TGIS). The framework is based on GT's MDS. MDS uses a centralized GIIS to index services. A performance study shows that GIIS cannot sustain ...
Currently, much research focuses on grid scheduling, and more and more scheduling algorithms have been proposed. However, those algorithms do not satisfy the requirements of the grid because they ignore its characteristics, such as dynamics, autonomy, and distribution. Therefore, this paper proposes an adaptable dynamic job scheduling approach based on ...
Document-level Machine Translation (MT) has been drawing more and more attention due to its potential for resolving sentence-level ambiguities and inconsistencies with the benefit of wide-range context. However, the lack of simple yet effective evaluation metrics largely impedes the development of such document-level MT systems. This paper proposes to improve ...
Current Statistical Machine Translation (SMT) is significantly affected by Machine Translation (MT) evaluation metrics. The emergence of document-level MT research increases the demand for corresponding evaluation metrics. This paper proposes two superior yet low-cost quantitative objective methods to enhance traditional MT metrics by modeling ...
In gender classification, labeled data is often limited while unlabeled data is ample. This motivates semi-supervised learning for gender classification to improve the performance by exploring the knowledge in both labeled and unlabeled data. In this paper, we propose a semi-supervised approach to gender classification by leveraging textual features and a ...