Topic-based term translation models for statistical machine translation

  • Deyi Xiong, Fandong Meng, Qun Liu
    Artif. Intell.
A system for terminology extraction and translation equivalent detection in real time
In this paper we present a system for automatic terminology extraction and automatic detection of equivalent terms in the target language, to be used alongside a computer-assisted translation
Networked Artificial Intelligence English Translation System Based on an Intelligent Knowledge Base and Translation Method Thereof
This paper explains the principle of intelligent knowledge-based translation and its advantage over the traditional translation method based on lexical structure analysis, and verifies the effect of the smoothing algorithm, which supports the effectiveness of the system.
Machine translation of domain-specific expressions within ontologies and documents
This dissertation examines the translation of domain-specific expressions represented in semantically structured resources or documents and presents a domain-aware machine translation system to automatically translate the labels.
Detection of Verbal Multi-Word Expressions via Conditional Random Fields with Syntactic Dependency Features and Semantic Re-Ranking
This paper presents an option to re-rank the 10 best CRF-predicted sequences via semantic vectors, which boosts the system's scores above other systems in the competition, and argues for a more purpose-specific evaluation scheme.
Development of Automatic English Translation System Based on Fuzzy Matching and Software Simulation
  • Q. Tan
  • Computer Science
    Mobile Information Systems
  • 2022
A translation algorithm that takes the changes in software requirements as the basis for an automatic English translation system based on software change management and 5G networks will further enhance the intelligence and automation of English translation.
Cooperative Hierarchical Dirichlet Processes: Superposition vs. Maximization
Leveraging bilingual terminology to improve machine translation in a CAT environment
This work evaluates the proposed framework that, taking as input a small set of parallel documents, gathers domain-specific bilingual terms and injects them into an SMT system to enhance translation quality. It also compares two terminology injection methods that can be easily used at run-time without altering the normal activity of an SMT system.


Modeling Term Translation for Document-informed Machine Translation
This paper investigates three issues of term translation in the context of document-informed SMT and proposes three corresponding models that can achieve significant improvements over the baseline and evaluates their effectiveness on NIST Chinese-English translation tasks with large-scale training data.
Topic-Based Coherence Modeling for Statistical Machine Translation
  • Deyi Xiong, Min Zhang, Xing Wang
  • Computer Science, Physics
    IEEE/ACM Transactions on Audio, Speech, and Language Processing
  • 2015
This paper proposes topic-based coherence models to produce coherence for document translation, in terms of the continuity of sentence topics in a text, and integrates them into a state-of-the-art phrase-based machine translation system.
A Topic-Based Coherence Model for Statistical Machine Translation
This paper proposes a topic-based coherence model to produce coherence for document translation, in terms of the continuity of sentence topics in a text, and adopts a maximum entropy classifier to predict the target coherence chain that defines a linear topic structure for the target document.
Translation Model Adaptation for Statistical Machine Translation with Monolingual Topic Information
This paper proposes a novel approach for translation model adaptation by utilizing in-domain monolingual topic information instead of the in-domain bilingual corpora, which incorporates the topic information into translation probability estimation.
Statistical Phrase-Based Translation
The empirical results suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translation.
A Topic Similarity Model for Hierarchical Phrase-based Translation
This work proposes a topic similarity model to exploit topic information at the synchronous rule level for hierarchical phrase-based translation, and shows that this model significantly improves the translation performance over the baseline on NIST Chinese-to-English translation experiments.
Dynamic Topic Adaptation for Phrase-based MT
This work explores topic adaptation on a diverse data set and presents a new bilingual variant of Latent Dirichlet Allocation to compute topic-adapted, probabilistic phrase translation features, and dynamically infer document-specific translation probabilities for test sets of unknown origin.
Modeling Lexical Cohesion for Document-Level Machine Translation
Three different models to capture lexical cohesion for document-level machine translation are proposed: a direct reward model, a conditional probability model, and a mutual information trigger model. All three models achieve substantial improvements over the baseline.
Enhancing statistical machine translation with bilingual terminology in a CAT environment
This paper develops a framework that, taking as input a small amount of parallel in-domain data, gathers domain-specific bilingual terms and injects them in an SMT system to enhance the translation productivity.
Post-MT Term Swapper: Supplementing a Statistical Machine Translation System with a User Dictionary
A way to identify terminology translations from MT output and automatically swap them with user-defined translations is proposed, which can be applied to any type of MT system and which shows high coverage and positive impact on the overall MT quality.