Online Large-Margin Training for Statistical Machine Translation
- Taro Watanabe, Jun Suzuki, Hajime Tsukada, Hideki Isozaki
- Computer Science · Conference on Empirical Methods in Natural Language Processing
- 1 June 2007
Experiments on Arabic-to-English translation indicated that a model trained with sparse binary features outperformed a conventional SMT system that uses a small number of features.
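The online large-margin training described above can be sketched as a passive-aggressive (PA-I style) update over sparse binary features; the feature names, the margin rule, and the `C` parameter below are illustrative, not the paper's exact formulation.

```python
# Minimal sketch of an online large-margin weight update over sparse
# binary features (hypothetical feature names; PA-I style step size).

def margin_update(weights, oracle_feats, hyp_feats, loss, C=1.0):
    """Move weights toward the oracle translation's features.

    weights      : dict feature -> float (updated in place and returned)
    oracle_feats : sparse binary features of the oracle hypothesis
    hyp_feats    : sparse binary features of the model-best hypothesis
    loss         : task loss of the model-best hypothesis (e.g. 1 - BLEU)
    """
    # Difference vector between oracle and model-best features.
    diff = {}
    for f in set(oracle_feats) | set(hyp_feats):
        diff[f] = oracle_feats.get(f, 0) - hyp_feats.get(f, 0)
    # Margin violation: loss minus the current score difference.
    score_diff = sum(weights.get(f, 0.0) * v for f, v in diff.items())
    violation = loss - score_diff
    if violation <= 0:
        return weights  # margin already satisfied, no update
    norm = sum(v * v for v in diff.values())
    if norm == 0:
        return weights
    # Step size clipped by the aggressiveness parameter C.
    tau = min(C, violation / norm)
    for f, v in diff.items():
        weights[f] = weights.get(f, 0.0) + tau * v
    return weights
```

Because the features are sparse, each update touches only the features that fire on the two hypotheses, which keeps the per-sentence cost small even with millions of binary features.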
Inducing a Discriminative Parser to Optimize Machine Translation Reordering
- Graham Neubig, Taro Watanabe, Shinsuke Mori
- Computer Science · Conference on Empirical Methods in Natural Language Processing
- 12 July 2012
This paper proposes a method for learning a discriminative parser for machine translation reordering using only aligned parallel text. This is done by treating the parser's derivation tree as a…
An Unsupervised Model for Joint Phrase Alignment and Extraction
- Graham Neubig, Taro Watanabe, E. Sumita, Shinsuke Mori, Tatsuya Kawahara
- Computer Science · Annual Meeting of the Association for Computational Linguistics
- 19 June 2011
An unsupervised model for joint phrase alignment and extraction using non-parametric Bayesian methods and inversion transduction grammars (ITGs) is presented, which matches the accuracy of the traditional two-step word alignment/phrase extraction approach while reducing the phrase table to a fraction of its original size.
Denoising Neural Machine Translation Training with Trusted Data and Online Data Selection
- Wei Wang, Taro Watanabe, Macduff Hughes, Tetsuji Nakagawa, Ciprian Chelba
- Computer Science · Conference on Machine Translation
- 31 August 2018
This work presents methods for measuring and selecting data for domain MT, applies them to denoising NMT training, and shows their significant effectiveness for training NMT on data with severe noise.
Bilingual Lexicon Extraction from Comparable Corpora Using Label Propagation
- Akihiro Tamura, Taro Watanabe, E. Sumita
- Computer Science · Conference on Empirical Methods in Natural Language Processing
- 12 July 2012
A novel method for lexicon extraction is proposed that extracts translation pairs from comparable corpora using graph-based label propagation, achieving improved performance by clustering synonyms into the same translation.
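The graph-based label propagation behind this approach can be sketched as follows; the word graph, edge weights, and translation labels here are toy illustrations, not the paper's actual data or propagation schedule.

```python
# Toy sketch of label propagation: translation labels spread from seed
# words to unlabeled words over a weighted similarity graph.

def propagate(edges, seed_labels, iters=50):
    """edges: node -> {neighbor: weight}; seed_labels: node -> {label: prob}."""
    labels = {n: dict(d) for n, d in seed_labels.items()}
    for _ in range(iters):
        new_labels = {}
        for node, nbrs in edges.items():
            if node in seed_labels:
                # Seed nodes keep their labels clamped.
                new_labels[node] = dict(seed_labels[node])
                continue
            acc, total = {}, 0.0
            for nbr, w in nbrs.items():
                for lab, p in labels.get(nbr, {}).items():
                    acc[lab] = acc.get(lab, 0.0) + w * p
                total += w
            if total > 0:
                # Weighted average of neighbors' label distributions.
                new_labels[node] = {lab: p / total for lab, p in acc.items()}
        labels = new_labels
    return labels
```

Since synonymous words share neighbors in the graph, they tend to converge to the same label distribution, which is the effect the clustering claim above relies on.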
Reordering Constraints for Phrase-Based Statistical Machine Translation
- R. Zens, H. Ney, Taro Watanabe, E. Sumita
- Computer Science · International Conference on Computational Linguistics
- 23 August 2004
This work investigates different reordering constraints for phrase-based statistical machine translation, namely the IBM constraints and the ITG constraints, and presents efficient dynamic programming algorithms for both.
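The IBM constraint in particular has a compact operational form: at each step, the decoder may only cover one of the first k still-uncovered source positions (k = 4 is the usual window). A small sketch of a checker for a candidate source-position visiting order, with an illustrative interface:

```python
# Sketch of the IBM reordering constraint: each step may only pick one
# of the first k uncovered source positions.

def satisfies_ibm_constraint(order, k=4):
    """order: sequence of distinct source positions in visiting order."""
    uncovered = sorted(set(order))
    for pos in order:
        # Positions eligible at this step: the first k uncovered ones.
        if pos not in uncovered[:k]:
            return False
        uncovered.remove(pos)
    return True
```

The dynamic programming algorithms in the paper exploit exactly this locality: a decoder state only needs to track a bounded window of coverage, rather than an arbitrary coverage bitvector.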
Transition-based Neural Constituent Parsing
- Taro Watanabe, E. Sumita
- Computer Science · Annual Meeting of the Association for Computational Linguistics
- 1 July 2015
This work proposes a neural network structure that explicitly models the unbounded history of actions performed on the stack and queue employed in transition-based parsing, in addition to the representations of partially parsed tree structure.
Left-to-Right Target Generation for Hierarchical Phrase-Based Translation
- Taro Watanabe, Hajime Tsukada, Hideki Isozaki
- Computer Science · Annual Meeting of the Association for Computational Linguistics
- 17 July 2006
A hierarchical phrase-based statistical machine translation method is presented in which a target sentence is efficiently generated in left-to-right order, enabling straightforward integration with n-gram language models.
Recurrent Neural Networks for Word Alignment Model
- Akihiro Tamura, Taro Watanabe, E. Sumita
- Computer Science · Annual Meeting of the Association for Computational Linguistics
- 1 June 2014
A word alignment model based on a recurrent neural network (RNN) is presented, in which an unlimited alignment history is represented by recurrently connected hidden layers; it outperforms the feed-forward neural network-based model as well as IBM Model 4 on Japanese-English and French-English word alignment tasks.
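The core of such an RNN alignment model is a hidden state that folds the entire alignment history into a fixed-size vector. A minimal sketch of that recurrence, with illustrative dimensions and untrained weights (the paper's actual architecture and features differ):

```python
import numpy as np

# Sketch of the recurrent history state: each alignment decision's
# feature vector x updates h, which conditions the next decision.

def recurrent_history(jumps, W, U, b):
    """Fold a sequence of alignment-jump feature vectors into one state.

    jumps : iterable of input vectors, one per alignment decision
    W, U  : input-to-hidden and hidden-to-hidden weight matrices
    b     : hidden bias vector
    """
    h = np.zeros(U.shape[0])
    for x in jumps:
        h = np.tanh(W @ x + U @ h + b)
    return h
```

Because `h` is updated at every step rather than truncated to a fixed window, the model conditions on an unbounded history, which is what distinguishes it from the feed-forward baseline mentioned above.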
Language Model Adaptation with Additional Text Generated by Machine Translation
- Hideharu Nakajima, H. Yamamoto, Taro Watanabe
- Computer Science · International Conference on Computational Linguistics
- 24 August 2002
This paper proposes a novel scheme that generates a small corpus in the language of the model by machine translation of a corpus in another language, and shows that the resulting language model improvement was about half of that obtained with a human-collected corpus.