Aaron B. Phillips

The Cunei machine translation platform is an open-source system for data-driven machine translation. Our platform is a synthesis of the traditional example-based MT (EBMT) and statistical MT (SMT) paradigms. What makes Cunei unique is that it measures the relevance of each translation instance with a distance function. This distance function, represented as ...
Example-Based Machine Translation (EBMT) is limited by the quantity and scope of its training data. Even with a reasonably large corpus, we will not have examples that cover everything we want to translate. This problem is especially severe in Arabic due to its rich morphology. We demonstrate a novel method that exploits the regular nature of Arabic ...
The Cunei Machine Translation Platform is an open-source MT system designed to model instances of translation. One of the challenges to this approach is effective training. We describe two techniques that improve the training procedure and allow us to leverage the strengths of instance-based modeling. First, during training we approximate our model with a ...
Machine translation has advanced considerably in recent years, primarily due to the availability of larger datasets. However, one cannot rely on the availability of copious, high-quality bilingual training data. In this work, we improve upon the state of the art in machine translation with an instance-based model that scores each instance of translation in ...
Acknowledgment: This work is supported, in part, by the Human Language Technology Center of Excellence. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the sponsor.
Machine translation has advanced considerably in recent years, but primarily due to the availability of larger data sets. Translation of low-frequency phrases and resource-poor languages is still a serious problem. In this work, we explore a deeper integration of context, structure, and similarity within machine translation. Instead of modeling phrase pairs ...