Bartholomäus Wloka

We present a method for improving existing statistical machine translation methods using a knowledge base compiled from a bilingual corpus, as well as sequence alignment and pattern matching techniques from the areas of machine learning and bioinformatics. An alignment algorithm identifies similar sentences, which are then used to construct a better word …
In this paper we introduce a comprehensive framework for a ubiquitous translation and language learning environment utilizing the capabilities of modern cell phone technology. We present a partial first realization of this framework: an application for learning Japanese characters and Japanese-English translation; the latter is based on results from our …
In this article, we describe our experience with deploying our previous work on machine translation and language learning on mobile Internet platforms, i.e. smartphones. We present KANTEAM – KAnji TEAcher Mobile, implemented on the Samsung Galaxy Tab with Google's operating system Android, and UTROLL – Ubiquitous Translation and Language Learning …
Collaborative research, preservation of data, common access to information and, more recently, a rapidly growing amount of data have been issues in disciplines in which the studied material is located in many places. This is especially the case in the humanities. Digital libraries are a step towards turning hard copies into digitally available material. …
Bilingual corpora play an important role as resources not only for machine translation research and development but also for studying tasks in comparative linguistics. Manual annotation of word alignments is important for providing a gold standard for developing and evaluating machine translation models and comparative linguistics tasks. This paper …
In this paper we present a workflow for the automated creation of parallel domain-specific corpora, i.e. multilingual translated text collections of a certain domain, in which the text pieces are aligned at sentence level. Wikipedia articles serve as the source for the text extraction. This workflow will be adaptable to any language pair, though the first …