Mihoko Kitamura

This paper proposes a method for finding correspondences between arbitrary-length word sequences in aligned Japanese-English parallel corpora. Translation candidates for word sequences are evaluated by a similarity measure defined in terms of the co-occurrence frequency and the independent frequency of the word sequences. The similarity measure is …
This paper describes an implementation of the collaborative translation environment 'Yakushite Net'. In 'Yakushite Net', Internet users collaborate in enhancing the dictionaries of their specialty fields, and the system thereby improves its accuracy and expands the areas it can translate. In the course of realizing this system, we encountered several technical …
This paper describes a comprehensive translation environment built on the Internet. This environment is designed not only to translate web pages but also to support translation work on the web. We first introduce the basic idea and implementation of this environment and then compare it to conventional machine translation (MT) systems available on the web and …
We participated in the SLIR, BLIR (PLIR) and MLIR subtasks of the NTCIR-4 CLIR task. Our IR system can handle queries and documents in Chinese, English and Japanese. The system utilizes multiple language resources (bilingual dictionaries, parallel corpora and machine translation systems) for query translation. We adopted the pivot language approach for C-J and …
We demonstrate a web-based machine translation environment that can be improved in terms of accuracy and scope through online collaboration by users. The environment leverages the cooperative efforts of online users for the creation of highly accurate dictionaries, enabling people with deep knowledge of a particular subject to collaborate in the enhancement …
This paper presents ongoing research on the automatic extraction of a bilingual lexicon from English-Japanese parallel corpora. The main objective of this paper is to examine various N-gram models for generating translation units for bilingual lexicon extraction. Three N-gram models, a baseline model (Bound-length N-gram) and two new models (Chunk-bound N-gram and …
The quality of machine translation depends strongly on the quantity and quality of the translation knowledge available to the system. Constructing translation knowledge by hand has inherent limitations, which calls for techniques to construct translation knowledge automatically or semi-automatically, and to integrate this translation knowledge easily. …