In this paper, we consider signals originating from a sequence of sources. More specifically, we address the problems of segmenting such signals and relating the segments to their sources. This issue has wide applications in many fields. This report describes a resolution method based on an ergodic Hidden Markov Model (HMM), in which each HMM state …
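The segmentation idea above can be sketched with a toy ergodic HMM decoded by the Viterbi algorithm: each state stands for one source, and contiguous runs of a single state in the decoded path form the segments. All probabilities below are illustrative assumptions, not values from the paper.

```python
import math

states = ["A", "B"]                     # hypothetical sources
start = {"A": 0.5, "B": 0.5}
trans = {"A": {"A": 0.9, "B": 0.1},     # ergodic: every state can reach every other
         "B": {"A": 0.1, "B": 0.9}}
emit = {"A": {0: 0.8, 1: 0.2},          # assumed emission probabilities
        "B": {0: 0.2, 1: 0.8}}

def viterbi(obs):
    """Most likely state sequence; runs of one state correspond to segments."""
    V = [{s: math.log(start[s]) + math.log(emit[s][obs[0]]) for s in states}]
    back = []
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            best = max(states, key=lambda p: V[-1][p] + math.log(trans[p][s]))
            col[s] = V[-1][best] + math.log(trans[best][s]) + math.log(emit[s][o])
            ptr[s] = best
        V.append(col)
        back.append(ptr)
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

print(viterbi([0, 0, 0, 1, 1, 1]))  # -> ['A', 'A', 'A', 'B', 'B', 'B']
```

Here the decoded path splits the observation sequence into two segments, one attributed to each source.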
We describe our statistical machine translation system, which uses a large Japanese-English parallel corpus and long phrase tables. We collected 698,973 Japanese-English parallel sentences and used long phrase tables. We also utilized general tools for statistical machine translation, such as "Giza++" [1], "moses" [2], and …
We developed a two-stage machine translation (MT) system. The first stage is an automatically created pattern-based machine translation system, and the second stage is a standard statistical machine translation (SMT) system. For French-English machine translation, we first used a French-English pattern-based MT and obtained "English" …
We have developed a two-stage machine translation (MT) system. The first stage is a rule-based machine translation system, and the second stage is a standard statistical machine translation system. For Chinese-English machine translation, we first used a Chinese-English rule-based MT and obtained "ENGLISH" sentences from the Chinese sentences. Second, we used …
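The two-stage architecture described in the abstracts above can be sketched as two composed passes. The rules and the second-stage "decoder" here are hypothetical placeholders: a real system would use a full rule-based engine for stage one and an SMT decoder such as moses for stage two.

```python
# Hypothetical stage-one rules mapping romanized Chinese phrases to English.
RULES = {"ni hao": "hello", "shi jie": "world"}

def rule_based_pass(sentence: str) -> str:
    """Stage 1: rewrite every source phrase the rule set covers."""
    out = sentence
    for src, tgt in RULES.items():
        out = out.replace(src, tgt)
    return out

def smt_pass(sentence: str) -> str:
    """Stage 2 placeholder: a real SMT decoder would rescore and reorder
    the intermediate "ENGLISH" output of stage one."""
    return sentence.capitalize()

def translate(sentence: str) -> str:
    return smt_pass(rule_based_pass(sentence))

print(translate("ni hao shi jie"))  # -> "Hello world"
```

The point of the design is that stage one produces rough but structurally English output, which the statistical stage then treats as its "source" language.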
Abstract. A large-scale sentence pattern dictionary (SP-dictionary) for Japanese compound and complex sentences has been developed. The dictionary was compiled based on the non-compositional language model. Sentences with 2 or 3 predicates were extracted from a Japanese-to-English parallel corpus of 1 million sentences, and the compositional constituents …
In this study, we focused on the reliability of the phrase table. We had been using phrase tables built with Och's method [2], but this method sometimes generates completely wrong phrase table entries. We found that such entries were caused by long parallel sentences, so we removed those long parallel sentences from the training data. We also utilized general …
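The cleanup step described above can be sketched as a simple length filter applied to the parallel corpus before training: pairs where either side exceeds a token limit are dropped, since they tend to produce bad phrase alignments. The threshold of 40 tokens is an assumed value, not one stated in the abstract.

```python
MAX_TOKENS = 40  # assumed length limit per side

def filter_parallel(pairs):
    """Keep only (ja, en) pairs where both sides are within the token limit."""
    return [(ja, en) for ja, en in pairs
            if len(ja.split()) <= MAX_TOKENS and len(en.split()) <= MAX_TOKENS]

corpus = [
    ("watashi wa gakusei desu", "i am a student"),
    ("x " * 50, "y " * 50),          # an overlong pair that gets removed
]
print(len(filter_parallel(corpus)))  # -> 1
```

In a real pipeline this filter would run on the corpus files before word alignment (e.g. before Giza++), so the bad pairs never reach phrase extraction.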