Current word alignment models for statistical machine translation do not address morphology beyond merely splitting words. We present a two-level alignment model that distinguishes between words and morphemes, in which we embed an IBM Model 1 inside an HMM-based word alignment model. The model jointly induces word and morpheme alignments using an EM…
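As background for the abstract above, the following is a minimal sketch of EM training for plain IBM Model 1, the component that the abstract says is embedded inside the HMM. It is not the paper's two-level model: the function name `train_ibm_model1`, the `NULL` token, the uniform initialisation, and the toy bitext are all assumptions made for illustration.

```python
from collections import defaultdict

NULL = "<null>"  # hypothetical empty-word token, so a word may align to nothing


def train_ibm_model1(bitext, iterations=10):
    """Estimate lexical translation probabilities t[(f, e)] ~ P(f | e).

    bitext: list of (e_sentence, f_sentence) pairs, each a list of tokens.
    """
    f_vocab = {f for _, f_sent in bitext for f in f_sent}
    uniform = 1.0 / len(f_vocab)
    t = defaultdict(lambda: uniform)  # uniform start (an assumption)

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e aligned to anything

        # E-step: distribute each f token over the e tokens of its sentence.
        for e_sent, f_sent in bitext:
            e_sent = [NULL] + list(e_sent)
            for f in f_sent:
                z = sum(t[(f, e)] for e in e_sent)
                for e in e_sent:
                    posterior = t[(f, e)] / z
                    count[(f, e)] += posterior
                    total[e] += posterior

        # M-step: renormalise expected counts into probabilities.
        t = defaultdict(float)
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]

    return t


if __name__ == "__main__":
    toy = [
        (["the", "house"], ["das", "haus"]),
        (["the", "book"], ["das", "buch"]),
        (["a", "book"], ["ein", "buch"]),
    ]
    table = train_ibm_model1(toy)
    for f in ("das", "haus", "buch"):
        print(f, "given 'the':", round(table[(f, "the")], 3))
```

Model 1 ignores word order entirely, which is why it is often paired with an HMM that adds a distortion component; how the abstract's model combines the two levels is not specified in the text available here.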
We apply multi-rate HMMs, a tree-structured HMM model, to the word-alignment problem. Multi-rate HMMs allow us to model reordering at both the morpheme level and the word level in a hierarchical fashion. This approach leads to better machine translation results than a morpheme-aware model that does not explicitly model morpheme reordering.
All permutations of a two-level embedding sentence in Turkish are analyzed in order to develop an LTAG grammar that can account for Turkish long-distance dependencies. The fact that Turkish allows only long-distance topicalization and extraposition is shown to be connected to a condition (the coherence condition) that draws the boundary between the acceptable…