Phrase translation using a bilingual dictionary and n-gram data: A case study from Vietnamese to English

Khang Nhut Lam, Feras Al Tarouti, Jugal Kumar Kalita
Past dictionary-based approaches to translating a phrase from a language L1 into a language L2 require grammar rules to restructure the initial translations. This paper introduces a novel method that uses no grammar rules to translate a given L1 phrase, one that does not exist in the dictionary, into L2. The method requires at least one L1-L2 bilingual dictionary and n-gram data in L2. The average manual evaluation score of our translations is 4.29/5.00, which implies very high quality.
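
The abstract names only the two resources involved, a bilingual dictionary and L2 n-gram data, and does not spell out the procedure. The following is a minimal sketch of one way such resources could be combined without grammar rules; it is not the authors' algorithm. It assumes that candidates are built by word-by-word dictionary lookup, that every word order is tried, and that the candidate whose adjacent word pairs are most frequent in the L2 bigram data wins. The dictionary entries, the bigram counts, and all function names are hypothetical and chosen for illustration only.

```python
from itertools import permutations, product

# Hypothetical toy data: a tiny Vietnamese-English dictionary and made-up
# English bigram counts. Neither comes from the paper.
TOY_DICT = {
    "sách": ["book"],
    "mới": ["new", "fresh"],
}
TOY_BIGRAM_COUNTS = {
    ("new", "book"): 120,
    ("fresh", "book"): 3,
    ("book", "new"): 2,
}


def candidate_translations(words, dictionary):
    """Yield every word-by-word translation in every possible word order."""
    options = [dictionary[w] for w in words]
    for choice in product(*options):
        for order in permutations(choice):
            yield order


def bigram_score(tokens, counts):
    """Sum the counts of adjacent word pairs; unseen pairs contribute 0."""
    return sum(counts.get(pair, 0) for pair in zip(tokens, tokens[1:]))


def translate_phrase(phrase, dictionary=TOY_DICT, counts=TOY_BIGRAM_COUNTS):
    """Return the best-scoring candidate, or None if any word is not in the dictionary."""
    words = phrase.split()
    if any(w not in dictionary for w in words):
        return None
    best = max(candidate_translations(words, dictionary),
               key=lambda cand: bigram_score(cand, counts))
    return " ".join(best)
```

With the toy data, the two-word Vietnamese phrase built from the dictionary keys (noun followed by adjective) is reordered to "new book" purely because that bigram has the highest count, with no reordering rule involved. Enumerating all permutations grows factorially with phrase length, so this sketch is only practical for short phrases.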

Translation selection for Japanese-English noun-noun compounds

This work presents a method for compositionally translating Japanese NN compounds into English, using a word-level transfer dictionary and target language monolingual corpus, and demonstrates that interpolation over the two data types is superior to using either one.

Phrasal Transfer Model for Vietnamese-english Machine Translation

Morphological differences between Vietnamese and English not only affect the translation process at the morphological level but also strongly change the structure of the translated sentence. Actually, in

Identifying bilingual Multi-Word Expressions for Statistical Machine Translation

A strategy for detecting translation pairs of MWEs in a French-English parallel corpus and three methods aiming to integrate extracted bilingual MWEs in MOSES, a phrase-based Statistical Machine Translation (SMT) system, are described.

N-gram-based Machine Translation

This article describes in detail an n-gram approach to statistical machine translation. This approach consists of a log-linear combination of a translation model based on n-grams of bilingual units,

Feature-Rich Statistical Translation of Noun Phrases

A dedicated noun phrase translation subsystem is built that improves over the currently best general statistical machine translation methods by incorporating special modeling and special features.

Base Noun Phrase Translation Using Web Data and the EM Algorithm

Experimental results indicate that the coverage and accuracy of the method are significantly better than those of the baseline methods relying on existing technologies.

A Probabilistic Model of Compound Nouns

A probabilistic model for syntactically analysing compound noun structures is developed, based on knowledge of affinities between nouns that can be acquired from a corpus.

Multiword Expressions: A Pain in the Neck for NLP

The various kinds of multiword expressions should be analyzed in distinct ways, including listing "words with spaces", hierarchically organized lexicons, restricted combinatoric rules, lexical selection, "idiomatic constructions" and simple statistical affinity.

A Semantic Network of English Verbs

This chapter contains sections titled: The Organization of Verbs in WordNet, Lexical and Semantic Relations Among Verbs and Synsets, Polysemy, Testing the Psychological Validity of the WordNet Model,

A Machine Learning Approach to Multiword Expression Extraction

This paper describes participation in the MWE 2008 evaluation campaign, which focused on ranking MWE candidates, and reports a significant performance improvement from methods that combine multiple association measures.