No more beating about the bush : A Step towards Idiom Handling for Indian Language NLP

@inproceedings{Agrawal2018NoMB,
  title={No more beating about the bush : A Step towards Idiom Handling for Indian Language NLP},
  author={Ruchit Agrawal and Vighnesh Chenthil Kumar and Vigneshwaran Muralidaran and Dipti Misra Sharma},
  booktitle={LREC},
  year={2018}
}
One of the major challenges in the field of Natural Language Processing (NLP) is the handling of idioms: seemingly ordinary phrases which can be further conjugated or even spread across the sentence to fit the context. Since idioms are a part of natural language, the ability to tackle them brings us closer to creating efficient NLP tools. This paper presents a multilingual parallel idiom dataset for seven Indian languages in addition to English and demonstrates its usefulness for two NLP…

Key Quantitative Results

  • We conclude that phrase-based SMT handles idiomatic sentences better than Neural Machine Translation, producing an average increase of 2.69% in BLEU score over a baseline model trained on the same corpus.