Translation Induction on Indian Language Corpora Using Translingual Themes from Other Languages

@inproceedings{Tholpadi2015TranslationIO,
  title={Translation Induction on Indian Language Corpora Using Translingual Themes from Other Languages},
  author={Goutham Tholpadi and Chiranjib Bhattacharyya and Shirish K. Shevade},
  booktitle={CICLing},
  year={2015}
}
Identifying translations from comparable corpora is a well-known problem with several applications, e.g., dictionary creation in resource-scarce languages. The scarcity of high-quality corpora, especially in Indian languages, makes this problem hard: state-of-the-art techniques achieve a mean reciprocal rank (MRR) of 0.66 for English-Italian, but a mere 0.187 for Telugu-Kannada. Comparable corpora exist that pair many Indian languages with other “auxiliary” languages. We observe that…
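
The MRR figures quoted in the abstract can be read as follows: for each source word, the system ranks candidate translations, and the score is the average of 1/rank of the correct translation. The snippet below is a minimal sketch of that metric under stated assumptions. It is not the authors' evaluation code. The function name, the data layout, and the toy words are hypothetical, and a source word whose correct translation is missing from the candidate list is assumed to contribute 0.

```python
from typing import Dict, List


def mean_reciprocal_rank(ranked: Dict[str, List[str]], gold: Dict[str, str]) -> float:
    """Average of 1/rank of the gold translation over all source words.

    Assumption: a source word whose gold translation does not appear in its
    candidate list contributes 0 to the average.
    """
    if not gold:
        raise ValueError("gold dictionary is empty")
    total = 0.0
    for source, target in gold.items():
        candidates = ranked.get(source, [])
        if target in candidates:
            # list.index is 0-based, ranks are 1-based
            total += 1.0 / (candidates.index(target) + 1)
    return total / len(gold)


# Hypothetical toy data, not taken from the paper.
ranked = {
    "w1": ["a", "b", "c"],  # gold "a" at rank 1 contributes 1.0
    "w2": ["x", "y", "z"],  # gold "y" at rank 2 contributes 0.5
    "w3": ["p", "q"],       # gold "r" is missing, contributes 0.0
}
gold = {"w1": "a", "w2": "y", "w3": "r"}
score = mean_reciprocal_rank(ranked, gold)  # (1.0 + 0.5 + 0.0) / 3
```

Under this reading, an MRR of 0.187 means the correct translation sits, on average, far from the top of the candidate list, whereas 0.66 means it is usually ranked first or second.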


Citations

Publications citing this paper.

WikiDocsAligner: An Off-the-Shelf Wikipedia Documents Alignment Tool

  • Motaz Saad, Basem O. Alijla
  • Computer Science
  • 2017 Palestinian International Conference on Information and Communication Technology (PICICT)
  • 2017
