Web Mining for an Amharic - English Bilingual Corpus

@inproceedings{Argaw2005WebMF,
  title={Web Mining for an Amharic - English Bilingual Corpus},
  author={Atelach Alemu Argaw and Lars Asker},
  booktitle={WEBIST},
  year={2005}
}
We present recent work aimed at constructing a bilingual corpus of comparable Amharic and English news texts. The Amharic and English texts were collected from an Ethiopian news agency that publishes daily news in both languages on its web page. The Amharic texts are represented using Ethiopic script and archived according to the Ethiopian calendar. The overlap between the corresponding Amharic and English news texts in the archive is comparatively small, only…
