Source Language Markers in EUROPARL Translations

@inproceedings{Halteren2008SourceLM,
  title={Source Language Markers in EUROPARL Translations},
  author={Hans van Halteren},
  booktitle={COLING},
  year={2008}
}
This paper shows that it is very often possible to identify the source language of medium-length speeches in the EUROPARL corpus on the basis of frequency counts of word n-grams (87.2% to 96.7% accuracy, depending on the classification method). The paper also examines in detail which positive markers are most powerful, and it identifies a number of linguistic markers as well as culture- and domain-related ones.
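
To make the idea of classifying by word n-gram frequency counts concrete, here is a minimal sketch in Python. It is not the paper's method: the abstract does not say which classifiers were used, so the add-one-smoothed log-likelihood scoring below is a stand-in chosen for brevity. The function names (`word_ngrams`, `train`, `classify`), the labels `nl` and `fr`, and the two toy sentences are all invented for illustration and are not taken from EUROPARL.

```python
from collections import Counter
import math


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max as tuples of lowercased tokens."""
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def train(labeled_texts, n_max=2):
    """Build one n-gram frequency profile (a Counter) per source-language label."""
    profiles = {}
    for label, text in labeled_texts:
        profiles.setdefault(label, Counter()).update(word_ngrams(text, n_max))
    return profiles


def classify(text, profiles, n_max=2):
    """Return the label whose profile gives the highest smoothed log-likelihood.

    Assumption: add-one smoothing over the joint n-gram vocabulary, used here
    only as a simple stand-in for whatever classifiers the paper evaluates.
    """
    vocab = set().union(*profiles.values())
    grams = word_ngrams(text, n_max)

    def score(counts):
        total = sum(counts.values()) + len(vocab) + 1
        return sum(math.log((counts[g] + 1) / total) for g in grams)

    return max(profiles, key=lambda label: score(profiles[label]))


# Hypothetical toy data: English sentences imagined as translations from two
# different source languages. Real experiments would use many full speeches.
toy_training = [
    ("nl", "we must on the other hand take care that the rules are clear"),
    ("fr", "it is necessary that we take into account the rules in question"),
]
profiles = train(toy_training)
predicted = classify("on the other hand the rules are clear", profiles)
```

With realistic amounts of data, the n-grams whose counts differ most between profiles are the kind of candidate a marker analysis would start from; this sketch only shows the counting and scoring step.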
