Building a Test Collection for Sorani Kurdish

Kyumars Sheykh Esmaili, Donya Eliassi, Shahin Salavati, Purya Aliabadi, Asrin Mohammadi, Somayeh Yosefi, Shownem Hakimi
2013 ACS International Conference on Computer Systems and Applications (AICCSA)
Despite having a large number of speakers, Sorani, one of the two principal branches of the Kurdish language, is among the less-resourced languages. This paper reports on the outcomes of a project aimed at providing the essential resources for processing Sorani texts. The primary output of this project is Pewan, the first standard test collection for evaluating Sorani Information Retrieval systems. The other language resources constructed in this project are: (i) a light-stemmer…
