- Full text PDF available (2)
Data Set Used
Despite having a large number of speakers, Sorani - one of the two principle branches of the Kurdish language - is among the less-resourced languages. This paper reports on the outcomes of a project aimed at providing the essential resources for processing Sorani texts. The primary output of this project is Pewan, the first standard Test Collection to… (More)
In this paper we highlight the main challenges in building a lexical database for Kurdish, a resource-scarce and diverse language. We also report on our effort in building the first prototype of KurdNet – the Kurdish WordNet– along with a preliminary evaluation of its impact on Kur-dish information retrieval.
Recently, we reported on our efforts to build the first prototype of KurdNet. In this proposal, we highlight the shortcomings of the current prototype and put forward a detailed plan to transform this prototype to a full-fledged lexical database for the Kurdish language.