The need for lemmatization in inflectionally rich languages is indisputable: it applies to the whole range of procedures, from text search to parsing. Of the two predominant approaches to lemmatization, 1) algorithmic (generally rule-based and realized with FSA) and 2) relational (generally data-driven and realized with databases), this paper…
We present the current state of development of the Croatian Dependency Treebank, with special emphasis on adapting the Prague Dependency Treebank formalism to Croatian language specifics, and illustrate its possible applications in an experiment with dependency parsing using MaltParser. The treebank currently contains approximately 2870 sentences, out of…
The contribution gives a survey of procedures and formats used in building the Croatian-English parallel corpus, which is being collected in the Institute of Linguistics at the Philosophical Faculty, University of Zagreb. The primary text source is the newspaper Croatia Weekly, which has been published since the beginning of 1998 by HIKZ (Croatian Institute for…
Let π be a self-dual supercuspidal representation of GL(N, F) and ρ a supercuspidal representation of Sp(2k, F), with F a local nonarchimedean field of odd residual characteristic. Given a type, indeed an Sp(2N + 2k, F)-cover, for the inertial class [GL(N, F) × Sp(2k, F), π ⊗ ρ]_{Sp(2N + 2k, F)} satisfying suitable hypotheses, we produce a type, indeed an Sp(2tN +…
Word-level morphosyntactic descriptions, such as "Ncmsn" designating a common masculine singular noun in the nominative, have been developed for all Slavic languages, yet there have been few attempts to arrive at a proposal that would be harmonised across the languages. Standardisation adds to the interchange potential of the resources, making it easier…
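As a rough illustration of how a positional tag such as "Ncmsn" can be read, the sketch below decodes a noun tag one character at a time. The function name `decode_noun_msd` and the lookup table are hypothetical and cover only nouns with a handful of attribute values; they are an assumption made for illustration, not the harmonised specification the abstract refers to.

```python
# Illustrative sketch only: a tiny, assumed attribute table for noun tags.
# Position 0 is the category letter; each later position encodes one attribute.
NOUN_POSITIONS = [
    ("type", {"c": "common", "p": "proper"}),
    ("gender", {"m": "masculine", "f": "feminine", "n": "neuter"}),
    ("number", {"s": "singular", "p": "plural"}),
    ("case", {
        "n": "nominative", "g": "genitive", "d": "dative",
        "a": "accusative", "v": "vocative", "l": "locative",
        "i": "instrumental",
    }),
]


def decode_noun_msd(tag):
    """Expand a positional noun tag into a feature dictionary."""
    if not tag or tag[0] != "N":
        raise ValueError(f"not a noun tag: {tag!r}")
    features = {"category": "noun"}
    # Pair each remaining character with the attribute at that position.
    for (name, values), code in zip(NOUN_POSITIONS, tag[1:]):
        if code == "-":  # a hyphen marks an attribute that does not apply
            continue
        features[name] = values.get(code, f"unknown({code})")
    return features


if __name__ == "__main__":
    print(decode_noun_msd("Ncmsn"))
```

Keeping the attribute table as data rather than code means a table for another category or language could be swapped in without touching the decoding loop.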
This paper presents experiments for enlarging the Croatian Morphological Lexicon by applying an automatic acquisition methodology. The basic sources of information for the system are a set of morphological rules and a raw corpus. The morphological rules have been automatically derived from the existing Croatian Morphological Lexicon and we have used in our…