The paper describes Obeliks, a new statistical tagger for Slovene developed within the "Communication in Slovene" project. The new tool consists of three modules: a rule-based sentence splitter and tokenizer, a morphosyntactic tagger, and a version of the LemmaGen lemmatizer which works in combination with the tagger. Obeliks is trained on the ssj500k ...
This paper addresses cross-lingual dependency parsing using rich morphosyntactic tagsets. In our case study, we experiment with three related Slavic languages: Croatian, Serbian and Slovene. Four different dependency treebanks are used for monolingual parsing, direct cross-lingual parsing, and a recently introduced cross-lingual parsing approach that ...
Within traditional discourse structure research, multi-word discourse markers have usually been explored as one of the possible structural realizations for expressing discourse relations in texts, and have accordingly been named alternative lexicalizations (Prasad et al., 2010), second-level discourse markers (Siepmann, 2005) or secondary connectives (Rysová ...