MULTEXT-East free lexicons 4.0
This submission contains the freely available MULTEXT-East lexicons, while a separate submission (http://hdl.handle.net/11356/1042) gives those that are available only for non-commercial use.
The VoiceTRAN Speech Translation Demonstrator
This paper describes the design phases of the VoiceTRAN Communicator, which integrates speech recognition, machine translation, and text-to-speech synthesis using the Galaxy architecture. The aim of...
Morphological lexicon Sloleks 1.2
Sloleks is the reference morphological lexicon for the Slovenian language, developed to be used in NLP applications and language manuals. Encoded in LMF XML, the lexicon contains approx. 100,000 most...
Corpus of comma placement Vejica 1.0
A collection of sentences demonstrating and correcting comma usage. The sentences come from four sources:
- KUST: a Slovene learner corpus, http://nl.ijs.si/isjt06/proc/26_Stritar.pdf
- Solar: a...
The System For Co-Reference Resolution For Slovenian Texts Analysis and Possibilities of its Use
Co-reference resolution was used in the question answering system Crammer, which can, as a result, answer more questions than before because it can replace personal pronouns.
Automatic generation of textual logic puzzles in Slovenian
Creating textual logic puzzles requires quite a lot of work.
Collection of comma usage examples Vejica 1.3 (Zbirka primerov rabe vejice Vejica 1.3)
With this year's conference we are celebrating the 20th anniversary of the first »Language technologies« conference, which took place in 1998 in Cankarjev dom, Ljubljana, and was organized by Tomaž...
Morphological lexicon Sloleks 2.0
Sloleks is the reference morphological lexicon for the Slovenian language, developed to be used in NLP applications and language manuals.
Written corpus ccKres 1.0
Corpus ccKres consists of 9,376 documents, each containing information about the source (e.g. newspapers, magazines), year of publication, text type (fiction, newspaper), the title and author if they are known.
Written corpus ccGigafida 1.0
Corpus ccGigafida consists of paragraph samples from 31,722 documents, each containing information about the source (e.g. newspapers, magazines), year of publication, text type (fiction, newspaper), the title and author if they are known.