The availability of annotated data (with annotation as rich and “deep” as possible) is desirable for any new development. Textual data are used in the so-called training phase of various empirical methods that solve various problems in the field of computational linguistics. While there are many methods that use texts in their plain (or raw) form (in most…
purposes, it has been tagged by our tagger; errors are printed underlined and corrections are shown.} Hlavním/AAIS7 .... IA-problémem/NNIS7 ..... A--
The contents of the Prague Dependency Treebank (recently released by the Linguistic Data Consortium in its version 1.0) are described, from morphology to surface syntax to the deep (underlying) syntax layers of annotation. For each layer, the basic assumptions are given, followed by a more detailed description of the annotation scheme. Annotation software…
We present results of probabilistic tagging of Czech texts in order to show how these techniques work for one of the highly morphologically ambiguous inflective languages. After a description of the tag system used, we show the results of four experiments using a simple probabilistic model to tag Czech texts (unigram, two bigram experiments, and a trigram…
We propose the PlayCoref game, whose purpose is to obtain a substantial amount of text data with coreference annotation. We provide a description of the game design that covers the strategy, the instructions for the players, the selection and preparation of input texts, and the score evaluation.
PlayCoref is a concept of an on-line language game designed to acquire a substantial amount of text data with the coreference annotation. We describe in detail various aspects of the game design and discuss features that affect the quality of the annotation.
The RExtractor system is an information extractor that processes input documents with natural language processing tools and then queries the parsed sentences to extract a knowledge base of entities and their relations. The extraction queries are designed manually using a tool that enables a natural graphical representation of queries over dependency…