The paper describes a corpus of texts produced by non-native speakers of Czech. We discuss its annotation scheme, consisting of three interlinked tiers, designed to handle a wide range of error types present in the input. Each tier corrects different types of errors; links between the tiers allow capturing errors in word order and complex discontinuous…
Using an error-annotated learner corpus as the basis, this paper has two goals: (i) to evaluate the practicality of the annotation scheme by computing inter-annotator agreement on a non-trivial sample of data, and (ii) to find out whether the application of automated linguistic annotation tools (taggers, spell checkers and grammar checkers) on the…
We present an approach to building a learner corpus of Czech, manually corrected and annotated with error tags using a complex grammar-based taxonomy of errors in spelling, morphology, morphosyntax, lexicon and style. This grammar-based annotation is supplemented by a formal classification of errors based on surface alternations. To supply additional…
We present Korektor – a flexible and powerful purely statistical text correction tool for Czech that goes beyond a traditional spell checker. We use a combination of several language models and an error model to offer the best ordering of correction proposals and also to find errors that cannot be detected by simple spell checkers, namely spelling errors…
Dekomprese v popisu jazyka aneb hlubiny i mělčiny deklarativně ("Decompression in language description, or the depths and shallows, declaratively"). Alexandr Rosen, Institute of Theoretical and Computational Linguistics, Charles University in Prague. "A constraint-based approach to dependency syntax applied to some issues of Czech word order"; "Deklarativní formalizace teorie…" ("Declarative formalization of the theory…")
The authors collect lexical data for a module of English syntactic analysis in the context of a bilingual research project. The computer usable version of OALD (Hornby, 1974) is used as the… syntactic information. The lexicon and grammars, enriched by feedback from the parsed texts, can later be used within the machine translation system proper. At…