Balázs Kis

This paper introduces a new approach to morpho-syntactic analysis through Humor 99 (High-speed Unification Morphology), a reversible, unification-based morphological analyzer that has already been integrated into a variety of industrial applications. Humor 99 successfully copes with problems of agglutinative (e.g. Hungarian, Turkish, Estonian) and …
We apply statistical methods to automatically extract Hungarian collocations from corpora. Due to the complexity of Hungarian morphology, a complex resource preparation tool chain has been developed. This tool chain implements a reusable and, in principle, language-independent framework. In the first part, the paper describes the tool chain …
Texts acquired from recognition sources (continuous speech/handwriting recognition and OCR) generally have three types of errors, regardless of the characteristics of the particular source. The output of the recognition process may (1) be poorly segmented or not segmented at all; (2) contain underspecified symbols (where the recognition process can only …
This paper describes an experiment on extracting Hungarian multi-word lexemes from a corpus using statistical methods. Corpus preparation (the addition of POS tags and stems) was done automatically. From the corpus, verb+noun+casemark patterns were extracted as collocation candidates. Evaluation shows that the statistical methods used by Villada Moirón …