Lukasz Kobylinski

The article discusses methods for improving the application of balanced random forests (BRFs), a machine learning classification algorithm, to extracting definitions from written texts. These methods include different approaches to selecting attributes, optimising the classifier's prediction threshold for the definition extraction task, and initial…
Part-of-Speech (POS) tagging is a crucial task in Natural Language Processing (NLP). POS tags may be assigned to tokens in text manually, by trained linguists, or automatically, using algorithmic approaches. In particular, for annotated text corpora, the quantity of textual data makes it infeasible to rely on manual tagging, and automated methods are used…