MBT: A Memory-Based Part of Speech Tagger-Generator
A large-scale application of the memory-based approach to part of speech tagging is shown to be feasible, obtaining a tagging accuracy that is on a par with that of known statistical approaches, and with attractive space and time complexity properties when using IGTree, a tree-based formalism for indexing and searching huge case bases.
Improving Accuracy in Word Class Tagging through the Combination of Machine Learning Systems
We examine how differences in language models, learned by different data-driven systems performing the same NLP task, can be exploited to yield a higher accuracy than the best individual system.
Forgetting Exceptions is Harmful in Language Learning
It is shown that in language learning, contrary to received wisdom, keeping exceptional training instances in memory can be beneficial for generalization accuracy, and that decision-tree learning often performs worse than memory-based learning.
Improving Data Driven Wordclass Tagging by System Combination
Examines how the differences in modelling between different data-driven systems performing the same NLP task can be exploited to yield a higher accuracy than the best individual system.
Morphosyntactic Tagging of Slovene: Evaluating Taggers and Tagsets
An evaluation of tagging techniques on a corpus of Slovene, which has a large number of possible word-class tags and only a small hand-tagged dataset, shows that PoS accuracy is quite high while accuracy on Case is lowest, and that tagset reduction helps improve accuracy, but less than might be expected.
Resolving PP attachment Ambiguities with Memory-Based Learning
The application of Memory-Based Learning to the problem of Prepositional Phrase attachment disambiguation is described. The method compares favorably to previous methods and is well suited to incorporating various unconventional representations of word patterns, such as value difference metrics and Lexical Space.