Ruslan Mitkov

Most traditional approaches to anaphora resolution rely heavily on linguistic and domain knowledge. One of the disadvantages of developing a knowledge-based system, however, is that it is a very labour-intensive and time-consuming task. This paper presents a robust, knowledge-poor approach to resolving pronouns in technical manuals, which operates on texts(More)
This paper describes a novel computer-aided procedure for generating multiple-choice test items from electronic documents. In addition to employing various Natural Language Processing techniques, including shallow parsing, automatic term extraction, sentence transformation and computing of semantic distance, the system makes use of language resources such(More)
Summary form only given. The paper describes a novel automatic procedure for the generation of multiple-choice tests from electronic documents. In addition to employing various NLP techniques including term extraction and shallow parsing, the system makes use of language resources such as corpora and ontologies. The system operates in a fully automatic mode(More)
The etymology of the term "anaphora" goes back to Ancient Greek, with "anaphora" (αναφορα) being a compound word consisting of the separate words ανα (back, upstream, back in an upward direction) and φορα (the act of carrying), and denoting the act of carrying back upstream. For Computational Linguists embarking upon research in the field of anaphor resolution,(More)
This paper describes a new, advanced and completely revamped version of Mitkov's knowledge-poor approach to pronoun resolution [21]. In contrast to most anaphora resolution approaches, the new system, referred to as MARS, operates in fully automatic mode. It benefits from purpose-built programs for identifying occurrences of non-nominal anaphora (including(More)
This paper presents a machine learning approach to the study of translationese. The goal is to train a computer system to distinguish between translated and non-translated text, in order to determine the characteristic features that influence the classifiers. Several algorithms reach up to 97.62% success rate on a technical dataset. Moreover, the SVM(More)
The paper discusses the significance of factors in anaphora resolution and on the basis of a comparative study argues that what matters is not only a good set of reliable factors but also the strategy for their application. The objective of the study was to find out how well the same set of factors worked within two different computational strategies. To(More)
Statistical methods to extract translational equivalents from non-parallel corpora hold the promise of ensuring the required coverage and domain customisation of lexicons as well as accelerating their compilation and maintenance. Rare, less common words and expressions, which often have low corpus frequencies, are a challenge for these methods. However, it(More)
The paper summarises the work of the Research Group in Computational Linguistics at the University of Wolverhampton towards the production of much-needed annotated resources for the evaluation and training of anaphora resolution systems. In particular, it describes the annotating tools developed to support the annotation, the corpora annotated and the(More)