A Modular Approach to Learning Dutch Co-reference

Abstract

This paper presents the first machine learning approach to resolving co-referential relations between nominal constituents in Dutch. Starting from the hypothesis that different information sources contribute to the correct resolution of different types of co-referential links (pronominal, proper noun, and common noun), we propose a modular approach in which a separate module is trained per NP type. We present a thorough comparison of two machine learning techniques, a lazy learner and an eager learner, trained on the modular tasks as well as on the undecomposed task. In addition, we show that postprocessing the resulting co-reference chains with a string-edit distance correction mechanism avoids some unlikely local chainings and thereby improves precision. Because no comparative results exist for Dutch, we also report results on the English MUC-6 and MUC-7 data sets, which are widely used for evaluation.
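
The abstract does not spell out how the string-edit distance correction works, so the following is only a minimal sketch of the general idea, not the authors' mechanism. It assumes that a proposed link between two proper-noun mentions is dropped when their normalized Levenshtein distance exceeds a threshold. The function names, the threshold of 0.5, and the example mentions are all hypothetical.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def filter_links(links: List[Tuple[str, str]],
                 threshold: float = 0.5) -> List[Tuple[str, str]]:
    """Keep only links whose two mentions are similar enough.

    Assumption: the threshold and the case-insensitive comparison are
    illustrative choices, not taken from the paper.
    """
    return [
        (a, b) for a, b in links
        if normalized_distance(a.lower(), b.lower()) <= threshold
    ]


# Hypothetical proposed links between proper-noun mentions.
proposed = [("Antwerpen", "Antwerp"), ("Antwerpen", "Brussel")]
kept = filter_links(proposed)
```

A filter of this kind would only make sense for mention pairs where surface similarity is expected, such as proper nouns; pronominal links would need to bypass it.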

Cite this paper

@inproceedings{Hoste2006AMA,
  title={A Modular Approach to Learning Dutch Co-reference},
  author={V{\'e}ronique Hoste and Antal van den Bosch},
  year={2006}
}