Structured Generative Models for Unsupervised Named-Entity Clustering

Abstract

We describe a generative model for clustering named entities that also models named-entity internal structure, clustering related words by role. The model is entirely unsupervised; it uses features from the named entity itself and its syntactic context, together with coreference information from an unsupervised pronoun resolver. The model scores 86% on the MUC-7 named-entity dataset. To our knowledge, this is the best reported score for a fully unsupervised model, and the best score for a generative model.

Cite this paper

@inproceedings{Elsner2009StructuredGM,
  title={Structured Generative Models for Unsupervised Named-Entity Clustering},
  author={Micha Elsner and Eugene Charniak and Mark Johnson},
  booktitle={HLT-NAACL},
  year={2009}
}