Local Vector-based Models for Sense Discrimination

Abstract

Word sense discrimination aims to determine automatically which instances of an ambiguous word share the same sense. Schütze [10] proposed a fully unsupervised technique based on a high-dimensional vector representation of word senses. Although that model was assumed to be Gaussian, results were reported only for its K-means approximation. This work proposes a local vector-based model of reduced dimensionality that is linguistically coherent and can be computed for multivariate Gaussian mixtures. Several experiments on the New York Times News 1997 corpus show the advantages of unrestricted Gaussian models over K-means. The correct discrimination rate increases further with the regularized Gaussian models proposed in [2].
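
The relation between a Gaussian mixture and its K-means approximation can be sketched with the standard textbook formulation below. The notation is an assumption made for illustration and is not taken from the paper: x is a context vector, K the number of senses, and \pi_k, \mu_k, \Sigma_k the weight, mean and covariance of sense k.

```latex
% Gaussian mixture over context vectors x (illustrative notation, not the paper's)
p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k),
\qquad \pi_k \ge 0, \quad \sum_{k=1}^{K} \pi_k = 1

% Soft assignment (responsibility) of x to sense k
\gamma_k(x) = \frac{\pi_k \, \mathcal{N}(x \mid \mu_k, \Sigma_k)}
                   {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x \mid \mu_j, \Sigma_j)}

% K-means as a limiting case: shared spherical covariance \Sigma_k = \epsilon I,
% with \epsilon \to 0 and all \pi_k > 0, turns the soft assignment into a hard one
\lim_{\epsilon \to 0} \gamma_k(x) =
\begin{cases}
  1 & \text{if } k = \arg\min_{j} \lVert x - \mu_j \rVert^2 \\
  0 & \text{otherwise}
\end{cases}
```

Under this reading, an unrestricted model lets each \Sigma_k be a full covariance matrix, so each sense cluster can have its own shape and orientation, whereas K-means implicitly treats every cluster as spherical with the same spread.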

Cite this paper

@inproceedings{Marneffe2005LocalVM, title={Local Vector-based Models for Sense Discrimination}, author={M. de Marneffe and C{\'e}dric Archambeau and Pascal Dupont and Michel Verleysen}, year={2005} }