Yves Peirsman

Learn More
This paper details the coreference resolution system submitted by Stanford at the CoNLL2011 shared task. Our system is a collection of deterministic coreference resolution models that incorporate lexical, syntactic, semantic, and discourse information. All these models use global document-level information by sharing mention attributes, such as gender and(More)
We propose a new deterministic approach to coreference resolution that combines the global information and precise features of modern machine-learning models with the transparency and modularity of deterministic, rule-based systems. Our sieve architecture applies a battery of deterministic coreference models one at a time from highest to lowest precision,(More)
Vector-based models of lexical semantics retrieve semantically related words automatically from large corpora by exploiting the property that words with a similar meaning tend to occur in similar contexts. Despite their increasing popularity, it is unclear which kind of semantic similarity they actually capture and for which kind of words. In this paper, we(More)
We describe a cross-lingual method for the induction of selectional preferences for resourcepoor languages, where no accurate monolingual models are available. The method uses bilingual vector spaces to “translate” foreign language predicate-argument structures into a resource-rich language like English. The only prerequisite for constructing the bilingual(More)
Metonymy recognition is generally approached with complex algorithms that rely heavily on the manual annotation of training and test data. This paper will relieve this complexity in two ways. First, it will show that the results of the current learning algorithms can be replicated by the ‘lazy’ algorithm of Memory-Based Learning. This approach simply stores(More)
Bilingual lexicons, essential to many NLP applications, can be constructed automatically on the basis of parallel or comparable corpora. In this article, we make two contributions to their induction from comparable corpora. The first one concerns the creation of these lexicons. We show that seed lexicons can be improved by adding a bootstrapping procedure(More)
Similarity is the key notion underlying many contemporary theories about the representation of meaning through words or concepts. However, these representations are strongly colored by the kind of information captured by various semantic measures. In this paper we present a systematic comparison of human similarity judgments and calculated similarity(More)
In the recognition of words that are typical of a specific language variety, the classic keyword approach performs rather poorly. We show how this keyword analysis can be complemented with a word space model constructed on the basis of two corpora: one representative of the language variety under investigation, and a reference corpus. This combined approach(More)