José Camacho-Collados

The semantic representation of individual word senses and concepts is of fundamental importance to several applications in Natural Language Processing. To date, concept modeling techniques have in the main based their representation either on lexicographic resources, such as WordNet, or on encyclopedic resources, such as Wikipedia. We propose a vector…
Semantic representation lies at the core of several applications in Natural Language Processing. However, most existing semantic representation techniques cannot be used effectively for the representation of individual word senses. We put forward a novel multilingual concept representation, called MUFFIN, which not only enables accurate representation of…
Lexical taxonomies are graph-like hierarchical structures that provide a formal representation of knowledge. Most knowledge graphs to date rely on is-a (hypernymic) relations as the backbone of their semantic structure. In this paper, we propose a supervised distributional framework for hypernym discovery which operates at the sense level, enabling…
We present a new framework for an intrinsic evaluation of word vector representations based on the outlier detection task. This task is intended to test the capability of vector space models to create semantic clusters in the space. We carried out a pilot study building a gold standard dataset and the results revealed two important features: human…
Despite being one of the most popular tasks in lexical semantics, word similarity has often been limited to the English language. Other languages, even those that are widely spoken such as Spanish, do not have a reliable word similarity evaluation framework. We put forward robust methodologies for the extension of existing English datasets to other…
Following L'Homme (2004), this paper focuses on term variation in French full text. More precisely, it highlights the semantic ambiguity of term occurrences with regard to a very high-level distinction between terminological and general uses. This issue is especially prevalent in the Humanities. For instance, we are interested in distinguishing…
Semantic annotation and terminological validation in full text in the humanities and social sciences. Abstract. Our work focuses on validating occurrences of candidate terms in context. The occurrence contexts come from scientific articles in the language sciences drawn from the SCIENTEXT corpus. The candidate terms are identified by the automatic extractor…
In this paper we present BabelDomains, a unified resource which provides lexical items with information about domains of knowledge. We propose an automatic method that uses knowledge from various lexical resources, exploiting both distributional and graph-based clues, to accurately propagate domain information. We evaluate our methodology intrinsically on…
Linking concepts and named entities to knowledge bases has become a crucial Natural Language Understanding task. In this respect, recent works have shown the key advantage of exploiting textual definitions in various Natural Language Processing applications. However, to date there are no reliable large-scale corpora of sense-annotated textual definitions…