Jean Véronis

The automatic disambiguation of word senses has been an interest and concern since the earliest days of computer treatment of language in the 1950s. Sense disambiguation is an "intermediate task" (Wilks and Stevenson 1996), which is not an end in itself, but rather is necessary at one level or another to accomplish most natural language processing tasks. …
This article describes an algorithm called HyperLex that is capable of automatically determining word uses in a textbase without recourse to a dictionary. The algorithm makes use of the specific properties of word cooccurrence graphs, which are shown as having "small world" properties. Unlike earlier dictionary-free methods based on word vectors, it can …
MULTEXT (Multilingual Text Tools and Corpora) is the largest project funded in the Commission of European Communities Linguistic Research and Engineering Program. The project will contribute to the development of generally usable software tools to manipulate and analyse text corpora and to create multi-lingual text corpora with structural and linguistic …
This paper describes two experiments on polysemy judgement and sense annotation. The first experiment enabled us to select the most polysemous words which were used in the second experiment, and which serve as test words for the evaluation of WSD systems. We show that this selection method yields results different from selecting words on the basis of their …
In this paper, we describe a means for automatically building very large neural networks (VLNNs) from definition texts in machine-readable dictionaries, and demonstrate the use of these networks for word sense disambiguation. Our method brings together two earlier, independent approaches to word sense disambiguation: the use of machine-readable dictionaries …
We present a prosodic corpus in five languages (French, English, Italian, German and Spanish) comprising 4 hours and 20 minutes of speech and involving 50 different speakers (5 male and 5 female per language). The recordings on which the corpus is based are extracted from the EUROM 1 speech database and consist of passages of about five sentences. The …
Machine-readable versions of everyday dictionaries have been seen as a likely source of information for use in natural language processing because they contain an enormous amount of lexical and semantic knowledge. However, after 15 years of research, the results appear to be disappointing. No comprehensive evaluation of machine-readable dictionaries (MRDs) …