This short paper presents a method for automatically extracting and evaluating multiword expressions (MWEs) in the Europarl corpus. We use mwetoolkit and derive rules for the automatic evaluation of MWEs from its output. We then developed an XML parser to evaluate MWE candidates against those rules and against online dictionaries. A sample…
In this paper we present a tool for the automatic extraction of subcategorization frames from Portuguese corpora. Subcategorization frames are important to many Natural Language Processing (NLP) tasks, such as the improvement of parsing results. The tool presented here, which is based on a system developed for French, fills a gap in Portuguese,…
We introduce a new multilingual resource containing judgments about nominal compound compositionality in English, French and Portuguese. It covers 3 × 180 noun-noun and adjective-noun compounds for which we provide numerical compositionality scores for the head word, for the modifier and for the compound as a whole, along with possible paraphrases. This…
This paper presents a methodology for the semi-automatic validation of a wide-coverage ontology based on an existing electronic resource, PAPEL. From the existing relations, we choose those of synonymy and hypernymy to generate the ontology. The resulting output was converted to OWL format and manually validated by a lexicographer. As a result, we have…
Automatic lexical alignment is a vital step for empirical machine translation. Although good results can be obtained with existing models (e.g. GIZA++), more precise alignment is still needed to handle complex constructions such as multiword expressions successfully. In this paper we propose an approach for lexical alignment combining statistical and…