This short paper presents a method for automatically extracting and evaluating multiword expressions (MWEs) in the Europarl corpus. For this purpose, we use mwetoolkit and draw on its output to find rules for the automatic evaluation of MWEs. We then developed an XML parser to evaluate MWE candidates against those rules and against online dictionaries. A sample…
In this paper, we present a tool for the automatic extraction of subcategorization frames from Portuguese corpora. Subcategorization frames are important to many Natural Language Processing (NLP) tasks, such as the improvement of parsing results. The tool presented here, which is based on a system developed for French, fills a gap in Portuguese, …
We introduce a new multilingual resource containing judgments about nominal compound compositionality in English, French and Portuguese. It covers 3 × 180 noun-noun and adjective-noun compounds for which we provide numerical compositionality scores for the head word, for the modifier and for the compound as a whole, along with possible paraphrases. This…
Semantic role labeling offers vital information for both Linguistics and Natural Language Processing tasks. In this article, we present a lexical resource for Portuguese annotated with semantic roles: VerbLexPor. The resource is a database with verbs and sentences extracted from both a domain-specific corpus and a non-specialized generic one. Annotation was…