PAISÀ is a Creative Commons licensed, large web corpus of contemporary Italian. We describe the design, harvesting, and processing steps involved in its creation.
The legal knowledge base resulting from the LOIS (Lexical Ontologies for legal Information Sharing) project consists of legal WordNets in six languages (Italian, Dutch, Portuguese, German, Czech, English). Its architecture is based on the EuroWordNet (EWN) framework (Vossen et al., 1997). Using the EWN framework ensures compatibility of the LOIS WordNets…
This work introduces SYMPAThy, a data representation model in which the combinatorial properties of a lexical item are described by merging surface and deeper linguistic information. The proposed approach is then evaluated by comparing, for a sample list of verbal idioms, a set of SYMPAThy-based fixedness indexes against the relevant speaker-elicited…
An established method for MWE extraction is the combined use of previously identified POS-patterns and association measures. However, the selection of such POS-patterns is rarely debated. Focusing on Italian MWEs containing at least one adjective, we set out to explore how candidate POS-patterns listed in relevant literature and lexicographic sources…
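The general technique named in this abstract, filtering candidates by POS-pattern and then ranking them with an association measure, can be illustrated with a minimal sketch. This is not the authors' pipeline: the toy corpus, the tag set, the two patterns, and the function names are all hypothetical, and pointwise mutual information (PMI) is used here only as one common association measure.

```python
import math
from collections import Counter

# Hypothetical toy corpus of (token, POS) pairs; sentences and tags are illustrative only.
TAGGED = [
    [("la", "DET"), ("carta", "NOUN"), ("bianca", "ADJ"), ("è", "VERB"), ("qui", "ADV")],
    [("una", "DET"), ("carta", "NOUN"), ("bianca", "ADJ"), ("e", "CONJ"),
     ("una", "DET"), ("casa", "NOUN"), ("grande", "ADJ")],
    [("la", "DET"), ("casa", "NOUN"), ("bianca", "ADJ")],
]

# Assumed POS-patterns: two-word sequences containing an adjective.
PATTERNS = {("NOUN", "ADJ"), ("ADJ", "NOUN")}


def extract_candidates(sentences, patterns):
    """Count unigrams, all adjacent bigrams, and the bigrams matching a POS-pattern."""
    unigrams, bigrams, candidates = Counter(), Counter(), Counter()
    for sent in sentences:
        for tok, _ in sent:
            unigrams[tok] += 1
        for (w1, p1), (w2, p2) in zip(sent, sent[1:]):
            bigrams[(w1, w2)] += 1
            if (p1, p2) in patterns:
                candidates[(w1, w2)] += 1
    return unigrams, bigrams, candidates


def pmi(pair, unigrams, bigrams):
    """Pointwise mutual information (base 2) of an adjacent word pair."""
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    p_xy = bigrams[pair] / n_bi
    p_x = unigrams[pair[0]] / n_uni
    p_y = unigrams[pair[1]] / n_uni
    return math.log2(p_xy / (p_x * p_y))


def rank_candidates(sentences, patterns):
    """Return pattern-matching bigrams sorted by descending PMI."""
    unigrams, bigrams, candidates = extract_candidates(sentences, patterns)
    scored = [(pair, pmi(pair, unigrams, bigrams)) for pair in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for pair, score in rank_candidates(TAGGED, PATTERNS):
        print(" ".join(pair), round(score, 2))
```

On a corpus this small PMI favours rare pairs, which is why real extraction work typically adds a frequency threshold or compares several association measures.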