Vera Lúcia Strube de Lima

This paper explores different strategies for extracting similarity relations between words from parsed text corpora. The strategies we have analysed require neither supervised training nor semantic information from general lexical resources. They differ in the amount and quality of the syntactic contexts used to compare words. The paper ...
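As a rough illustration of the idea in the abstract above, the sketch below compares words by the syntactic contexts they share. The excerpt does not say which similarity measure or context definition the paper uses, so the Jaccard coefficient, the `(head, relation, dependent)` input format, and the names `build_contexts` and `jaccard` are all assumptions made for this example.

```python
from collections import defaultdict


def build_contexts(dependencies):
    """Map each word to the set of syntactic contexts it occurs in.

    `dependencies` is an iterable of (head, relation, dependent) triples,
    a hypothetical stand-in for the output of a parser.
    """
    contexts = defaultdict(set)
    for head, relation, dependent in dependencies:
        # Record the context from both sides of the dependency.
        contexts[head].add((relation, "head-of", dependent))
        contexts[dependent].add((relation, "dep-of", head))
    return contexts


def jaccard(a, b):
    """Jaccard coefficient of two context sets (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Toy parsed corpus, invented for illustration only.
deps = [
    ("drink", "obj", "coffee"),
    ("drink", "obj", "tea"),
    ("pour", "obj", "coffee"),
    ("pour", "obj", "tea"),
    ("drive", "obj", "car"),
]
contexts = build_contexts(deps)
sim_coffee_tea = jaccard(contexts["coffee"], contexts["tea"])
sim_coffee_car = jaccard(contexts["coffee"], contexts["car"])
```

In this toy data, "coffee" and "tea" occur in identical contexts while "coffee" and "car" share none. Restricting or enriching the context tuples is one way such strategies could vary the amount and quality of the contexts compared.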
In this paper I briefly present Linguateca, a ten-year-old infrastructure project for Portuguese, and show how it offers several ways to study grammatical and semantic differences between varieties of the language. After a short history of Portuguese corpus linguistics, presenting the main projects in the area, I discuss in some ...
based query expansion method for information retrieval. The query expansion process assigns weights to different types of relations obtained from vocabulary structures, providing an efficient way to measure distances between different terms. This method was applied to a Portuguese juridical corpus and evaluated over the top-27 queries used in the web site ...
Open Information Extraction (Open IE) is an unsupervised strategy to draw out relations from text without predefining these relations, regardless of the domain. This paper describes a novel Open IE approach that performs unsupervised extraction of triples by applying a few lexical-syntactic patterns to POS-tagged texts. To validate this strategy, we ...
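To make the notion of applying a pattern to POS-tagged text concrete, here is a minimal sketch. The excerpt does not list the paper's actual patterns, so the single pattern used here (noun phrase, verb with an optional preposition, noun phrase), the coarse tag names, and the function `extract_triples` are assumptions chosen only for illustration.

```python
import re


def extract_triples(tagged):
    """Extract (arg1, relation, arg2) triples from (token, tag) pairs.

    Uses one hypothetical pattern: NOUN+ VERB (ADP)? NOUN+.
    """
    # Encode each tag as one character so a regex can match tag sequences.
    codes = {"NOUN": "N", "PROPN": "N", "VERB": "V", "ADP": "P"}
    tag_string = "".join(codes.get(tag, "O") for _, tag in tagged)

    triples = []
    for m in re.finditer(r"(N+)(VP?)(N+)", tag_string):
        # Character offsets in tag_string are token indices in `tagged`.
        arg1, rel, arg2 = (
            " ".join(tok for tok, _ in tagged[m.start(g):m.end(g)])
            for g in (1, 2, 3)
        )
        triples.append((arg1, rel, arg2))
    return triples


# Toy POS-tagged sentence, invented for illustration only.
sentence = [
    ("Maria", "PROPN"),
    ("works", "VERB"),
    ("in", "ADP"),
    ("Lisbon", "PROPN"),
    (".", "PUNCT"),
]
triples = extract_triples(sentence)
```

For the toy sentence this yields the single triple ("Maria", "works in", "Lisbon"). Because the pattern looks only at tags, it needs no predefined relation inventory, which is the property the abstract emphasises.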
This paper describes a proposal for Portuguese possessive pronominal anaphor (PPA) resolution, a problem that has received little attention so far. In particular, we address the problem of Portuguese 3rd person intrasentential PPAs seu/sua/seus/suas (his/her/their/its, for human and non-human subjects in English), which constitute 30% of pronominal occurrences in our corpus ...
Open Information Extraction (Open IE) aims to obtain relations from text that are domain-independent and not predefined. This article introduces the Open IE research field, thoroughly discussing the main ideas and systems in the area as well as its main challenges and open issues. The paper describes an open extractor built on the belief that it is not necessary ...
Open Information Extraction (Open IE) is a strategy for learning relations from texts, regardless of the domain and without predefining these relations. Work in this area has focused mainly on verbal relations. To extend Open IE to relationships that are not expressed by verbs, we present a novel Open IE approach that extracts relations ...
Biomedical Named Entities (NEs) are phrases or combinations of phrases that denote specific objects or groups of objects in the biomedical literature. Research on Named Entity Recognition (NER) is one of the most widespread activities in the automatic processing of biomedical scientific articles. We analyzed articles relevant to NER in biomedical texts, ...