David S. Batista

Background: Geo-Net-PT is a geospatial ontology representing the Portuguese territory and the relations between the several locations within it. Yahoo! GeoPlanet is a geospatial ontology that covers the whole world. To interlink the two ontologies and reduce the effects of repeated information, we propose an automatic alignment between their administrative…
GikiCLEF focused on evaluating the reasoning capabilities of systems in providing correct answers to geographically challenging topics. As we had no previous experience in question answering, we participated in GikiCLEF with the goal of understanding best practices for extracting answers from documents through hands-on experience. We developed a…
Geographic Information Retrieval (GIR) systems rely on the identification and disambiguation of place names in documents to determine the region to which each document is relevant. The place names are mapped to geographic concepts and used to assign an encompassing concept (a scope) to each document. However, sometimes a single scope is too restrictive and…
Semi-supervised bootstrapping techniques for relationship extraction from text iteratively expand a set of initial seed relationships while limiting semantic drift. We investigate bootstrapping for relationship extraction using word embeddings to find similar relationships. Experimental results show that relying on word embeddings achieves a better…
The identification of semantic relations expressed between entities mentioned in text is an important step towards the automatic extraction of knowledge from large document collections, such as the Web. Several previous works have addressed this task for the English language, using supervised machine learning techniques…
We adopted a simple, naive approach based solely on queries over an index of the Knowledge Base text. The queries consist of information gathered from the query string and the support document. This paper describes the prototype used to submit the runs. It is an early version of the prototype we have envisioned, which is currently a work in progress.