In this demonstration we present XTaGe (XML Tester and Generator), a flexible tool for the creation of complex XML collections. XTaGe focuses on XML collections with complex structural constraints and domain-specific characteristics, which would be very difficult or impossible to replicate using existing XML generators. It addresses the limitations of…
This paper presents a novel method for semantic annotation and search of a target corpus using several knowledge resources (KRs). This method relies on a formal statistical framework in which KR concepts and corpus documents are homogeneously represented using statistical language models. Under this framework, we can perform all the necessary operations for…
We introduce XTaGe (XML Tester and Generator), a system for the synthesis of XML collections meant for testing and micro-benchmarking applications. In contrast with existing approaches, XTaGe focuses on complex collections, providing a highly extensible framework to introduce controlled variability in XML structures. In this paper we present the…
Open metadata registries are a fundamental tool for researchers in the Life Sciences trying to locate resources such as web services or databases. While sophisticated standards have been produced for annotating these resources with rich, well-structured metadata, evidence shows that in open registries the majority of annotations simply consist of informal…
Research in the Life Sciences depends on the integration of large, distributed and heterogeneous data sources and web services. Discovering which of these resources are the most appropriate to solve a given task is a complex research question, since there are many plausible candidates and there is little, mostly unstructured, metadata to be…
Biomedical knowledge resources (KRs) are mainly expressed in English, and many applications using them suffer from the scarcity of knowledge in non-English languages. The goal of the present work is to make the most of existing multilingual biomedical KR lexicons to enrich their non-English counterparts. We propose to combine different automatic…