In this paper, we discuss the architecture and implementation of the Semantic Web Search Engine (SWSE). Following traditional search engine architecture, SWSE consists of crawling, data enhancing, indexing and a user interface for search, browsing and retrieval of information; unlike traditional search engines, SWSE operates over RDF Web data – loosely also ...
Hundreds of public SPARQL endpoints have been deployed on the Web, forming a novel decentralised infrastructure for querying billions of structured facts from a variety of sources on a plethora of topics. But is this infrastructure mature enough to support applications? For 427 public SPARQL endpoints registered on the DataHub, we conduct various ...
We present the architecture of an end-to-end semantic search engine that uses a graph data model to enable interactive query answering over structured and interlinked data collected from many disparate sources on the Web. In particular, we study distributed indexing methods for graph-structured data and parallel query evaluation methods on a cluster of ...
Over a decade after RDF was published as a W3C recommendation, publishing open and machine-readable content on the Web has recently received a lot more attention, including from corporate and governmental bodies; notably thanks to the Linked Open Data community, there now exists a rich vein of heterogeneous RDF data published on the Web (the so-called ...
In this paper, we present the design and first results of the Dynamic Linked Data Observatory: a long-term experiment to monitor the two-hop neighbourhood of a core set of eighty thousand diverse Linked Data documents on a weekly basis. We present the methodology used for sampling the URIs to monitor, retrieving the documents, and further crawling part of ...
Background: Several query federation engines have been proposed for accessing public Linked Open Data sources. However, in many domains, resources are sensitive and access to these resources is tightly controlled by stakeholders; consequently, privacy is a major concern when federating queries over such datasets. In the Healthcare and Life Sciences (HCLS) ...
There are hundreds of SPARQL endpoints on the Web, but finding an endpoint relevant to a client's needs is difficult: each endpoint acts like a black box, often without a description of its content. Herein we briefly describe Sportal: a system that collects metadata about the content of endpoints and gathers it into a central catalogue over which ...