RDFPeers: a scalable distributed RDF repository based on a structured peer-to-peer network

Abstract

Centralized Resource Description Framework (RDF) repositories are limited in both failure tolerance and scalability. Existing Peer-to-Peer (P2P) RDF repositories either cannot guarantee that query results will be found, even when those results exist in the network, or require up-front definition of RDF schemas and designation of super peers. We present a scalable distributed RDF repository (RDFPeers) that stores each triple at three places in a multi-attribute addressable network by applying globally known hash functions to its subject, predicate, and object. Thus all nodes know which node is responsible for storing the triple values they are looking for, and both exact-match and range queries can be efficiently routed to those nodes. RDFPeers has neither a single point of failure nor elevated peers, and it does not require the prior definition of RDF schemas. Queries are guaranteed to find matching triples in the network if the triples exist. In RDFPeers, both the number of neighbors per node and the number of routing hops for inserting RDF triples and for resolving most queries are logarithmic in the number of nodes in the network. Our experiments further show that, for real-world RDF data, the triple-storing load in RDFPeers differs by less than an order of magnitude between the most and the least loaded nodes.
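
The three-way placement described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a Chord-style identifier ring in which a key is owned by the first node whose identifier is greater than or equal to it, and the names `ID_BITS`, `key_of`, `successor`, and `ToyRepository` are hypothetical. It uses a single uniform hash (SHA-1), so it covers exact-match lookups only; range queries would need an order-preserving hash for numeric values, which is omitted here, and so is multi-hop routing.

```python
import hashlib
from bisect import bisect_left

ID_BITS = 16  # assumed size of the identifier space, chosen for the sketch


def key_of(value: str) -> int:
    """Map an attribute value to a ring identifier with a globally known hash."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << ID_BITS)


def successor(node_ids, key):
    """Return the first node id >= key, wrapping around the ring."""
    i = bisect_left(node_ids, key)
    return node_ids[i % len(node_ids)]


class ToyRepository:
    """In-memory stand-in for the network: one triple set per node."""

    def __init__(self, node_ids):
        self.node_ids = sorted(node_ids)
        self.store = {n: set() for n in self.node_ids}

    def insert(self, s, p, o):
        # Store the whole triple at the node responsible for each of its
        # three values (fewer than three nodes if two keys share an owner).
        triple = (s, p, o)
        for value in triple:
            self.store[successor(self.node_ids, key_of(value))].add(triple)

    def lookup(self, position, value):
        # Any peer can recompute the hash, so it knows which node to ask.
        node = successor(self.node_ids, key_of(value))
        return {t for t in self.store[node] if t[position] == value}


repo = ToyRepository([1000, 9000, 20000, 41000, 60000])
repo.insert("info:rdfpeers", "dc:creator", "info:mincai")
by_object = repo.lookup(2, "info:mincai")
```

Because the same triple is indexed under its subject, predicate, and object, a query that fixes any one of the three values can be sent directly to the responsible node, which is what allows the guarantee that existing triples are found.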

DOI: 10.1145/988672.988760



Semantic Scholar estimates that this publication has 353 citations based on the available data.


Cite this paper

@inproceedings{Cai2004RDFPeersAS,
  title={RDFPeers: a scalable distributed RDF repository based on a structured peer-to-peer network},
  author={Min Cai and Martin R. Frank},
  booktitle={WWW},
  year={2004}
}