SW-Store: a vertically partitioned DBMS for Semantic Web data management

@article{Abadi2008SWStoreAV,
  title={SW-Store: a vertically partitioned DBMS for Semantic Web data management},
  author={Daniel J. Abadi and Adam Marcus and Samuel Madden and Katherine J. Hollenbach},
  journal={The VLDB Journal},
  year={2008},
  volume={18},
  pages={385--406}
}
Efficient management of RDF data is an important prerequisite for realizing the Semantic Web vision. Performance and scalability issues are becoming increasingly pressing as Semantic Web technology is applied to real-world applications. In this paper, we examine the reasons why current data management solutions for RDF data scale poorly, and explore the fundamental scalability limitations of these approaches. We review the state of the art for improving performance of RDF databases and consider… 
SPARTI: Scalable RDF Data Management Using Query-Centric Semantic Partitioning
TLDR
This paper investigates SPARTI, a scalable RDF data management system that combines a budgeting mechanism with a cost model to determine the worthiness of partitioning; it is shown to execute queries in around half the time across all query shapes while maintaining around an order of magnitude improvement in storage requirements.
Reverse Partitioning for SPARQL Queries: Principles and Performance Analysis
TLDR
A new partitioning technique dedicated to graph-based triple stores that is complementary to traditional ones is presented and the best classes of queries for which reverse partitioning gives better performance are discussed.
RDF-4X: a scalable solution for RDF quads store in the cloud
TLDR
This paper proposes a scalable solution for RDF data management that uses Apache Accumulo, introducing storage methods and indexing techniques that scale to billions of quads across multiple nodes, while providing fast and easy access to the data through conventional query mechanisms such as SPARQL.
Compressed vertical partitioning for efficient RDF management
TLDR
This article introduces a novel RDF indexing technique that supports efficient SPARQL solution in compressed space and enhances this model with two compact indexes listing the predicates related to each different subject and object in the dataset, in order to address the specific weaknesses of vertically partitioned representations.
Executing queries over schemaless RDF databases
TLDR
A schemaless physical representation is proposed that enables an RDF dataset to be clustered based purely on the workload, which is key to achieving good performance through optimized I/O and cache utilization; a new query evaluation model is designed that leverages this workload-aware clustering of the database.
Compressed Vertical Partitioning for Efficient RDF Management 1
TLDR
A novel RDF indexing technique that supports efficient SPARQL solution in compressed space and that not only achieves by far the most compressed representations, but also achieves the best overall performance for RDF retrieval in the authors' experimental setup.
SemStore: A Semantic-Preserving Distributed RDF Triple Store
TLDR
This paper addresses the challenging problems of data partitioning and query optimization in a scale-out RDF engine by proposing a radically different approach, where a coarse-grained structure, namely Rooted Sub-Graph (RSG), is used as the partition unit.
Evaluation of RDF queries via equivalence
TLDR
This paper proposes an alternative open user schema that can accommodate schema updates without possible long-chain joins, and implements and provides empirical evaluations to demonstrate both the efficiency and effectiveness of the approach in evaluating complex RDF queries.
String-Based Semantic Web Data Management Using Ternary B-Trees
TLDR
This work proposes the ternary B-tree as a new string-based data structure for storing and accessing RDF that makes use of the intrinsic features of RDF.
A data distribution model for RDF
TLDR
This paper presents an RDF data distribution method that overcomes the shortcomings of current approaches in order to scale RDF storage in both data volume and query processing, and that is effective in improving overall performance by decreasing the amount of message passing among servers.

References

An Experimental Comparison of RDF Data Management Approaches in a SPARQL Benchmark Scenario
TLDR
An experimental comparison of existing storage strategies on top of the SP2Bench SPARQL performance benchmark suite is presented and it is concluded that future research is necessary to further bring forward RDF data management.
Hexastore: sextuple indexing for semantic web data management
TLDR
This paper proposes an RDF storage scheme that uses the triple nature of RDF as an asset, which confers significant advantages compared to previous approaches for RDF data management, at the price of a worst-case five-fold increase in index space.
An Efficient SQL-based RDF Querying Scheme
TLDR
An experimental study is presented that characterizes the overhead eliminated by avoiding procedural code at runtime, characterizes performance under various input conditions, and demonstrates scalability using 80 million RDF triples from UniProt protein and annotation data.
Storing RDF as a graph
  • Valerie Bönström, A. Hinze, H. Schweppe
  • Computer Science
    Proceedings of the IEEE/LEOS 3rd International Conference on Numerical Simulation of Semiconductor Optoelectronic Devices (IEEE Cat. No.03EX726)
  • 2003
TLDR
This work presents a new approach to storing RDF data as a graph in an object-oriented database, which avoids the costly rebuilding of the graph and efficiently queries the storage structure directly.
Efficient RDF Storage and Retrieval in Jena2
TLDR
This paper describes the persistence subsystem of Jena2, which is intended to support large datasets, and identifies query optimization for RDF as a promising area for future research.
The ICS-FORTH RDFSuite: Managing Voluminous RDF Description Bases
TLDR
This paper advocates the use of database technology to support declarative access, as well as logical and physical independence, for voluminous RDF description bases, and presents RDFSuite, a suite of tools for RDF validation, storage and querying.
Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema
TLDR
This work presents an overview of Sesame, an architecture for efficient storage and expressive querying of large quantities of metadata in RDF and RDF Schema, and its implementation and the first experiences with this implementation.
Jena Property Table Implementation
TLDR
This paper describes a property table design and implementation for Jena, an RDF Semantic Web toolkit; a design goal is to make Jena property tables look like normal relational database tables.
An Effective SPARQL Support over Relational Databases
TLDR
An effective approach to support SPARQL queries over relational databases is proposed, with the above challenges in mind, and a novel facet-based scheme is designed to handle filter expressions.
Relational Databases for Querying XML Documents: Limitations and Opportunities
TLDR
It turns out that the relational approach can handle most (but not all) of the semantics of semi-structured queries over XML data, but is likely to be effective only in some cases.