An Experimental Comparison of RDF Data Management Approaches in a SPARQL Benchmark Scenario
@inproceedings{Schmidt2008AnEC,
  title={An Experimental Comparison of RDF Data Management Approaches in a SPARQL Benchmark Scenario},
  author={Michael Schmidt and Thomas Hornung and Norbert K{\"u}chlin and Georg Lausen and Christoph Pinkel},
  booktitle={SEMWEB},
  year={2008}
}
Efficient RDF data management is one of the cornerstones in realizing the Semantic Web vision. In the past, different RDF storage strategies have been proposed, ranging from simple triple stores to more advanced techniques like clustering or vertical partitioning on the predicates. We present an experimental comparison of existing storage strategies on top of the SP2Bench SPARQL performance benchmark suite and put the results into context by comparing them to a purely relational model of the…
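As a rough illustration of two of the storage strategies named in the abstract, the sketch below contrasts a single triple table with vertical partitioning on the predicates. It is a minimal in-memory sketch, not the setup used in the paper: the sample triples, the function names, and the example join are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("article1", "rdf:type", "bench:Article"),
    ("article1", "dc:creator", "person1"),
    ("article1", "dc:title", "Title A"),
    ("article2", "rdf:type", "bench:Article"),
    ("article2", "dc:creator", "person2"),
    ("person1", "foaf:name", "Alice"),
    ("person2", "foaf:name", "Bob"),
]


def vertically_partition(triples):
    """Group triples into one (subject, object) table per predicate."""
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    return dict(tables)


def creator_names_triple_store(triples):
    """Join dc:creator with foaf:name by scanning the single triple table."""
    names = {s: o for s, p, o in triples if p == "foaf:name"}
    return sorted(
        (s, names[o]) for s, p, o in triples
        if p == "dc:creator" and o in names
    )


def creator_names_partitioned(tables):
    """Same join, touching only the two predicate tables involved."""
    names = dict(tables.get("foaf:name", []))
    return sorted(
        (s, names[o]) for s, o in tables.get("dc:creator", []) if o in names
    )
```

Both functions return the same pairs of article and creator name. The difference is in how much data each one has to look at: the triple-table version filters every triple by predicate, while the partitioned version reads only the `dc:creator` and `foaf:name` tables.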
94 Citations
SW-Store: a vertically partitioned DBMS for Semantic Web data management
- Computer Science
- The VLDB Journal
- 2008
The results show that a vertically partitioned schema achieves similar performance to the property table technique while being much simpler to design. The paper also describes the architecture of SW-Store, a new DBMS that implements these techniques to achieve high-performance RDF data management.
Efficient querying of multidimensional RDF data with aggregates: Comparing NoSQL, RDF and relational data stores
- Computer Science
- Int. J. Inf. Manag.
- 2020
Benchmarking Spark-SQL under Alliterative RDF Relational Storage Backends
- Computer Science
- QuWeDa@ISWC
- 2019
A systematic comparison of three relevant RDF relational schemas queried using Apache Spark yields many insights into the impact of the relational encoding scheme, storage backends and storage formats on the performance of the query execution process.
An In-depth Investigation of Large-scale RDF Relational Schema Optimizations Using Spark-SQL
- Computer Science
- DOLAP
- 2021
Relational schema optimization, one of the most significant challenges of large-scale RDF data processing over Apache Spark, is discussed, and insights into these schemas’ relative strengths are provided by comparing three different partitioning techniques and four storage formats.
Compressed vertical partitioning for efficient RDF management
- Computer Science
- Knowledge and Information Systems
- 2014
This article introduces a novel RDF indexing technique that supports efficient SPARQL resolution in compressed space. To address the specific weaknesses of vertically partitioned representations, it enhances this model with two compact indexes listing the predicates related to each different subject and object in the dataset.
Towards making sense of Spark-SQL performance for processing vast distributed RDF datasets
- Computer Science
- SBD@SIGMOD
- 2020
A systematic evaluation of the performance of the SparkSQL engine for processing SPARQL queries is presented, using three relevant RDF relational schemas and two different storage backends, namely Hive and HDFS.
FlexTable: Using a Dynamic Relation Model to Store RDF Data
- Computer Science
- DASFAA
- 2010
This paper proposes a system called FlexTable, in which all triples of an instance are coalesced into one tuple and all tuples are stored in relation schemas. A lattice structure is used to evolve the schemas automatically as new triples are inserted.
NoSQL Databases for RDF: An Empirical Evaluation
- Computer Science
- SEMWEB
- 2013
This work is the first systematic attempt at characterizing and comparing NoSQL stores for RDF processing and compares their key characteristics when running standard RDF benchmarks on a popular cloud infrastructure using both single-machine and distributed deployments.
Scalable and Efficient Self-Join Processing technique in RDF data
- Computer Science
- ArXiv
- 2014
An alternative solution that aims to make such queries more flexible and efficient by reducing self-joins as much as possible. The solution is based on the idea of "Recursive Mapping of Twin Tables".
Compressed Vertical Partitioning for Efficient RDF Management
- Computer Science
- 2013
A novel RDF indexing technique that supports efficient SPARQL resolution in compressed space. It achieves by far the most compressed representations and also the best overall performance for RDF retrieval in the authors' experimental setup.
References
Scalable Semantic Web Data Management Using Vertical Partitioning
- Computer Science
- VLDB
- 2007
The results show that a vertically partitioned schema achieves similar performance to the property table technique while being much simpler to design. If a column-oriented DBMS is used instead of a row-oriented one, another order of magnitude performance improvement is observed, with query times dropping from minutes to several seconds.
Column-store support for RDF data management: not all swans are white
- Computer Science
- Proc. VLDB Endow.
- 2008
This paper reports on the results of an independent evaluation of the techniques presented in the VLDB 2007 paper "Scalable Semantic Web Data Management Using Vertical Partitioning", as well as a complementary analysis of state-of-the-art RDF storage solutions.
SP2Bench: A SPARQL Performance Benchmark
- Computer Science
- Semantic Web Information Management
- 2009
SP^2Bench is a publicly available, language-specific SPARQL performance benchmark that comprises both a data generator for creating arbitrarily large DBLP-like documents and a set of carefully designed benchmark queries.
SP^2Bench: A SPARQL Performance Benchmark
- Computer Science
- 2009 IEEE 25th International Conference on Data Engineering
- 2009
SP^2Bench is a publicly available, language-specific SPARQL performance benchmark that comprises both a data generator for creating arbitrarily large DBLP-like documents and a set of carefully designed benchmark queries.
Storing RDF as a graph
- Computer Science
- Proceedings of the IEEE/LEOS 3rd International Conference on Numerical Simulation of Semiconductor Optoelectronic Devices (IEEE Cat. No.03EX726)
- 2003
This work presents a new approach that stores RDF data as a graph in an object-oriented database, which avoids the costly rebuilding of the graph and efficiently queries the storage structure directly.
An Efficient SQL-based RDF Querying Scheme
- Computer Science
- VLDB
- 2005
An experimental study is presented that characterizes the overhead eliminated by avoiding procedural code at runtime, characterizes performance under various input conditions, and demonstrates scalability using 80 million RDF triples from UniProt protein and annotation data.
The Berlin SPARQL Benchmark
- Computer Science
- Int. J. Semantic Web Inf. Syst.
- 2009
The Berlin SPARQL Benchmark (BSBM) is introduced. It is built around an e-commerce use case in which a set of products is offered by different vendors and consumers have posted reviews about products, and it emulates the search and navigation pattern of a consumer looking for a product.
Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema
- Computer Science
- SEMWEB
- 2002
This work presents an overview of Sesame, an architecture for efficient storage and expressive querying of large quantities of metadata in RDF and RDF Schema, together with its implementation and the first experiences with that implementation.
Benchmarking Database Representations of RDF/S Stores
- Computer Science
- SEMWEB
- 2005
The main conclusion drawn from the experiments is that the evaluation of taxonomic queries is most efficient over RDF/S stores utilizing the Hybrid and MatView representations.
Jena Property Table Implementation
- Computer Science
- 2006
This paper describes a property table design and implementation for Jena, an RDF Semantic Web toolkit. One design goal is to make Jena property tables look like normal relational database tables.