Corpus ID: 18711566

Scheduling Refresh Queries for Keeping Results from a SPARQL Endpoint Up-to-Date (Extended Version)

@article{Knuth2016SchedulingRQ,
  title={Scheduling Refresh Queries for Keeping Results from a SPARQL Endpoint Up-to-Date (Extended Version)},
  author={Magnus Knuth and Olaf Hartig and Harald Sack},
  journal={ArXiv},
  year={2016},
  volume={abs/1608.08130}
}
Many datasets change over time. As a consequence, long-running applications that cache and repeatedly use query results obtained from a SPARQL endpoint may resubmit the queries regularly to ensure up-to-dateness of the results. While this approach may be feasible if the number of such regular refresh queries is manageable, with an increasing number of applications adopting this approach, the SPARQL endpoint may become overloaded with such refresh queries. A more scalable approach would be to… 

Scheduling Refresh Queries for Keeping Results from a SPARQL Endpoint Up-to-Date (Short Paper)
TLDR
This paper studies the problem of scheduling refresh queries for a large number of registered queries by assuming an overload-avoiding upper bound on the length of a regular time slot available for testing refresh queries.
Change-Aware Scheduling for Effectively Updating Linked Open Data Caches
TLDR
This paper proposes application-aware change prioritization (AACP), an approach to efficiently capture changes and update the cache; it consists of a change metric that quantifies the changes in LOD and a weight function that assigns importance to recent changes.
A Dynamic, Cost-Aware, Optimized Maintenance Policy for Interactive Exploration of Linked Data
TLDR
A Change Metric is proposed that quantifies the evolution of a Linked Dataset and determines when to update cached content, and it is shown that CAMP can reduce maintenance costs, improve maintenance quality, and increase cache hit rates compared to standard approaches.

References

Scheduling Refresh Queries for Keeping Results from a SPARQL Endpoint Up-to-Date (Short Paper)
TLDR
This paper studies the problem of scheduling refresh queries for a large number of registered queries by assuming an overload-avoiding upper bound on the length of a regular time slot available for testing refresh queries.
Improving the Performance of Semantic Web Applications with SPARQL Query Caching
TLDR
This work developed an approach for improving the performance of triple stores by caching query results and even complete application objects and selective invalidation of cache objects, following updates of the underlying knowledge bases.
Strategies for Efficiently Keeping Local Linked Open Data Caches Up-To-Date
TLDR
This paper investigates different strategies proposed in the literature and evaluates them on a large-scale LOD dataset that is obtained from the LOD cloud by weekly crawls over the course of three years and shows that the measures capturing change behavior of LOD sources over time are most suitable for conducting updates.
Enabling Fine-Grained HTTP Caching of SPARQL Query Results
TLDR
It is shown that simple augmentation of the database indexes found in common SPARQL implementations can directly lead to effective caching at the HTTP protocol level.
Queries Independent of Updates
TLDR
New insight into the independence problem is provided by reducing it to the equivalence problem for datalog programs (both for the case of insertion and deletion updates) and new cases in which independence is decidable are presented.
Scalable query result caching for web applications
TLDR
This paper introduces Ferdinand, the first proxy-based cooperative query result cache with fully distributed consistency management, and implements a fully functioning Ferdinand prototype and evaluates its performance compared to several alternative query-caching approaches, showing that high cache hit rate and consistency management are both critical for Ferdinand's performance gains over existing systems.
Interest-Based RDF Update Propagation
TLDR
This paper introduces an approach for interest-based RDF update propagation, which propagates only interesting parts of updates from the source to the target dataset, and enables remote applications to 'subscribe' to relevant datasets and consistently reflect the necessary changes locally without the need to frequently replace the entire dataset or a relevant subset.
sparqlPuSH: Proactive Notification of Data Updates in RDF Stores Using PubSubHubbub
TLDR
This paper presents a flexible approach that provides the active delivery of SPARQL query results through the PubSubHubbub (PuSH) protocol upon the arrival of new information in RDF stores in real-time to any RSS or Atom reader.
Towards Linked Data Update Notifications Reviewing and Generalizing the SparqlPuSH Approach
TLDR
This paper reviews the sparqlPuSH approach, presents the authors' own vision and ideas for extending and generalizing it, and describes a notification service for updates in RDF stores.
A Survey of HTTP Caching Implementations on the Open Semantic Web
TLDR
A survey of live RDF data sources shows that caching metadata is already prevalent enough to be used in some cases; the paper also points out future directions and gives recommendations for the enhanced use of caching in the Semantic Web.