Karl Schnaitter

Learn More
In the rank join problem, we are given a set of relations and a scoring function, and the goal is to return the join results with the top <i>K</i> scores. It is often the case in practice that the inputs may be accessed in ranked order and the scoring function is monotonic. These conditions allow for efficient algorithms that solve the rank join problem(More)
The physical schema of a database plays a critical role in performance. Self-tuning is a cost-effective and elegant solution to optimize the physical configuration for the characteristics of the query load. Existing techniques operate in an off-line fashion, by choosing a fixed configuration that is tailored to a subset of the query load. The generated(More)
In the rank join problem, we are given a set of relations and a scoring function, and the goal is to return the join results with the top <i>k</i> scores. It is often the case in practice that the inputs may be accessed in ranked order and the scoring function is monotonic. These conditions allow for efficient algorithms that solve the rank join problem(More)
This paper introduces COLT (continuous on-line tuning), a novel framework that continuously monitors the workload of a database system and enriches the existing physical design with a set of effective indices. The key idea behind COLT is to gather performance statistics at different levels of detail and to carefully allocate profiling resources to the most(More)
We describe an algorithm that evaluates queries over probabilistic databases using Mobius' inversion formula in incidence algebras. The queries we consider are unions of conjunctive queries (equivalently: existential, positive First Order sentences), and the probabilistic databases are tuple-independent structures. Our algorithm runs in PTIME on a subset of(More)
One of the key tasks of a database administrator is to optimize the set of materialized indices with respect to the current workload. To aid administrators in this challenging task, commercial DBMSs provide advisors that recommend a set of indices based on a sample workload. It is left for the administrator to decide which of the recommended indices to(More)
A relational ranking query uses a scoring function to limit the results of a conventional query to a small number of the most relevant answers. The increasing popularity of this query paradigm has led to the introduction of specialized rank join operators that integrate the selection of top tuples with join processing. These operators access just “enough”(More)
Graph analytics is an important big data discovery technique. Applications include identifying influential employees for retention, detecting fraud in a complex interaction network, and determining product affinities by exploiting community buying patterns. Specialized platforms have emerged to satisfy the unique processing requirements of large-scale graph(More)
Online approaches to physical design tuning have received considerable attention in the recent literature, with a focus on the problem of online index selection. However, it is difficult to draw conclusions on the relative merits of the proposed techniques, as they have been evaluated in isolation using different methodologies. In this paper, we make two(More)