# Worst-Case Optimal Join Algorithms: Techniques, Results, and Open Problems

@article{Ngo2018WorstCaseOJ, title={Worst-Case Optimal Join Algorithms: Techniques, Results, and Open Problems}, author={Hung Quoc Ngo}, journal={Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems}, year={2018} }

Worst-case optimal join algorithms are the class of join algorithms whose runtimes match the worst-case output size of a given join query. While the first provably worst-case optimal join algorithm was discovered relatively recently, the techniques and results surrounding these algorithms grow out of decades of research from a wide range of areas, intimately connecting graph theory, algorithms, information theory, constraint satisfaction, database theory, and geometric inequalities. These ideas…
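To make the definition concrete, here is a minimal sketch for the triangle query Q(a, b, c) = R(a, b) ⋈ S(b, c) ⋈ T(a, c). It is an illustration written for this page, not code from the paper. It binds one variable at a time and intersects candidate sets by scanning the smaller side, which is the usual attribute-at-a-time strategy; under the standard analysis (and assuming constant-time hash lookups) this runs in time proportional to sqrt(|R|·|S|·|T|), the known worst-case output size bound for this query. The function names `index`, `intersect`, `triangles`, and `triangle_output_bound` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def index(pairs):
    """Build a hash index: first attribute -> set of second attributes."""
    idx = defaultdict(set)
    for x, y in pairs:
        idx[x].add(y)
    return idx


def intersect(s, t):
    """Intersect two collections by scanning the smaller one."""
    if len(s) > len(t):
        s, t = t, s
    return [v for v in s if v in t]


def triangles(R, S, T):
    """Enumerate (a, b, c) with (a, b) in R, (b, c) in S, (a, c) in T."""
    r_idx, s_idx, t_idx = index(R), index(S), index(T)
    out = []
    # Bind a: it must appear as a first attribute in both R and T.
    for a in intersect(r_idx.keys(), t_idx.keys()):
        # Bind b: it must follow a in R and appear as a first attribute in S.
        for b in intersect(r_idx[a], s_idx.keys()):
            # Bind c: it must follow b in S and follow a in T.
            for c in intersect(s_idx[b], t_idx[a]):
                out.append((a, b, c))
    return out


def triangle_output_bound(R, S, T):
    """Worst-case output size bound sqrt(|R| * |S| * |T|) for the triangle query."""
    return sqrt(len(R) * len(S) * len(T))


# Usage: edges (x, y) with x < y over the vertices 0..3.
E = {(x, y) for x in range(4) for y in range(4) if x < y}
result = triangles(E, E, E)
```

By contrast, a plan that first joins two of the relations pairwise can materialize an intermediate result of quadratic size on skewed inputs even when the final output is small, which is the gap these algorithms close.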

## 35 Citations

Index-Structures for Worst-Case Optimal Join Algorithms

- Computer Science
- 2020

This work develops two new variants of Leapfrog Triejoin, introduces and evaluates two index structures, and discusses the strengths and limitations of the join algorithms and their index structures.

Optimal Join Algorithms Meet Top-k

- Computer Science
- SIGMOD Conference
- 2020

It is argued that the two areas of optimal join algorithms and ranked enumeration can and should be studied from a unified point of view in order to achieve optimality in the common model of computation for a very general class of top-k-style join queries.

Domain Ordering and Box Cover Problems for Beyond Worst-Case Join Processing

- Computer Science, Mathematics
- 2019

This thesis defines several optimization problems over the space of domain orderings where the objective is to minimize the size of either the minimum box certificate or the minimum box cover for the given input query, and provides approximation algorithms for several of these problems.

A Worst-Case Optimal Join Algorithm for SPARQL

- Computer Science
- SEMWEB
- 2019

This paper proposes a novel procedure for evaluating SPARQL queries based on an existing worst-case optimal join algorithm called Leapfrog Triejoin, implements an adaptation of this algorithm, and shows that with the new join algorithm, Apache Jena often runs orders of magnitude faster than the base version and two other SPARQL engines: Virtuoso and Blazegraph.

Optimal Joins using Compact Data Structures

- Computer Science
- ICDT
- 2020

It is shown that optimal algorithms can be obtained directly from a representation that regards the relations as point sets in variable-dimensional grids, without the need for extra storage, and a compositional algorithm is developed to process full join queries under this representation.

Worst-Case Optimal Graph Joins in Almost No Space

- Computer Science, Mathematics
- SIGMOD Conference
- 2021

An indexing scheme is presented that supports worst-case optimal graph joins in almost no space beyond storing the graph itself and offers the best overall query-time performance while using only a small fraction of the space of several state-of-the-art approaches.

RapidMatch: A Holistic Approach to Subgraph Query Processing

- Computer Science
- Proc. VLDB Endow.
- 2020

This paper proves that the complexity of result enumeration in state-of-the-art exploration-based methods matches that of the worst-case optimal join and proposes RapidMatch, a holistic subgraph query processing framework integrating the two approaches.

Optimal Joins using Compressed Quadtrees

- Computer Science
- ACM Transactions on Database Systems
- 2022

It is shown that worst-case optimal algorithms can be obtained directly from a representation that regards the relations as point sets in variable-dimensional grids, without the need for any significant extra storage, and a compositional algorithm to process full join queries is developed.

Degree Sequence Bound For Join Cardinality Estimation

- Computer Science
- ArXiv
- 2022

This work proves a novel bound called the Degree Sequence Bound which takes into account the full degree sequences and the max tuple multiplicity on Berge-Acyclic queries, and describes how to practically compute this bound using a functional approximation of the true degree sequences.

Counting Triangles under Updates in Worst-Case Optimal Time

- Computer Science
- 2019

An approach is introduced that exhibits a space-time tradeoff such that the space-time product is quadratic in the size of the input database and the update time can be as low as the square root of this size.

## References

Skew strikes back: new developments in the theory of join algorithms

- Computer Science
- SGMD
- 2014

A survey of recent work on join algorithms with provable worst-case optimal runtime guarantees is presented, and a simpler, unified description of these algorithms is provided that is useful for theory-minded readers, algorithm designers, and systems implementors.

Towards a Worst-Case I/O-Optimal Algorithm for Acyclic Joins

- Computer Science
- PODS
- 2016

This paper is able to prove that the "triangle query" algorithm is I/O-optimal for certain classes of acyclic joins without deriving its bound explicitly.

Join Processing for Graph Patterns: An Old Dog with New Tricks

- Computer Science
- GRADES@SIGMOD/PODS
- 2015

It is found that classical relational databases like Postgres and MonetDB, and newer graph databases/stores like Virtuoso and Neo4j, may be orders of magnitude slower than a fully featured RDBMS, LogicBlox, that uses these new approaches.

Worst-Case Optimal Algorithms for Parallel Query Processing

- Computer Science
- ICDT
- 2016

This paper studies the communication complexity for the problem of computing a conjunctive query on a large database in a parallel setting with $p$ servers, and shows a surprising connection to the external memory model, which allows us to translate parallel algorithms to external memory algorithms.

Triejoin: A Simple, Worst-Case Optimal Join Algorithm

- Computer Science
- ICDT
- 2014

It is established that leapfrog triejoin is also worst-case optimal, up to a log factor, in the sense of NPRR.

Size Bounds and Query Plans for Relational Joins

- Computer Science
- 2008 49th Annual IEEE Symposium on Foundations of Computer Science
- 2008

This work studies relational joins from a theoretical perspective and shows that there exist queries for which the join-project plan suggested by the fractional edge cover approach may be substantially better than any join plan that does not use intermediate projections.

A Worst-Case Optimal Multi-Round Algorithm for Parallel Computation of Conjunctive Queries

- Computer Science
- PODS
- 2017

A multi-round algorithm is described that computes any query with load $m/p^{1/\rho^*}$ per server, in the case when all input relations are binary, which is proved to be the optimal load for all queries over binary input relations.

Distributed Evaluation of Subgraph Queries Using Worst-case Optimal and Low-Memory Dataflows

- Computer Science
- Proc. VLDB Endow.
- 2018

This work presents the first approach that performs worst-case optimal computation and communication, maintains a total memory footprint linear in the number of input edges, and scales down per-worker computation, communication, and memory requirements linearly as the number of workers increases, even on adversarially skewed inputs.

Joins via Geometric Resolutions: Worst-case and Beyond

- Computer Science, Mathematics
- PODS
- 2015

An algorithm is designed that achieves the fractional hypertree-width bound, which generalizes classical and recent worst-case algorithmic results on computing joins; the same framework and algorithm are used to show a series of what are colloquially known as beyond worst-case results.

Beyond worst-case analysis for joins with minesweeper

- Computer Science, Mathematics
- PODS
- 2014

A new algorithm, Minesweeper, is described that is able to satisfy stronger runtime guarantees than previous join algorithms (colloquially "beyond worst-case" guarantees) for data in indexed search trees, and a dichotomy theorem is developed for the certificate-based notion of complexity.