Mixing Materialization and Query Rewriting for Existential Rules

Abstract

Ontology-Based Data Access (OBDA) is a recent paradigm that aims to enhance data access by taking ontological knowledge into account. When existential rules are used as the ontological language, query answering is undecidable, so numerous decidable classes of ontologies have been defined, ranging from classes with very good computational complexity (AC0 in data complexity) to classes with much greater expressivity. However, practically implementable algorithms have been proposed only for very restricted classes (typically those coinciding with lightweight description logics). The aim of this paper is to show how to deal with more expressive ontologies by proposing an algorithm that performs both materialization and rewriting and is applicable to a significant generalization of lightweight description logics. To this end, we first modify an existing algorithm previously proposed for a very generic class of rules, namely greedy bounded treewidth sets of rules. We then exhibit a special case, called pattern oblivious rule sets, which significantly generalizes the ELH description logic underlying the OWL 2 EL ontology standard while keeping the beneficial worst-case computational complexity. Finally, we define a subclass of pattern oblivious rules that is recognizable in polynomial time.

1 Ontology-Based Data Access

In the last few years, a novel paradigm for data querying has become increasingly popular in the knowledge representation and reasoning community as well as in the database community. This paradigm is called Ontology-Based Data Access (OBDA). The key idea is to use an ontology to enrich data with domain knowledge, enabling semantic querying. Current research mainly focuses on conjunctive queries, which are the basic queries in the database community.
The decision problem under consideration is formalized as follows: given some data F (represented as a set of ground atoms and possibly stored in a relational database), an ontology O and a query q, does F ∪ O |= q hold? Depending on the ontology, conjunctive query answering ranges from undecidable down to AC0 in data complexity (the same as conjunctive query answering without any ontology). An intense research effort aimed at defining classes of ontologies for which the conjunctive query answering problem is decidable (or even tractable) has thus taken place, resulting in a comprehensive and diversified zoo of decidable classes.

In this research effort, two different ontology representation paradigms have been intensely studied: Description Logics [4] and existential rules [5], also known as Datalog+/- [7] or tuple-generating dependencies (TGDs) in databases [1]. In Description Logics (DLs), current research focuses on so-called lightweight DLs, most notably from the EL [3] and the DL-Lite [8] families. They provide the logical bases of the tractable profiles OWL 2 EL and OWL 2 QL, respectively, of the OWL ontology language [17]. For existential rules, the classes considered are usually more expressive, but do not have computational properties as good as those of lightweight description logics.

Footnote 1: TU Dresden, Germany, email: firstname.lastname@tu-dresden.de

A first approach to designing efficient algorithms for OBDA is pure query rewriting. The principle is to use the ontology to reformulate the query into one that can be directly evaluated against the original database, which allows one (in theory) to exploit the good performance of database management systems. This approach is applicable in particular to first-order rewritable ontologies [2, 18, 10, 9, 24, 12, 20] (possibly using Datalog rewritings [11]), but also to EL [19].
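
As a minimal illustration of the query rewriting principle described above, the following Python sketch handles a deliberately tiny fragment: rules of the form A(x) -> B(x) and atomic queries. It is not an algorithm from the cited works; the predicate names, the rule encoding and the functions `rewrite` and `answer` are hypothetical and chosen only for this example.

```python
from collections import deque

# Hypothetical toy ontology O: each pair (sub, sup) stands for the rule sub(x) -> sup(x).
RULES = [("Professor", "Teacher"), ("Teacher", "Employee")]

# Hypothetical data F: a set of ground atoms encoded as (predicate, constant).
FACTS = {("Professor", "ann"), ("Teacher", "bob"), ("Student", "carl")}


def rewrite(query_pred, rules):
    """Return every predicate p such that p(x) entails query_pred(x) under the rules."""
    rewriting = {query_pred}
    frontier = deque([query_pred])
    while frontier:
        target = frontier.popleft()
        for sub, sup in rules:
            # A rule sub(x) -> target(x) adds sub(x) as one more disjunct of the union.
            if sup == target and sub not in rewriting:
                rewriting.add(sub)
                frontier.append(sub)
    return rewriting


def answer(query_pred, rules, facts):
    """Evaluate the union of rewritten atomic queries on the unmodified facts."""
    preds = rewrite(query_pred, rules)
    return sorted(const for pred, const in facts if pred in preds)


print(answer("Employee", RULES, FACTS))
```

The data is never modified: the ontology is compiled into the query, which becomes a union of three atomic queries here. With richer rules and larger conjunctive queries such unions can grow very large.
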
A known weakness of these approaches is the difficulty of efficiently evaluating the obtained rewritings, in particular when facing huge unions of conjunctive queries. Another line of research overcomes this drawback by materializing (part of) the entailed facts. The most naive approach would be to materialize all the entailed facts, but this is not always possible, since there could be infinitely many. Nonetheless, it is in some cases possible to modify the data and to rewrite the query in such a way that, when evaluated against the modified data, it yields sound and complete answers. Such an approach, called a combined approach, has been applied to DL-Lite and to ELH [16, 13, 14, 15]. However, current combined approach algorithms are tailored towards lightweight description logics only. The aim of the current paper is to overcome this shortcoming by providing such a mixed approach (modifying both the data and the query) that is able to deal with ontologies whose expressivity significantly exceeds that of lightweight description logics.

The contribution of the present paper is threefold:

• First, we consider the very expressive class of greedy bounded treewidth sets [6]. We argue that the known worst-case optimal algorithm [22] is not efficiently implementable because of an ad-hoc querying operation. We thus propose to replace this operation by the evaluation of a Datalog program whose size is polynomial in a parameter of the original algorithm, namely the number of so-called patterns. While this parameter is high in the worst case, one can expect it to be small in practical cases. Given an efficient Datalog solver, this would enable our algorithm to work efficiently even on large databases.
• Second, we define an algorithmically simple class of rules by “reverse engineering”: we look for expressive classes of rules that ensure that the number of relevant patterns is polynomial. We identify such a class, which we call pattern oblivious rule sets, and which has nice computational properties: query answering is PTIME-complete in data complexity and NP-complete in combined complexity under mild restrictions.
• Last, we study the computational complexity of recognizing pattern oblivious rules. We show that it is hard for the second level of the polynomial hierarchy, and thus propose another class of rules, namely forward-only rules, that is a particular case of pattern oblivious rules. We show that under mild assumptions, forward-only rules are recognizable in polynomial time.
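
The remark above that naive materialization may require infinitely many facts can be made concrete with a small Python sketch. It applies a single hypothetical existential rule, Person(x) -> exists y. hasParent(x, y), Person(y), in bounded rounds of forward chaining. The rule, the fact encoding, the naming of fresh nulls and the functions `chase` and `has_grandparent` are assumptions made for this example; this is neither the combined approach nor the algorithm proposed in the paper.

```python
from itertools import count


def chase(facts, max_rounds):
    """Apply Person(x) -> exists y. hasParent(x, y), Person(y) for at most max_rounds rounds."""
    facts = set(facts)
    fresh = count()  # generator of indices for fresh nulls
    for _ in range(max_rounds):
        persons = {f[1] for f in facts if f[0] == "Person"}
        # x needs no new witness when some hasParent(x, y) with Person(y) already exists.
        satisfied = {f[1] for f in facts if f[0] == "hasParent" and f[2] in persons}
        new_facts = set()
        for x in sorted(persons - satisfied):
            null = f"_n{next(fresh)}"  # fresh null standing for the unknown parent
            new_facts |= {("hasParent", x, null), ("Person", null)}
        if not new_facts:
            break  # fixpoint reached: the materialization is finite
        facts |= new_facts
    return facts


def has_grandparent(facts, x):
    """Check the query: exists y, z such that hasParent(x, y) and hasParent(y, z)."""
    parents = {(f[1], f[2]) for f in facts if f[0] == "hasParent"}
    return any(a == x and any(c == b for c, _ in parents) for a, b in parents)


start = {("Person", "ann")}
for rounds in (1, 2, 10):
    result = chase(start, rounds)
    print(rounds, len(result), has_grandparent(result, "ann"))
```

Each round introduces a new null that itself triggers the rule, so no fixpoint is ever reached and the fact set grows with the round bound. Two rounds suffice for the grandparent query, but a query asking for a longer chain of ancestors would need more rounds, so no single bound works for every query.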

DOI: 10.3233/978-1-61499-419-0-897

Cite this paper

@inproceedings{Thomazo2014MixingMA, title={Mixing Materialization and Query Rewriting for Existential Rules}, author={Micha{\"{e}}l Thomazo and Sebastian Rudolph}, booktitle={ECAI}, year={2014} }