XPath Node Selection over Grammar-Compressed Trees

Sebastian Maneth and Tom Sebastian
XML document markup is highly repetitive and therefore well compressible using grammar-based compression. Downward, navigational XPath can be executed over grammar-compressed trees in PTIME: the query is translated into an automaton which is executed in one pass over the grammar. This result is well known and has been mentioned before. Here we present precise bounds on the time complexity of this problem, in terms of big-O notation. For a given grammar and XPath query, we consider three…
Traversing Grammar-Compressed Trees with Constant Delay
A grammar-compressed ranked tree is represented with a linear space overhead so that a single traversal step, i.e., the move to the parent or the ith child, can be carried out in constant time. The…
Constant-Time Tree Traversal and Subtree Equality Check for Grammar-Compressed Trees
A linear-space data structure for grammar-compressed trees is presented which allows tree traversal operations and subtree equality checks to be carried out in constant time. A traversal step consists of…
… of the Workshop on Trends in Tree Automata and Tree Transducers
We propose an algorithm for computing the N best roots of a weighted hypergraph, in which the weight function is given over an idempotent and multiplicatively monotone semiring. We give a set of…


Path Queries on Compressed XML
This paper demonstrates that the tree structure can be effectively compressed and manipulated using techniques derived from symbolic model checking. It shows first that succinct representations of document tree structures based on sharing subtrees are highly effective, and second that compressed structures can be queried directly and efficiently.
Fast and Tiny Structural Self-Indexes for XML
A fully-fledged index over grammar-compressed trees, used before as a synopsis for structural XPath queries, is presented; it allows arbitrary tree algorithms to be executed with a slow-down that is comparable to the space improvement.
XPath whole query optimization
It is shown that tree automata can be used as a general framework for fine-grained XML query optimization, and that runs over relevant nodes can be approximated efficiently by means of on-the-fly removal of alternation and non-determinism of (alternating) tree automata.
Query evaluation on compressed trees
This paper proposes a new automata-theoretic formalism for querying trees and gives algorithms for evaluating queries defined by such automata, including XPath and monadic datalog queries.
The complexity of tree automata and XPath on grammar-compressed trees
The complexity of various membership problems for tree automata on compressed trees is analyzed and the evaluation problem for (structural) XPath queries on trees that are compressed via straight-line context-free tree grammars is investigated.
Fast in-memory XPath search using compressed indexes
The SXSI system performs on par with or better than the fastest known systems, MonetDB and Qizx, on pure tree queries; on queries that use text search, SXSI outperforms the existing systems by 1-3 orders of magnitude.
XML tree structure compression using RePair
A new linear-time algorithm for computing small SLCF tree grammars, called TreeRePair, is presented and it is shown that it greatly outperforms the best known previous algorithm, BPLEX.
Efficient memory representation of XML document trees
A technique is presented that represents the tree structure of an XML document efficiently by compressing it, while the functionality of basic tree operations, like traversal along edges, is preserved under this compressed representation.
Processing XML streams with deterministic automata and stream indexes
The DFA can be used effectively for evaluating a large number of XPath expressions on a stream of XML packets; a series of theoretical results and experimental evaluations show that the lazy DFA has a small number of states, for all practical purposes.
Query automata over finite trees
This work defines a query automaton (QA) as a deterministic two-way finite automaton over trees that has the ability to select nodes depending on the state and the label at those nodes, and establishes the complexity of the non-emptiness, containment, and equivalence of QAs to be complete for EXPTIME.