David DeHaan

The W3C XQuery language recommendation, based on a hierarchical and ordered document model, supports a wide variety of constructs and use cases. There is a diversity of approaches and strategies for evaluating XQuery expressions, in many cases only dealing with limited subsets of the language. In this paper we describe an implementation approach that(More)
Most contemporary database systems perform cost-based join enumeration using some variant of System-R's bottom-up dynamic programming method. The notable exceptions are systems based on the top-down transformational search of Volcano/Cascades. As recent work has demonstrated, bottom-up dynamic programming can attain optimality with respect to the shape of(More)
Internet traffic patterns are believed to obey the power law, implying that most of the bandwidth is consumed by a small set of heavy users. Hence, queries that return a list of frequently occurring items are important in the analysis of real-time Internet packet streams. While several results exist for computing frequent item queries using limited memory(More)
The basic idea of our algorithm is to divide the sliding window into n equally-sized, contiguous partitions (called basic windows in [3]), with each partition storing only the identities of those item types which occur with a relative frequency of at least 1/m within this particular basic window. Rather than advancing the window whenever a new tuple(More)
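The per-partition step described above can be illustrated with a minimal sketch. It covers only what the excerpt states: splitting a window into n equal, contiguous basic windows and keeping, for each one, the item types whose relative frequency within that basic window is at least 1/m. The function name `basic_window_summaries` and its parameters are hypothetical, and how the summaries are combined or how the window advances is not shown because the excerpt is cut off before that point.

```python
from collections import Counter


def basic_window_summaries(window, n, m):
    """Return one set per basic window holding the items whose relative
    frequency inside that basic window is at least 1/m.

    Illustrative only: the name and signature are assumptions, not the
    authors' code.
    """
    if n <= 0 or m <= 0:
        raise ValueError("n and m must be positive")
    if len(window) == 0 or len(window) % n != 0:
        raise ValueError("window must be non-empty and divisible into n equal parts")

    size = len(window) // n  # number of tuples per basic window
    summaries = []
    for start in range(0, len(window), size):
        counts = Counter(window[start:start + size])
        # count / size >= 1 / m, rewritten as count * m >= size
        # to avoid floating-point comparison.
        summaries.append({item for item, c in counts.items() if c * m >= size})
    return summaries


if __name__ == "__main__":
    stream = ["a", "a", "b", "c", "a", "b", "b", "b"]
    print(basic_window_summaries(stream, n=2, m=2))
```

Each summary stores only item identities, not counts, which matches the excerpt's statement that a partition stores "only the identities" of the qualifying item types.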
Histograms that guarantee a maximum multiplicative error (q-error) for estimates may significantly improve the plan quality of query optimizers. However, the construction time for histograms with maximum q-error was too high for practical use cases. In this paper we extend this concept with a threshold, i.e., an estimate or true cardinality θ, below(More)
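For readers unfamiliar with the term, the multiplicative error mentioned above is commonly defined as the larger of estimate/true and true/estimate. The sketch below assumes that standard definition and strictly positive cardinalities; the helper names are hypothetical, and the threshold θ is not modeled because the excerpt is cut off before explaining how it is applied.

```python
def q_error(estimate, true_value):
    """Multiplicative error of one estimate: max(est/true, true/est).

    Assumes both values are strictly positive.
    """
    if estimate <= 0 or true_value <= 0:
        raise ValueError("q-error is defined here only for positive values")
    return max(estimate / true_value, true_value / estimate)


def max_q_error(pairs):
    """Largest q-error over an iterable of (estimate, true_value) pairs."""
    return max(q_error(est, true) for est, true in pairs)


if __name__ == "__main__":
    # Underestimating and overestimating by the same factor give the same q-error.
    print(q_error(50, 100), q_error(200, 100))
    print(max_q_error([(100, 100), (50, 100), (400, 100)]))
```

The measure is symmetric, so an estimate that is half the true cardinality and one that is double it are penalized equally.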
Query plan caching eliminates the need for repeated query optimization, hence, it has strong practical implications for relational database management systems (RDBMSs). Unfortunately, existing approaches consider only the query plan generated at the expected values of parameters that characterize the query, data and the current state of the system, while(More)
Appropriately selected materialized views (also called indexed views) can speed up query execution by orders of magnitude. Most database systems limit support for materialized views to select-project-join expressions, possibly with a group-by, over base tables because this class of views can be efficiently maintained incrementally and thus kept up to(More)
This paper presents an application of a DL reasoner to the optimization of an object-relational query language. Queries containing aggregate functions are difficult to optimize because care must be taken to guarantee that the output value of the aggregate function is not affected. We present a mapping from an object-relational aggregate query to a DL(More)
We consider the problem of deciding query equivalence for a conjunctive language in which queries output complex objects composed from a mixture of nested, unordered collection types. Using an encoding of nested objects as flat relations, we translate the problem to deciding the equivalence between encodings output by relational conjunctive queries. This(More)
Queries that return a list of frequently occurring items are popular in the analysis of data streams such as real-time Internet traffic logs. While several results exist for computing frequent item queries using limited memory in the infinite stream model, none have been extended to the limited-memory sliding window model, which considers only the last N items(More)