Dwarf is a highly compressed structure for computing, storing, and querying data cubes. Dwarf identifies prefix and suffix structural redundancies and factors them out by coalescing their store. Prefix redundancy is high on dense areas of cubes but suffix redundancy is significantly higher for sparse areas. Putting the two together fuses the exponential …
Earlier work has demonstrated the effectiveness of in-network data aggregation in order to minimize the amount of messages exchanged during continuous queries in large sensor networks. The key idea is to build an aggregation tree, in which parent nodes aggregate the values received from their children. Nevertheless, for large sensor networks with severe …
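The aggregation-tree idea described above can be illustrated with a minimal sketch. It assumes a SUM aggregate and a hypothetical five-node tree; the names `aggregate`, `tree`, and `readings` are illustrative and do not come from the paper.

```python
def aggregate(tree, readings, node):
    """Return (partial_sum, messages) for the subtree rooted at `node`.

    Each child sends exactly one message (its partial sum) to its parent,
    so the parent forwards a single combined value instead of every reading.
    """
    total = readings[node]
    messages = 0
    for child in tree.get(node, []):
        child_sum, child_messages = aggregate(tree, readings, child)
        total += child_sum
        messages += child_messages + 1  # +1 for the child's message to `node`
    return total, messages


# Hypothetical topology: root has children a and b; a has children c and d.
tree = {"root": ["a", "b"], "a": ["c", "d"]}
readings = {"root": 1, "a": 2, "b": 3, "c": 4, "d": 5}

total, messages = aggregate(tree, readings, "root")
print(total, messages)
```

With this topology, every non-root node transmits once, so the message count equals the number of tree edges.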
Rank-aware query processing has become essential for many applications that return to the user only the top-k objects based on the individual user's preferences. Top-k queries have been mainly studied from the perspective of the user, focusing primarily on efficient query processing. In this work, for the first time, we study top-k queries from the …
Skyline query processing has received considerable attention in the recent past. Mainly, the skyline query is used to find a set of non-dominated data points in a multidimensional dataset. While most previous work has assumed a centralized setting, in this paper we address the efficient computation of subspace skyline queries in large-scale peer-to-peer …
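As a rough illustration of what "non-dominated" means, the sketch below computes a skyline over a chosen subset of dimensions in a centralized setting. It assumes smaller values are preferred in every dimension and uses a naive quadratic scan; the function names and sample data are hypothetical, and this is not the distributed method the paper proposes.

```python
def dominates(a, b, dims):
    """True if `a` is no worse than `b` on every dimension in `dims`
    and strictly better on at least one (assumption: smaller is better)."""
    return (all(a[d] <= b[d] for d in dims)
            and any(a[d] < b[d] for d in dims))


def subspace_skyline(points, dims):
    """Return the points not dominated by any other point on `dims`."""
    return [p for p in points
            if not any(dominates(q, p, dims) for q in points)]


# Hypothetical data: (price, distance) pairs.
hotels = [(50, 8), (60, 2), (80, 1), (70, 5), (90, 9)]

full = subspace_skyline(hotels, dims=(0, 1))   # both dimensions
price_only = subspace_skyline(hotels, dims=(0,))  # one-dimensional subspace
print(full)
print(price_only)
```

Restricting `dims` changes which points survive, which is why subspace skylines cannot simply be read off the full-space result.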
Association Rule Mining algorithms operate on a data matrix (e.g., customers × products) to derive association rules [2, 23]. We propose a new paradigm, namely, Ratio Rules, which are quantifiable in that we can measure the "goodness" of a set of discovered rules. We propose to use the "guessing error" as a measure of the "goodness", that is, the …
We present techniques for computing small space representations of massive data streams. These are inspired by traditional wavelet-based approximations that consist of specific linear projections of the underlying data. We present general "sketch"-based methods for capturing various linear projections of the data and use them to provide pointwise and …
We are inevitably moving into a realm where small and inexpensive wireless devices will be seamlessly embedded in the physical world and form a wireless sensor network in order to perform complex monitoring and computational tasks. Such networks pose new challenges in data processing and dissemination because of the limited resources (processing, …
Materialized aggregate views represent a set of redundant entities in a data warehouse that are frequently used to accelerate On-Line Analytical Processing (OLAP). Due to the complex structure of the data warehouse and the different profiles of the users who submit queries, there is a need for tools that will automate and ease the view selection and …
Association Rule Mining algorithms operate on a data matrix (e.g., customers × products) to derive rules [2, 22]. We propose a single-pass algorithm for mining linear rules in such a matrix based on Principal Component Analysis. PCA detects correlated columns of the matrix, which correspond to, e.g., products that sell together. The first contribution of this …
Pre-computation and materialization of views with aggregate functions is a common technique in Data Warehouses. Due to the complex structure of the warehouse and the different profiles of the users who submit queries, there is a need for tools that will automate the selection and management of the materialized data. In this paper we present DynaMat, a system …