The rich dependency structure found in the columns of real-world relational databases can be exploited to great advantage, but can also cause query optimizers, which usually assume that columns are statistically independent, to underestimate the selectivities of conjunctive predicates by orders of magnitude. We introduce CORDS, an efficient and scalable…
Identification of (composite) key attributes is of fundamental importance for many different data management tasks such as data modeling, data integration, anomaly detection, query formulation, query optimization, and indexing. However, information about keys is often missing or incomplete in many real-world database scenarios. Surprisingly, the fundamental…
Damia is a lightweight enterprise data integration service where line-of-business users can create and catalog high-value data feeds for consumption by situational applications. Damia is inspired by the Web 2.0 mashup phenomenon. It consists of (1) a browser-based user interface that allows for the specification of data mashups as data flow graphs using a…
We present the BHUNT scheme for automatically discovering algebraic constraints between pairs of columns in relational data. The constraints may be "fuzzy" in that they hold for most, but not all, of the records, and the columns may be in the same table or different tables. Such constraints are of interest in the context of both data mining and query…
UK local governments have invested heavily in ICT in recent years to improve public service delivery. Most local governments now operate contact centres and websites to exchange information and transactions with citizens. But the aspirations of central government go much further - to service "transformation" - and the expectation that citizens and…
In this paper, we propose a new benchmark for scientific data management systems called SS-DB. This benchmark, loosely modeled on an astronomy workload, is intended to simulate applications that manipulate array-oriented data through relatively sophisticated user-defined functions. SS-DB is representative of the processing performed in a number of…