Mondrian Multidimensional K-Anonymity
TLDR
A new multidimensional model is proposed that provides an additional degree of flexibility not seen in previous (single-dimensional) approaches, leading to higher-quality anonymizations as measured both by general-purpose metrics and by more specific notions of query answerability.
Incognito: efficient full-domain K-anonymity
TLDR
A set of algorithms for producing minimal full-domain generalizations is introduced, and it is shown that these algorithms perform up to an order of magnitude faster than previous algorithms on two real-life databases.
Relational Databases for Querying XML Documents: Limitations and Opportunities
TLDR
It turns out that the relational approach can handle most (but not all) of the semantics of semi-structured queries over XML data, but is likely to be effective only in some cases.
A comparison of approaches to large-scale data analysis
TLDR
A benchmark consisting of a collection of tasks, run on an open source version of MapReduce (MR) as well as on two parallel DBMSs, shows a dramatic performance difference between the two paradigms.
On supporting containment queries in relational database management systems
TLDR
The results suggest that, contrary to most expectations, a native implementation in an RDBMS can, with some modifications, support this class of query much more efficiently.
NiagaraCQ: a scalable continuous query system for Internet databases
TLDR
The design of NiagaraCQ is presented, along with some experimental results on the system's performance and scalability. Other techniques employed include incremental evaluation of continuous queries, use of both pull and push models for detecting heterogeneous data source changes, and memory caching.
Partition based spatial-merge join
TLDR
PBSM (Partition Based Spatial-Merge), a new algorithm for performing the spatial join operation, is described; it is especially effective when neither input to the join has an index on the joining attribute.
Parallel database systems: the future of high performance database systems
TLDR
Over the last decade, Teradata, Tandem, and a host of startup companies have successfully developed and marketed highly parallel machines, refuting a 1983 paper that predicted the demise of database machines.
DBMSs on a Modern Processor: Where Does Time Go?
TLDR
This paper examines four commercial DBMSs running on an Intel Xeon and NT 4.0 and introduces a framework for analyzing query execution time, and finds that database developers should not expect the overall execution time to decrease significantly without addressing stalls related to subtle implementation issues.
Weaving Relations for Cache Performance
TLDR
This paper proposes a new data organization model called PAX (Partition Attributes Across) that significantly improves cache performance by grouping together all values of each attribute within each page, and demonstrates that in-page data placement is the key to high cache performance.