• Publications
  • Influence
XMark: A Benchmark for XML Data Management
While standardization efforts for XML query languages have been progressing, researchers and users increasingly focus on the database technology that has to deliver on the new challenges that the …
  • 862
  • 112
The XML benchmark project
With standardization efforts of a query language for XML documents drawing to a close, researchers and users increasingly focus their attention on the database technology that has to deliver on the …
  • 310
  • 45
Efficient parallel set-similarity joins using MapReduce
In this paper we study how to efficiently perform set-similarity joins in parallel using the popular MapReduce framework. We propose a 3-stage approach for end-to-end set-similarity joins. We take as …
  • 465
  • 44
Shoring up persistent applications
SHORE (Scalable Heterogeneous Object REpository) is a persistent object system under development at the University of Wisconsin. SHORE represents a merger of object-oriented database and file system …
  • 432
  • 30
Hyracks: A flexible and extensible foundation for data-intensive computing
Hyracks is a new partitioned-parallel software platform designed to run data-intensive computations on large shared-nothing clusters of computers. Hyracks allows users to express a computation as a …
  • 257
  • 25
Concurrency control performance modeling: alternatives and implications
A number of recent studies have examined the performance of concurrency control algorithms for database management systems. The results reported to date, rather than being definitive, have tended to …
  • 361
  • 25
A Study of Index Structures for a Main Memory Database Management System
One approach to achieving high performance in a database management system is to store the database in main memory rather than on disk. One can then design new data structures and algorithms …
  • 269
  • 25
Efficiently publishing relational data as XML documents
XML is rapidly emerging as a standard for exchanging business data on the World Wide Web. For the foreseeable future, however, most business data will continue to be stored in relational …
  • 352
  • 22
Transactional client-server cache consistency: alternatives and performance
Client-server database systems based on a data shipping model can exploit client memory resources by caching copies of data items across transaction boundaries. Caching reduces the need to obtain …
  • 203
  • 22
The HiPAC project: combining active databases and timing constraints
The HiPAC (High Performance ACtive database system) project addresses two critical problems in time-constrained data management: the handling of timing constraints in databases, and the avoidance of …
  • 405
  • 21