Michael J. Carey

While standardization efforts for XML query languages have been progressing, researchers and users increasingly focus on the database technology that has to deliver on the new challenges that the abundance of XML documents poses to data management: validation, performance evaluation and optimization of XML query processors are the upcoming issues. Following(More)
With standardization efforts of a query language for XML documents drawing to a close, researchers and users increasingly focus their attention on the database technology that has to deliver on the new challenges that the sheer amount of XML documents produced by applications pose to data management: validation, performance evaluation and optimization of XML(More)
The OO7 Benchmark represents a comprehensive test of OODBMS performance. In this paper we describe the benchmark and present performance results from its implementation in three OODBMS systems. It is our hope that the OO7 Benchmark will provide useful insight for end-users evaluating the performance of OODBMS systems; we also hope that the research(More)
SHORE (Scalable Heterogeneous Object REpository) is a persistent object system under development at the University of Wisconsin. SHORE represents a merger of object-oriented database and file system technologies. In this paper we give the goals and motivation for SHORE, and describe how SHORE provides features of both technologies. We also describe some(More)
In this paper we study how to efficiently perform set-similarity joins in parallel using the popular MapReduce framework. We propose a 3-stage approach for end-to-end set-similarity joins. We take as input a set of records and output a set of joined records based on a set-similarity condition. We efficiently partition the data across nodes in order to(More)
Hyracks is a new partitioned-parallel software platform designed to run data-intensive computations on large shared-nothing clusters of computers. Hyracks allows users to express a computation as a DAG of data operators and connectors. Operators operate on partitions of input data and produce partitions of output data, while connectors repartition(More)
The HiPAC (High Performance ACtive database system) project addresses two critical problems in time-constrained data management: the handling of timing constraints in databases, and the avoidance of wasteful polling through the use of situation-action rules that are an integral part of the database and are monitored by the DBMS's condition monitor. A rich(More)
One approach to achieving high performance in a database management system is to store the database in main memory rather than on disk. One can then design new data structures and algorithms oriented towards making efficient use of CPU cycles and memory space rather than minimizing disk accesses and using disk space efficiently. In this paper we present(More)
A number of recent studies have examined the performance of concurrency control algorithms for database management systems. The results reported to date, rather than being definitive, have tended to be contradictory. In this paper, rather than presenting “yet another algorithm performance study,” we critically investigate the assumptions made in(More)
For reasons of simplicity and communication efficiency, a number of existing object-oriented database management systems are based on page server architectures; data pages are their minimum unit of transfer and client caching. Despite their efficiency, page servers are often criticized as being too restrictive when it comes to concurrency, as existing(More)