A comparison of approaches to large-scale data analysis
TLDR
A benchmark consisting of a collection of tasks that are run on an open source version of MR as well as on two parallel DBMSs shows a dramatic performance difference between the two paradigms.
H-store: a high-performance, distributed main memory transaction processing system
TLDR
The demonstration presented here provides insight on the development of a distributed main memory OLTP database and allows for the further study of the challenges inherent in this operating environment.
OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases
TLDR
OLTP-Bench is an extensible "batteries included" DBMS benchmarking testbed that offers ease of use, tight control of transaction mixtures, request rates, and access distributions over time, and support for all major DBMSs and DBaaS platforms.
Automatic Database Management System Tuning Through Large-scale Machine Learning
TLDR
An automated approach to tuning DBMS configurations that leverages past experience and collects new information, recommending configurations that are as good as or better than those generated by existing tools or a human expert.
Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems
TLDR
A novel approach to automatically partitioning databases for enterprise-class OLTP systems that significantly extends the state of the art by minimizing the number of distributed transactions while concurrently mitigating the effects of temporal skew in both the data distribution and accesses.
MapReduce and parallel DBMSs: friends or foes?
MapReduce complements DBMSs since databases are not designed for extract-transform-load tasks, a MapReduce specialty.
Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores
TLDR
The authors conclude that, rather than pursuing incremental solutions, many-core chips may require a completely redesigned DBMS architecture that is built from the ground up and tightly coupled with the hardware.
TicToc: Time Traveling Optimistic Concurrency Control
TLDR
TicToc is a new optimistic concurrency control algorithm that avoids the scalability and concurrency bottlenecks of prior T/O schemes and achieves up to 92% better throughput while reducing the abort rate by 3.3x over these previous algorithms.
Let's Talk About Storage & Recovery Methods for Non-Volatile Memory Database Systems
TLDR
This work implements three engines in a modular DBMS testbed, each based on a different storage management architecture, and presents NVM-aware variants of these architectures that leverage the persistence and byte-addressability of NVM in their storage and recovery methods.
Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads
TLDR
A hybrid DBMS architecture that efficiently supports varied workloads on the same database, together with a technique that continuously evolves the database's physical storage layout by analyzing the queries' access patterns and choosing the optimal layout for different segments of data within the same table.