Benchmarking cloud serving systems with YCSB

@inproceedings{Cooper2010BenchmarkingCS,
  title={Benchmarking cloud serving systems with YCSB},
  author={Brian F. Cooper and Adam Silberstein and Erwin Tam and Raghu Ramakrishnan and Russell Sears},
  booktitle={SoCC '10},
  year={2010}
}
While the use of MapReduce systems (such as Hadoop) for large scale data analysis has been widely recognized and studied, we have recently seen an explosion in the number of systems developed for cloud data serving. These newer systems address "cloud OLTP" applications, though they typically do not support ACID transactions. Examples of systems proposed for cloud serving use include BigTable, PNUTS, Cassandra, HBase, Azure, CouchDB, SimpleDB, Voldemort, and many others. Further, they are being…
Summary: Benchmarking Cloud Serving Systems with YCSB
Cloud storage systems represent a new paradigm in distributed data storage. In contrast to traditional ACID-based storage, these systems focus on simple, flexible record types, extremely high…
Comparison of Database and Workload Types Performance in Cloud Environments
TLDR
This paper investigates how offerings such as MongoDB, Cassandra and HBase behave when deployed in virtual environments of the BONFIRE facility and how they are measured against widely used benchmarks such as YCSB.
A batch of PNUTS: experiences connecting cloud batch and serving systems
TLDR
This paper discusses the experience of running batch-oriented Hadoop on top of Yahoo's serving-oriented PNUTS system instead of the standard HDFS file system, and introduces a batch write path to improve latency-insensitive write throughput to PNUTS.
Performance Scaling of Cassandra on High-Thread Count Servers
TLDR
This paper describes the experiences studying the performance scaling characteristics of Cassandra, a popular open-source, column-oriented database, on a single high-thread count dual socket server, and shows how by taking into account specific knowledge of the underlying topology of the server architecture, it can achieve substantial improvements in performance scalability.
OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases
TLDR
OLTP-Bench is presented, an extensible "batteries included" DBMS benchmarking testbed that offers ease of use and extensibility, support for tight control of transaction mixtures, request rates, and access distributions over time, and the ability to support all major DBMSs and DBaaS platforms.
Performance Evaluation of Range Queries in Key Value Stores
TLDR
This paper compares three widely-used No-SQL systems: Cassandra, HBase and Voldemort in terms of their support for different types of query workloads, and shows that there are trade-offs in the performance of the selected system and scheme, and the types of the query workloads that can be processed efficiently.
Towards realistic benchmarking for cloud file systems: Early experiences
  • Zujie Ren, W. Shi, Jian Wan
  • Computer Science
  • 2014 IEEE International Symposium on Workload Characterization (IISWC)
  • 2014
TLDR
A two-week I/O workload trace from a 2,500-node production cluster is collected and several interesting implications are derived on guiding system researchers and engineers to build a realistic benchmark on their own systems.
Performance Evaluation of NoSQL Systems using YCSB in a Resource Austere Environment
TLDR
The Yahoo! Cloud Serving Benchmark (YCSB) framework is presented, with the goal of facilitating performance comparisons of the new generation of NoSQL databases in an environment where resources are limited.
ElasTraS: An elastic, scalable, and self-managing transactional database for the cloud
TLDR
ElasTraS leverages Albatross, a low overhead on-demand live database migration technique, for elastic load balancing by adding more servers during high load and consolidating to fewer servers during usage troughs, which minimizes the operating cost and ensures good performance even in the presence of unpredictable changes to the workload.
Performance Analysis of NoSQL Systems Under Different Workloads
NoSQL database systems are becoming a widely used data platform for big data applications. In order to make the best choice among different NoSQL solutions, it is necessary to analyze the weaknesses…

References

A comparison of approaches to large-scale data analysis
TLDR
A benchmark consisting of a collection of tasks that are run on an open source version of MR as well as on two parallel DBMSs shows a dramatic performance difference between the two paradigms.
Cassandra: structured storage system on a P2P network
Cassandra is a distributed storage system for managing structured data that is designed to scale to a very large size across many commodity servers, with no single point of failure. Reliability at…
Dynamo: Amazon's highly available key-value store
TLDR
Dynamo is presented, a highly available key-value storage system that some of Amazon's core services use to provide an "always-on" experience and makes extensive use of object versioning and application-assisted conflict resolution in a manner that provides a novel interface for developers to use.
Rose: compressed, log-structured replication
TLDR
A page compression format is introduced that takes advantage of LSM-tree's sequential, sorted data layout and increases replication throughput by reducing sequential I/O, and enables efficient tree lookups by supporting small page sizes and doubling as an index of the values it stores.
PNUTS: Yahoo!'s hosted data serving platform
TLDR
PNUTS provides data storage organized as hashed or ordered tables, low latency for large numbers of concurrent requests including updates and queries, and novel per-record consistency guarantees and utilizes automated load-balancing and failover to reduce operational complexity.
Bigtable: A Distributed Storage System for Structured Data
TLDR
The simple data model provided by Bigtable is described, which gives clients dynamic control over data layout and format, and the design and implementation of Bigtable are described.
Cutting Corners: Workbench Automation for Server Benchmarking
TLDR
Experimental results show how an automated workbench controller can plan and coordinate the benchmark runs to obtain a result with a target threshold of confidence and accuracy at lower cost than scripted approaches that are commonly practiced.
XMark: A Benchmark for XML Data Management
TLDR
This work provides a framework to assess the abilities of an XML database to cope with a broad range of different query types typically encountered in real-world scenarios and offers a set of queries where each query is intended to challenge a particular aspect of the query processor.
C-Store: A Column-oriented DBMS
TLDR
Preliminary performance data on a subset of TPC-H is presented and it is shown that the system the team is building, C-Store, is substantially faster than popular commercial products.
Quickly generating billion-record synthetic databases
TLDR
Several database generation techniques are presented, in terms of generating billion-record SQL databases using C programs running on a shared-nothing computer system consisting of a hundred processors, with a thousand discs.