Optimizing Space Amplification in RocksDB

@inproceedings{Dong2017OptimizingSA,
  title={Optimizing Space Amplification in RocksDB},
  author={Siying Dong and Mark Callaghan and Leonidas Galanis and Dhruba Borthakur and Tony Savor and Michael Strum},
  booktitle={CIDR},
  year={2017}
}
RocksDB is an embedded, high-performance, persistent key-value storage engine developed at Facebook. Much of our current focus in developing and configuring RocksDB is to give priority to resource efficiency instead of giving priority to the more standard performance metrics, such as response time latency and throughput, as long as the latter remain acceptable. In particular, we optimize space efficiency while ensuring read and write latencies meet service-level requirements for the intended…

Figures, Results, and Topics from this paper.

Key Quantitative Results

  • We show by way of empirical measurements that RocksDB requires roughly 50% less storage space than InnoDB, on average; it also has a higher transaction throughput, and yet it increases read latencies only marginally, remaining well within the margins of acceptability.
  • Depending on the composition of the data, weaker compression algorithms can reduce space requirements down to as low as 40%, and stronger algorithms down to as low as 25%, of their original sizes on production Facebook data.
  • We described how RocksDB was able to reduce storage space requirements by 50% over what InnoDB would need, while at the same time increasing transaction throughput and significantly decreasing write amplification, yet increasing average read latencies by a marginal amount.

Citations

Publications citing this paper.

MBWU: Benefit Quantification for Data Access Function Offloading

Jianshen Liu, Philip Kufeldt, Carlos Maltzahn
  • ArXiv
  • 2019
HIGHLY INFLUENCED

LSM-based storage techniques: a survey

  • The VLDB Journal
  • 2018
CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

MV-FTL: An FTL That Provides Page-Level Multi-Version Management

  • IEEE Transactions on Knowledge and Data Engineering
  • 2018
CITES METHODS & BACKGROUND
HIGHLY INFLUENCED

X-Engine: An Optimized Storage Engine for Large-scale E-commerce Transaction Processing

  • SIGMOD Conference
  • 2019
CITES METHODS
HIGHLY INFLUENCED

CAPI-Flash Accelerated Persistent Read Cache for Apache Cassandra

  • 2018 IEEE 11th International Conference on Cloud Computing (CLOUD)
  • 2018
CITES BACKGROUND
HIGHLY INFLUENCED
