Corpus ID: 6984037

Gang-GC: Locality-aware Parallel Data Placement Optimizations for Key-Value Storages

Duarte Patrício, José Simão, Luís Veiga
Many cloud applications rely on fast, non-relational storage to aid in the processing of large amounts of data. Managed runtimes are now widely used to support the execution of several storage solutions of the NoSQL movement, particularly when dealing with big data key-value store-driven applications. The benefits of these runtimes can, however, be limited by modern parallel throughput-oriented GC algorithms, where related objects have the potential to be dispersed in memory, either in the…


NumaGiC: a Garbage Collector for Big Data on Big NUMA Machines
NumaGiC is a GC with a mostly-distributed design that improves overall performance and increases the performance of the collector itself by up to 3.6x over NAPS and up to 5.4x over Parallel Scavenge.
A bloat-aware design for big data applications
Experimental results show that this new design paradigm is extremely effective in improving performance --- even for the moderate-size data sets processed, there are 2.5x+ performance gains, and the improvement grows substantially with the size of the data set.
Profile-guided proactive garbage collection for locality optimization
A new system for continuously improving program data locality at run time with low overhead. It proactively reorganizes the heap by leveraging the garbage collector, and uses profile information collected through a low-overhead mechanism to guide the reorganization at run time.
FACADE: A Compiler and Runtime for (Almost) Object-Bounded Big Data Applications
A novel compiler framework, called Facade, that can generate highly efficient data manipulation code by automatically transforming the data path of an existing Big Data application, leading to significantly reduced memory management cost and improved scalability.
Effective “static-graph” reorganization to improve locality in garbage-collected systems
Improved measures of static locality indicate that heap data can be cheaply and effectively compressed, and this may allow more effective paging and prefetching strategies; a level of “compressed in-RAM storage” is suggested, with price and performance between those of RAM and disk.
Taurus: A Holistic Language Runtime System for Coordinating Distributed Managed-Language Applications
Taurus is a JVM drop-in replacement that requires almost no configuration and can run unmodified off-the-shelf Java applications; it enforces user-defined coordination policies and provides a DSL for writing these policies.
A study of the scalability of stop-the-world garbage collectors on multicores
This study suggests that the default throughput-oriented garbage collector of OpenJDK 7, called Parallel Scavenge, has bottlenecks; these are identified, and it is shown how to eliminate them using well-established parallel programming techniques.
How Data Volume Affects Spark Based Data Analytics on a Scale-up Server
This analysis reveals that Spark-based data analytics are DRAM bound and do not benefit from using more than 12 cores for an executor, and matches memory behaviour with the garbage collector to improve application performance by 1.6x to 3x.
Cassandra: a decentralized structured storage system
Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of
Benchmarking cloud serving systems with YCSB
This work presents the "Yahoo! Cloud Serving Benchmark" (YCSB) framework, with the goal of facilitating performance comparisons of the new generation of cloud data serving systems, and defines a core set of benchmarks and reports results for four widely used systems.