Locality-Aware GC Optimisations for Big Data Workloads

Duarte Patrício, Rodrigo Bruno, José Simão, Paulo Ferreira, Luís Veiga
OTM Conferences
Many Big Data analytics and IoT scenarios rely on fast, non-relational storage (NoSQL) to help process massive amounts of data. In addition, managed runtimes (e.g. JVM) are now widely used to support the execution of these NoSQL storage solutions, particularly when dealing with Big Data key-value store-driven applications. The benefits of such runtimes can, however, be limited by automatic memory management, i.e., Garbage Collection (GC), which does not consider object locality, resulting…
A Performance Comparison of Modern Garbage Collectors for Big Data Environments
This project aims to understand how different garbage collectors scale in terms of throughput, latency, and memory usage in memory-hungry environments, so that, for a given platform with particular performance needs, the most suitable garbage collection algorithm can be identified.
You Can’t Hide You Can’t Run
A new profiling tool called PerfUtil is developed to study, characterize, and better understand why benchmarks have sub-optimal performance on NUMA machines; its effectiveness stems from its ability to track numerous events throughout the system at the managed runtime system level.
You can’t hide you can’t run: a performance assessment of managed applications on a NUMA machine
PerfUtil is a new profiling tool that assists in demystifying NUMA peculiarities and accurately characterizing managed application profiles; its effectiveness stems from its ability to track numerous events throughout the system at the managed runtime system level.


A bloat-aware design for big data applications
Experimental results show that this new design paradigm is extremely effective in improving performance: even for the moderate-size data sets processed, there are 2.5x+ performance gains, and the improvement grows substantially with the size of the data set.
NG2C: pretenuring garbage collection with dynamic generations for HotSpot big data applications
NG2C, a new GC algorithm that combines pretenuring with user-defined dynamic generations, is proposed. It decreases the worst observable GC pause time and avoids object promotion and heap fragmentation, which are both responsible for most of the duration of HotSpot GC pause times.
NumaGiC: a Garbage Collector for Big Data on Big NUMA Machines
NumaGiC is a GC with a mostly-distributed design that improves overall performance and increases the performance of the collector itself by up to 3.6x over NAPS and up to 5.4x over Parallel Scavenge.
FACADE: A Compiler and Runtime for (Almost) Object-Bounded Big Data Applications
Facade is a novel compiler framework that can generate highly efficient data manipulation code by automatically transforming the data path of an existing Big Data application, leading to significantly reduced memory management cost and improved scalability.
Benchmarking cloud serving systems with YCSB
This work presents the "Yahoo! Cloud Serving Benchmark" (YCSB) framework, with the goal of facilitating performance comparisons of the new generation of cloud data serving systems, and defines a core set of benchmarks and reports results for four widely used systems.
Taurus: A Holistic Language Runtime System for Coordinating Distributed Managed-Language Applications
Taurus is a drop-in JVM replacement that requires almost no configuration and can run unmodified off-the-shelf Java applications; it enforces user-defined coordination policies and provides a DSL for writing them.
A Checkpointing-enabled and Resource-Aware Java VM for Efficient and Robust e-Science Applications in Grid Environments
This article provides a solution for Java applications with long execution times by extending a Java VM (Jikes RVM) with mechanisms for checkpointing and migration, making applications more robust and flexible.
Profile-guided proactive garbage collection for locality optimization
A new system for continuously improving program data locality at run time with low overhead; it proactively reorganizes the heap by leveraging the garbage collector and uses profile information collected through a low-overhead mechanism to guide the reorganization.
Cassandra: a decentralized structured storage system
Cassandra is a distributed storage system for managing very large amounts of structured data spread out across many commodity servers, while providing highly available service with no single point of
Ditto - Deterministic Execution Replayability-as-a-Service for Java VM on Multiprocessors
Ditto is a novel pair of recording and replaying algorithms that employ partial transitive reduction and program-order pruning on-the-fly, and take advantage of TLO static analysis, escape analysis and JVM compiler optimizations to identify thread-local accesses.