Scientists are increasingly using state-of-the-art big data analytics software (e.g., Hadoop, Giraph) for their data-intensive applications in HPC environments. However, understanding and designing the hardware environment that these data- and compute-intensive applications require for good performance is challenging. With this motivation, ...
Big data involves massive volumes of data for which traditional techniques are not effective. Apache Hadoop is currently one of the most popular software frameworks supporting big data analysis. As Hadoop clusters grow larger, building them in virtualized environments is drawing great attention. However, the performance optimization of ...
Solid state drives (SSDs) have been widely used in Hadoop clusters ever since their introduction to the big data industry. However, the current Hadoop framework is not optimized to take full advantage of SSDs. In this paper, we introduce architectural improvements in the core Hadoop components to fully exploit the performance benefits of SSDs for data- and ...