LEEN: Locality/Fairness-Aware Key Partitioning for MapReduce in the Cloud

Authors: Shadi Ibrahim, Hai Jin, Lu Lu, Song Wu, Beixin Julie He, Li Qi
Venue: 2010 IEEE Second International Conference on Cloud Computing Technology and Science
This paper investigates the problem of partitioning skew in MapReduce-based systems. Our studies with Hadoop, a widely used MapReduce implementation, demonstrate that partitioning skew causes a large amount of data transfer during the shuffle phase and leads to significant unfairness in reduce input across data nodes. As a result, applications suffer performance degradation from the long data transfer during the shuffle phase along with the computation skew…
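To make the notion of partitioning skew concrete, the following is a minimal sketch, not the LEEN algorithm itself. It assumes a plain hash partitioner (Hadoop's default partitioner assigns a key to a reducer by hashing it modulo the number of reduce tasks) and uses `zlib.crc32` only as a deterministic stand-in for the key hash. The function names, key names, and key frequencies are hypothetical and chosen for illustration.

```python
from collections import Counter
import zlib


def hash_partition(key, num_reducers):
    """Assign a key to a reducer by hash modulo the reducer count.

    crc32 is an assumed, deterministic stand-in for the real key hash.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reduce_input_sizes(key_counts, num_reducers):
    """Sum the number of records each reducer receives.

    All records of one key go to the same reducer, so a frequent key
    cannot be spread across reducers.
    """
    sizes = [0] * num_reducers
    for key, count in key_counts.items():
        sizes[hash_partition(key, num_reducers)] += count
    return sizes


def imbalance(sizes):
    """Ratio of the largest reduce input to the mean reduce input."""
    mean = sum(sizes) / len(sizes)
    return max(sizes) / mean if mean else 0.0


# Hypothetical skewed workload: one hot key and twenty rare keys.
key_counts = Counter({"hot": 1000})
key_counts.update({f"k{i}": 10 for i in range(20)})

sizes = reduce_input_sizes(key_counts, num_reducers=4)
print("reduce input per reducer:", sizes)
print("max/mean imbalance:", round(imbalance(sizes), 2))
```

Because every record of the hot key lands on a single reducer, that reducer's input is at least 1000 of the 1200 records regardless of how the hash distributes the remaining keys. This is the kind of unfairness in reduce input that the abstract describes; the sketch does not model data locality or shuffle traffic.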
This paper has 192 citations and has highly influenced 14 other papers.


Publications citing this paper.

On Datacenter-Network-Aware Load Balancing in MapReduce

2015 IEEE 8th International Conference on Cloud Computing • 2015

A comprehensive view of Hadoop research - A systematic literature review

J. Network and Computer Applications • 2014

Locality-Aware Reduce Task Scheduling for MapReduce

2011 IEEE Third International Conference on Cloud Computing Technology and Science • 2011

An efficient key partitioning scheme for heterogeneous MapReduce clusters

2016 18th International Conference on Advanced Communication Technology (ICACT) • 2016

LIBRA: Lightweight Data Skew Mitigation in MapReduce

IEEE Transactions on Parallel and Distributed Systems • 2015



