MapReduce scheduling in hybrid cloud with multi-level privacy

Abstract

MapReduce is a popular programming model for analyzing large datasets. It allows large amounts of data to be processed in parallel on commodity computers or cloud computing resources. Hadoop is an open-source implementation of MapReduce and uses the Capacity or Fair scheduler to assign <i>map</i> and <i>reduce</i> tasks to the computing nodes. Both schedulers aim to share resources fairly among users, but neither considers privacy constraints, an important concern when using cloud resources because different clouds generally carry different risks of exposure. Jobs are currently sent to cloud resources without regard to the differing privacy implementations of the clouds involved. This paper proposes a multi-level privacy scheduler with data clearance levels and cloud authorization levels, so that the mapping of <i>map</i> and <i>reduce</i> tasks to cloud resources can satisfy the organization's privacy constraints.
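The idea of matching data clearance levels against cloud authorization levels can be illustrated with a minimal Python sketch. It is not the scheduler described in the paper: the names `Cloud`, `Task`, `eligible_clouds`, and `assign` are hypothetical, the integer levels and the rule that a cloud may run a task only when its authorization level is at least the task's clearance level are assumptions made for illustration, and the least-loaded tie-break merely stands in for whatever fairness policy a real scheduler would apply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cloud:
    name: str
    authorization_level: int  # assumption: higher means more trusted


@dataclass(frozen=True)
class Task:
    task_id: str
    kind: str  # "map" or "reduce"
    clearance_level: int  # assumption: higher means more sensitive data


def eligible_clouds(task, clouds):
    """Return the clouds allowed to run the task under the assumed rule."""
    return [c for c in clouds if c.authorization_level >= task.clearance_level]


def assign(tasks, clouds):
    """Place each task on the least-loaded eligible cloud.

    Tasks with no eligible cloud are mapped to None instead of being
    sent to a cloud that would violate the privacy constraint.
    """
    load = {c.name: 0 for c in clouds}
    placement = {}
    for task in tasks:
        candidates = eligible_clouds(task, clouds)
        if not candidates:
            placement[task.task_id] = None
            continue
        chosen = min(candidates, key=lambda c: load[c.name])
        load[chosen.name] += 1
        placement[task.task_id] = chosen.name
    return placement


if __name__ == "__main__":
    clouds = [Cloud("private", 3), Cloud("public", 1)]
    tasks = [
        Task("m1", "map", 3),
        Task("m2", "map", 1),
        Task("r1", "reduce", 1),
        Task("m3", "map", 4),
    ]
    print(assign(tasks, clouds))
```

In this sketch a task carrying highly sensitive data can only land on a sufficiently authorized cloud, while low-sensitivity tasks are free to spread across all clouds.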

DOI: 10.1145/2837185.2837256

4 Figures and Tables

Cite this paper

@inproceedings{Degryse2015MapReduceSI, title={MapReduce scheduling in hybrid cloud with multi-level privacy}, author={Toon Degryse and Sucha Smanchat}, booktitle={iiWAS}, year={2015} }