On scheduling in map-reduce and flow-shops


The map-reduce paradigm is now standard in industry and academia for processing large-scale data. In this work, we formalize job scheduling in map-reduce as a novel generalization of the classical two-stage flexible flow shop (FFS) problem: instead of a single task at each stage, a job now consists of a set of tasks per stage. For this generalization, we consider the problem of minimizing the total flowtime and give an efficient 12-approximation in the offline setting and an online (1+ε)-speed O(1/ε^2)-competitive algorithm. Motivated by map-reduce, we revisit the two-stage flow shop problem, where we give a dynamic program for minimizing the total flowtime when all jobs arrive at the same time. If there is a fixed number of job types, the dynamic program yields a PTAS; it is also a QPTAS when the processing times of jobs are polynomially bounded. This gives the first improvement in approximation of flowtime for the two-stage flow shop problem since the trivial 2-approximation algorithm of Gonzalez and Sahni [29] in 1978, and the first known approximation for the FFS problem. We then consider the generalization of the two-stage FFS problem to the unrelated machines case, where we give an offline 6-approximation and an online (1+ε)-speed O(1/ε^4)-competitive algorithm.
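To make the objective concrete, the following is a minimal sketch of how total flowtime can be computed in a two-stage flow shop when all jobs arrive at time 0, so that each job's flowtime equals its completion time on the second machine. It is not the dynamic program or any approximation algorithm from the paper. The function names `total_flowtime` and `best_permutation` are hypothetical, the sketch assumes one machine per stage and that both machines process jobs in the same order, and the brute-force search is only practical for a handful of jobs.

```python
from itertools import permutations


def total_flowtime(jobs, order):
    """Sum of completion times for a two-stage flow shop.

    jobs  : list of (a, b) pairs, the stage-1 and stage-2 processing times
    order : sequence of job indices, used on both machines (assumption)
    All jobs are assumed to be released at time 0.
    """
    t1 = 0     # time at which the stage-1 machine becomes free
    t2 = 0     # time at which the stage-2 machine becomes free
    total = 0  # running sum of completion times
    for j in order:
        a, b = jobs[j]
        t1 += a                # stage 1 runs jobs back to back
        t2 = max(t1, t2) + b   # stage 2 waits for stage 1 and for itself
        total += t2
    return total


def best_permutation(jobs):
    """Exhaustive search over job orders (illustration only, O(n!))."""
    return min(permutations(range(len(jobs))),
               key=lambda order: total_flowtime(jobs, order))


if __name__ == "__main__":
    example_jobs = [(3, 2), (1, 4), (2, 1)]  # made-up processing times
    order = best_permutation(example_jobs)
    print(order, total_flowtime(example_jobs, order))
```

The exhaustive search only serves as a baseline against which a heuristic or approximation could be checked on tiny instances; the running time grows factorially with the number of jobs.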

DOI: 10.1145/1989493.1989540

Cite this paper

@inproceedings{Moseley2011OnSI,
  title={On scheduling in map-reduce and flow-shops},
  author={Benjamin Moseley and Anirban Dasgupta and Ravi Kumar and Tam{\'a}s Sarl{\'o}s},
  booktitle={SPAA},
  year={2011}
}