MapReduce Online

Authors: Tyson Condie, Neil Conway, Peter Alvaro, Joseph M. Hellerstein, Khaled Elmeleegy, Russell Sears
MapReduce is a popular framework for data-intensive distributed computing of batch jobs. To simplify fault tolerance, the output of each MapReduce task and job is materialized to disk before it is consumed. In this paper, we propose a modified MapReduce architecture that allows data to be pipelined between operators. This extends the MapReduce programming model beyond batch processing, and can reduce completion times and improve system utilization for batch jobs as well. We present a modified…
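The difference between materializing and pipelining operator output can be illustrated with a small single-process sketch. It is not the paper's implementation: the function names (`map_words`, `reduce_materialized`, `reduce_pipelined`), the word-count workload, and the `every` parameter are all assumptions made for illustration, and an in-memory list stands in for the on-disk materialization the abstract describes.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

Pair = Tuple[str, int]


def map_words(lines: Iterable[str]) -> Iterator[Pair]:
    """Map step: emit (word, 1) for every word, one record at a time."""
    for line in lines:
        for word in line.split():
            yield word, 1


def reduce_materialized(pairs: Iterable[Pair]) -> Dict[str, int]:
    """Batch style: buffer the entire map output before reducing.

    The list is a stand-in (assumption) for writing map output to disk.
    """
    buffered = list(pairs)  # nothing is reduced until the map side finishes
    counts: Counter = Counter()
    for word, n in buffered:
        counts[word] += n
    return dict(counts)


def reduce_pipelined(pairs: Iterable[Pair], every: int = 2) -> Iterator[Dict[str, int]]:
    """Pipelined style: consume records as they arrive.

    Yields a copy of the running counts after every `every` records
    (a hypothetical knob), then yields the final counts.
    """
    counts: Counter = Counter()
    seen = 0
    for word, n in pairs:
        counts[word] += n
        seen += 1
        if seen % every == 0:
            yield dict(counts)  # partial result, available before the input ends
    yield dict(counts)  # final result


if __name__ == "__main__":
    lines = ["a b a", "b a"]
    print(reduce_materialized(map_words(lines)))
    for partial in reduce_pipelined(map_words(lines), every=2):
        print(partial)
```

Both reducers arrive at the same final counts. The pipelined version simply exposes intermediate state while map output is still being produced, which is one way to picture how pipelining could take the model beyond pure batch processing.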