Multi-Query Optimization in MapReduce Framework


MapReduce has recently emerged as a new paradigm for large-scale data analysis due to its high scalability, fine-grained fault tolerance, and easy programming model. Since different jobs often share similar work (e.g., several jobs scan the same input file or produce the same map output), there are many opportunities to optimize the performance for a batch of…


