Map/Reduce style data-parallel computation is characterized by the extensive use of user-defined functions for data processing and relies on data-shuffling stages to prepare data partitions for parallel computation. Instead of treating user-defined functions as "black boxes", we propose to analyze those functions to turn them into "gray boxes" that…
To minimize the amount of data-shuffling I/O that occurs between the pipeline stages of a distributed data-parallel program, its procedural code must be optimized with full awareness of the pipeline that it executes in. Unfortunately, neither pipeline optimizers nor traditional compilers examine both the pipeline and procedural code of a data-parallel…
Configuration problems are not only prevalent but also severely impair the reliability of today's system software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters ("knobs"). With hundreds of knobs, configuring system software to ensure high reliability and performance…
Data races are ubiquitous in multi-threaded applications, but they are by no means easy to detect. One of the most important reasons is the complexity of thread interleavings. A large body of research has been devoted to interleaving-insensitive detection. However, all previous work focuses on uniform detection (unaware of the characteristics of…
Expressing synchronization in task parallelism remains a significant challenge because of the complicated relationships between tasks. In this paper, we propose a novel parallel programming model, namely function flow, in which synchronization is easier to express. We ease the burden of synchronization by virtue of parallel functions and functional wait.…
Computer Systems and Networks, with the goal of making computer systems and networks more dependable and manageable.