Map/Reduce style data-parallel computation is characterized by the extensive use of user-defined functions for data processing and relies on data-shuffling stages to prepare data partitions for parallel computation. Instead of treating user-defined functions as "black boxes", we propose to analyze those functions to turn them into "gray boxes" that expose…
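The abstract above refers to user-defined functions and data-shuffling stages in a Map/Reduce pipeline. As background, a minimal single-machine sketch of such a pipeline (a generic word count, not the paper's analysis technique) makes the roles concrete: the map and reduce functions are the opaque "black boxes", and the shuffle step between them is where data-shuffling I/O occurs.

```python
from collections import defaultdict

# User-defined map function: treated by the framework as a "black box".
def map_udf(line):
    for word in line.split():
        yield (word, 1)

# Shuffle stage: group intermediate pairs by key so each reducer
# sees every value for its keys -- the I/O-heavy step between stages.
def shuffle(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

# User-defined reduce function.
def reduce_udf(key, values):
    return (key, sum(values))

lines = ["a b a", "b c"]
intermediate = [pair for line in lines for pair in map_udf(line)]
result = dict(reduce_udf(k, v) for k, v in shuffle(intermediate).items())
# result == {"a": 2, "b": 2, "c": 1}
```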
Configuration problems are not only prevalent, but also severely impair the reliability of today's system software. One fundamental reason is the ever-increasing complexity of configuration, reflected by the large number of configuration parameters ("knobs"). With hundreds of knobs, configuring system software to ensure high reliability and performance…
BACKGROUND The presence of intracellular organisms (ICOs) in polymorphonuclear leukocytes obtained from bronchoalveolar lavage fluid (BALF) is a possible method for rapid diagnosis of ventilator-associated pneumonia (VAP). However, the validity of this diagnostic method remains controversial and the diagnostic thresholds reported by investigators were…
To minimize the amount of data-shuffling I/O that occurs between the pipeline stages of a distributed data-parallel program, its procedural code must be optimized with full awareness of the pipeline that it executes in. Unfortunately, neither pipeline optimizers nor traditional compilers examine both the pipeline and procedural code of a data-parallel…
Transactional memory (TM) is a parallel programming concept that reduces the challenges of parallel programming. Existing distributed transactional memory systems consume too much bandwidth and incur high latency. In this work, we present Transactional Memory System for Cluster (Clustm), a generalized and scalable distributed transactional memory system. Our…
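For readers unfamiliar with the TM concept the abstract builds on, the following is an illustrative single-machine sketch of optimistic transactional memory (buffered writes, version validation at commit). It is background only, under assumed names like `TVar` and `Transaction`; it is not Clustm or its protocol.

```python
import threading

_commit_lock = threading.Lock()

class TVar:
    """A transactional variable with a version counter."""
    def __init__(self, value):
        self.value = value
        self.version = 0

class Transaction:
    """Optimistic transaction: buffer writes, validate read versions at commit."""
    def __init__(self):
        self.reads = {}   # TVar -> version observed at read time
        self.writes = {}  # TVar -> buffered new value

    def read(self, tvar):
        if tvar in self.writes:          # read-your-own-writes
            return self.writes[tvar]
        self.reads[tvar] = tvar.version
        return tvar.value

    def write(self, tvar, value):
        self.writes[tvar] = value

    def commit(self):
        with _commit_lock:
            # Validation: abort if any variable we read has since changed.
            if any(tvar.version != v for tvar, v in self.reads.items()):
                return False
            for tvar, value in self.writes.items():
                tvar.value = value
                tvar.version += 1
            return True

# Atomically transfer 30 units between two accounts.
a, b = TVar(100), TVar(0)
tx = Transaction()
tx.write(a, tx.read(a) - 30)
tx.write(b, tx.read(b) + 30)
assert tx.commit()
# a.value == 70, b.value == 30
```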
Expressing synchronization in task parallelism remains a significant challenge because of the complicated relationships between tasks. In this paper, we propose a novel parallel programming model, namely function flow, where synchronization is easier to express. We relieve the burden of synchronization by virtue of parallel functions and functional wait.
Data races are ubiquitous in multi-threaded applications, but they are by no means easy to detect. One of the most important reasons is the complexity of thread interleavings. A large body of research has been devoted to interleaving-insensitive detection. However, all previous work focuses on uniform detection (agnostic to the characteristics of…
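To illustrate what interleaving-insensitive detection means in the abstract above, here is a minimal Eraser-style lockset sketch: a shared variable is considered race-free only if some lock is held on every access, regardless of the actual interleaving observed. This is generic background, not the detector the paper proposes.

```python
# Eraser-style lockset check, interleaving-insensitive: a variable is
# race-free if the intersection of locks held across all its accesses
# stays non-empty.  Illustrative sketch only.
candidate_locksets = {}  # variable name -> set of locks held on every access so far

def on_access(var, locks_held):
    """Record an access to `var` under `locks_held`; False means potential race."""
    if var not in candidate_locksets:
        candidate_locksets[var] = set(locks_held)
    else:
        candidate_locksets[var] &= set(locks_held)
    return len(candidate_locksets[var]) > 0

assert on_access("x", {"L1", "L2"})   # first access: candidate set {L1, L2}
assert on_access("x", {"L1"})         # still consistently protected by L1
assert not on_access("x", set())      # no common lock remains -> report race
```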
The largest difference between a distributed and a non-distributed system is that the former introduces network messages. Network messages give a distributed system its scalability, but also add complexity. Testing large-scale distributed systems is a great challenge, because some errors happen only after a distributed sequence of events…
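The "distributed sequence of events" mentioned above is conventionally ordered with Lamport logical clocks, which timestamp events across processes connected only by messages. A minimal sketch of that standard mechanism (background, not the paper's testing technique):

```python
# Lamport logical clocks: each process increments its clock on every
# event, and a receive advances the clock past the message's timestamp,
# so every send is ordered before its matching receive.
class Process:
    def __init__(self):
        self.clock = 0

    def local_event(self):
        self.clock += 1
        return self.clock

    def send(self):
        self.clock += 1
        return self.clock          # timestamp carried on the message

    def receive(self, msg_ts):
        self.clock = max(self.clock, msg_ts) + 1
        return self.clock

p, q = Process(), Process()
p.local_event()                    # p's clock: 1
ts = p.send()                      # p's clock: 2; message timestamped 2
q_ts = q.receive(ts)               # q's clock: max(0, 2) + 1 == 3
assert q_ts > ts                   # receive is ordered after the send
```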
Transactional memory (TM) is a parallel programming concept. Existing consistency protocols in distributed transactional memory systems consume too much bandwidth and bring high latency. In this paper, we propose our Transaction Memory Consistency Protocol (TMCP), and point out its new features compared to current protocols. After formulating our model and…