Accelerating program performance via SIMD vector units is very common in modern processors, as evidenced by the use of SSE, MMX, VSE, and VSX SIMD instructions in multimedia, scientific, and embedded applications. To take full advantage of the vector capabilities, a compiler needs to generate efficient vector code automatically. However, most commercial and…
In this paper, we present the Habanero-Java (HJ) language developed at Rice University as an extension to the original Java-based definition of the X10 language. HJ includes a powerful set of task-parallel programming constructs that can be added as simple extensions to standard Java programs to take advantage of today's multi-core and heterogeneous…
Existing dynamic race detectors suffer from at least one of the following three limitations: (i) space overhead per memory location grows linearly with the number of parallel threads [13], severely limiting the parallelism that the algorithm can handle; (ii) sequentialization: the parallel program must be processed in a sequential order,…
Type systems that prevent data races are a powerful tool for parallel programming, eliminating whole classes of bugs that are both hard to find and hard to fix. Unfortunately, it is difficult to apply previous such type systems to "real" programs, as each of them is designed around a specific synchronization primitive or parallel pattern, such as locks…
Thymic stromal lymphopoietin (TSLP) is said to increase expression of chemokines attracting Th2 T cells. We hypothesized that asthma is characterized by elevated bronchial mucosal expression of TSLP and Th2-attracting, but not Th1-attracting, chemokines as compared with controls, with selective accumulation of cells bearing receptors for these chemokines.
Modern computer systems feature multiple homogeneous or heterogeneous computing units with deep memory hierarchies, and expect a high degree of thread-level parallelism from the software. Exploitation of data locality is critical to achieving scalable parallelism, but adds a significant dimension of complexity to performance optimization of parallel…
One of the major productivity hurdles for parallel programming is non-determinism: a parallel program may yield different results on different executions with the same input, depending on the order in which operations are interleaved. A major source of non-determinism is data races, and checking for the absence of data races is an important candidate for…
Increasing the number of instructions executing in parallel has helped improve processor performance, but the technique is limited. Executing code on parallel threads and processors has fewer limitations, but most computer programs tend to be serial in nature. This paper presents a compiler optimisation that at run-time parallelises code inside a JVM and…