This paper studies the problem of building multiclass classifiers for tissue classification based on gene expression. The recent development of microarray technologies has enabled biologists to quantify gene expression of tens of thousands of genes in a single experiment. Biologists have begun collecting gene expression for a large number of samples. One of …
Many sequential applications are difficult to parallelize because of unpredictable control flow, indirect data access, and input-dependent parallelism. These difficulties led us to build a software system for behavior-oriented parallelization (BOP), which allows a program to be parallelized based on partial information about program behavior, for example, a …
Most applications' performance is affected by the amount of available memory. In a traditional application, which has a fixed working set size, increasing memory has a beneficial effect up until the application's working set is met. In the presence of garbage collection this relationship becomes more complex. While increasing the size of the program's heap …
Fast Track is a software speculation system that enables unsafe optimization of sequential code. It speculatively runs optimized code to improve performance and then checks the correctness of the speculative code by running the original program on multiple processors. We present the interface design and system implementation for Fast Track. It lets a …
In classic pattern recognition problems, classes are mutually exclusive by definition. However, in many applications, it is quite natural for some instances to belong to multiple classes at the same time. In other words, these applications are multi-labeled: classes overlap by definition, and each instance may be associated with multiple classes. In this …
Software transactional memory systems enable a programmer to easily write concurrent data structures such as lists, trees, hashtables, and graphs, where non-conflicting operations proceed in parallel. Many of these structures take the abstract form of a dictionary, in which each transaction is associated with a search key. By regrouping transactions based …
In POPL 2002, Petrank and Rawitz showed a universal result: finding optimal data placement is not only NP-hard but also impossible to approximate within a constant factor if P ≠ NP. Here we study a recently published concept called reference affinity, which characterizes a group of data that are always accessed together in …
As the amount of on-chip cache grows with Moore's law, cache utilization becomes increasingly important as the number of processor cores multiplies and the contention for memory bandwidth becomes more severe. Optimal cache management requires knowing the future access sequence and being able to communicate this information to hardware. The paper …
In the past, program monitoring often operated at the code level, performing checks at function and loop boundaries. Recent research shows that profiling analysis can identify high-level phases in complex binary code. Examples are time steps in scientific simulations and service cycles in utility programs. Because of their larger size and more predictable …