Chengliang Zhang

Learn More
This paper studies the problem of building multiclass classifiers for tissue classification based on gene expression. The recent development of microarray technologies has enabled biologists to quantify gene expression of tens of thousands of genes in a single experiment. Biologists have begun collecting gene expression for a large number of samples. One of(More)
Many sequential applications are difficult to parallelize because of unpredictable control flow, indirect data access, and input-dependent parallelism. These difficulties led us to build a software system for behavior oriented parallelization (BOP), which allows a program to be parallelized based on partial information about program behavior, for example, a(More)
Fast track is a software speculation system that enables unsafe optimization of sequential code. It speculatively runs optimized code to improve performance and then checks the correctness of the speculative code by running the original program on multiple processors. We present the interface design and system implementation for Fast Track. It lets a(More)
In POPL 2002, Petrank and Rawitz showed a universal result---finding optimal data placement is not only NP-hard but also impossible to approximate within a constant factor if <i>P</i> &#8800; <i>NP</i>. Here we study a recently published concept called <i>reference affinity</i>, which characterizes a group of data that are always accessed together in(More)
Most application's performance is impacted by the amount of available memory. In a traditional application, which has a fixed working set size, increasing memory has a beneficial effect up until the application's working set is met. In the presence of garbage collection this relationship becomes more complex. While increasing the size of the program's heap(More)
Good data layouts improve cache and TLB performance of object-oriented software, but unfortunately, selecting an optimal data layout a priori is NP-hard. This paper introduces layout auditing, a technique that selects the best among a set of layouts online (while the program is running). Layout auditing randomly applies different layouts over time and(More)
Good spatial locality alleviates both the latency and bandwidth problem of memory by boosting the effect of prefetching and improving the utilization of cache. However, conventional definitions of spatial locality are inadequate for a programmer to precisely quantify the quality of a program, to identify causes of poor locality, and to estimate the(More)
Utility programs, which perform similar and largely independent operations on a sequence of inputs, include such common applications as compilers, interpreters, and document parsers; databases; and compression and encoding tools. The repetitive behavior of these programs, while often clear to users, has been difficult to capture automatically. We present an(More)
Software transactional memory systems enable a programmer to easily write concurrent data structures such as lists, trees, hashtables, and graphs, where non-conflicting operations proceed in parallel. Many of these structures take the abstract form of a dictionary, in which each transaction is associated with a search key. By regrouping transactions based(More)
In classic pattern recognition problems, classes are mutually exclusive by definition. However, in many applications, it is quite natural that some instances belong to multiple classes at the same time. In other words, these applications are multi-labeled, classes are overlapped by definition and each instance may be associated to multiple classes. In this(More)