Learn More
The MapReduce model is becoming prominent for the large-scale data analysis in the cloud. In this paper, we present the benchmarking, evaluation and characterization of Hadoop, an open-source implementation of MapReduce. We first introduce HiBench, a new benchmark suite for Hadoop. It consists of a set of Hadoop programs, including both synthetic(More)
To improve software productivity, when constructing new software systems, programmers often reuse existing libraries or frameworks by invoking methods provided in their APIs. Those API methods, however, are often complex and not well documented. To get familiar with how those API methods are used, programmers often exploit a source code search tool to(More)
Object-oriented unit tests consist of sequences of method invocations. Behavior of an invocation depends on the state of the receiver object and method arguments at the beginning of the invocation. Existing tools for automatic generation of object-oriented test suites, such as Jtest and J Crasher for Java, typically ignore this state and thus generate(More)
Achieving high structural coverage such as branch coverage in object-oriented programs is an important and yet challenging goal due to two main challenges. First, some branches involve complex program logics and generating tests to cover them requires deep knowledge of the program structure and semantics. Second, covering some branches requires special(More)
An open source project typically maintains an open bug repository so that bug reports from all over the world can be gathered. When a new bug report is submitted to the repository, a person, called a triager, examines whether it is a duplicate of an existing bug report. If it is, the triager marks it as DUPLICATE and the bug report is removed from(More)
Programmers commonly reuse existing frameworks or libraries to reduce software development efforts. One common problem in reusing the existing frameworks or libraries is that the programmers know what type of object that they need, but do not know how to get that object with a specific method sequence. To help programmers to address this issue, we have(More)
XACML has become the de facto standard for specifying access control policies for various applications, especially web services. With the explosive growth of web applications deployed on the Internet, XACML policies grow rapidly in size and complexity, which leads to longer request processing time. This paper concerns the performance of request processing,(More)
Due to the expensiveness of compiling and executing a large number of mutants, it is usually necessary to select a subset of mutants to substitute the whole set of generated mutants in mutation testing and analysis. Most existing research on mutant selection focused on operator-based mutant selection, i.e., determining a set of sufficient mutation operators(More)
The genetics of complex disease produce alterations in the molecular interactions of cellular pathways whose collective effect may become clear through the organized structure of molecular networks. To characterize molecular systems associated with late-onset Alzheimer's disease (LOAD), we constructed gene-regulatory networks in 1,647 postmortem brain(More)