Kim M. Hazelwood

Learn More
Robust and powerful software instrumentation tools are essential for program analysis tasks such as profiling, performance evaluation, and bug detection. To meet this need, we have developed a new instrumentation system called <i>Pin</i>. Our goals are to provide <i>easy-to-use, portable, transparent</i>, and <i>efficient</i> instrumentation.(More)
General purpose GPU Computing (GPGPU) has taken off in the past few years, with great promises for increased desktop processing power due to the large number of fast computing cores on high-end graphics cards. Many publications have demonstrated phenomenal performance and have reported speedups as much as 1000&#x00D7; over code running on multi-core CPUs.(More)
Dynamic instrumentation systems have proven to be extremely valuable for program introspection, architectural simulation, and bug detection. Yet a major drawback of modern instrumentation systems is that the instrumented applications often execute several orders of magnitude slower than native application performance. In this paper, we present a novel(More)
Software instrumentation provides the means to collect information on and efficiently analyze parallel programs. Using Pin, developers can build tools to detect and examine dynamic behavior including data races, memory system behavior, and parallelizable loops. Pin is a software system that performs runtime binary instrumentation of Linux and Microsoft(More)
Dynamic binary instrumentation (DBI) is a powerful technique for analyzing the runtime behavior of software. While numerous DBI frameworks have been developed for general-purpose architectures, work on DBI frameworks for embedded architectures has been fairly limited. In this paper, we describe the design, implementation, and applications of the ARM version(More)
Dynamic binary optimizers store altered copies of original program instructions in software-managed code caches in order to maximize reuse of transformed code. Code caches store code blocks that may vary in size, reference other code blocks, and carry a high replacement overhead. These unique constraints reduce the effectiveness of conventional cache(More)
The performance of a dynamic optimization system depends heavily on the code it selects to optimize. Many current systems follow the design of HP Dynamo and select a single interprocedural path, or trace, as the unit of code optimization and code caching. Though this approach to region selection has worked well in practice, we show that it is possible to(More)
A dynamic optimizer is a runtime software system thatgroups a program's instruction sequences into traces, optimizesthose traces, stores the optimized traces in a software-basedcode cache, and then executes the optimized code inthe code cache. To maximize performance, the vast majorityof the program's execution should occur in the codecache and not in the(More)
A dynamic optimizer is a software-based system that performs code modifications at runtime, and several such systems have been proposed over the past several years. These systems typically perform optimization on the level of an instruction trace, and most use caching mechanisms to store recently optimized portions of code. Since the dynamic optimizers(More)