Matthew P. LeGendre

This paper describes PLTO, a link-time instrumentation and optimization tool we have developed for the Intel IA-32 architecture. A number of characteristics of this architecture complicate the task of link-time optimization. These include a large number of op-codes and addressing modes, which increases the complexity of program analysis; variable-length …
Petascale systems will present several new challenges to performance and correctness tools. Such machines may contain millions of cores, requiring that tools use scalable data structures and analysis algorithms to collect and to process application data. In addition, at such scales, each tool itself will become a large parallel application - already, …
Large HPC centers spend considerable time supporting software for thousands of users, but the complexity of HPC software is quickly outpacing the capabilities of existing software management tools. Scientific applications require specific versions of compilers, MPI, and other dependency libraries, so using a single, standard software stack is infeasible. …
As the sizes of high-end computing systems continue to grow to massive scales, efficient bootstrapping for distributed software infrastructures is becoming a greater challenge. Distributed software infrastructure bootstrapping is the procedure of instantiating all processes of the distributed system on the appropriate hardware nodes and disseminating to …
As scientific computation continues to scale, efficient use of floating-point arithmetic processors is critical. Lower precision allows streaming architectures to perform more operations per second and can reduce memory bandwidth pressure on all architectures. However, using a precision that is too low for a given algorithm and data set leads to inaccurate …
Dynamic linking has many advantages for managing large code bases, but dynamically linked applications have not typically scaled well on high performance computing systems. Splitting a monolithic executable into many dynamic shared object (DSO) files decreases compile time for large codes, reduces runtime memory requirements by allowing modules to be loaded …
Many performance engineering tasks, from long-term performance monitoring to post-mortem analysis and online tuning, require efficient runtime methods for introspection and performance data collection. To understand interactions between components in increasingly modular HPC software, performance introspection hooks must be integrated into runtime systems, …
Large-scale systems typically mount many different file systems with distinct performance characteristics and capacity. Applications must efficiently use this storage in order to realize their full performance potential. Users must take into account potential file replication throughout the storage hierarchy as well as contention in lower levels of the I/O …
STUDY OBJECTIVE To compare two different methods of postoperative analgesia after extensive spinal fusion. DESIGN Double-blind, randomized study. SETTING University-affiliated hospital. PATIENTS Twenty-four adult patients undergoing scoliosis correction. INTERVENTIONS Before the end of surgery, patients received either intravenous clonidine 0.3 …