Efficient path profiling

@article{Ball1996EfficientPP,
  title={Efficient path profiling},
  author={Thomas Ball and James R. Larus},
  journal={Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29},
  year={1996},
  pages={46--57}
}
  • T. Ball, J. Larus
  • Published 2 December 1996
  • Computer Science
  • Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture. MICRO 29
A path profile determines how many times each acyclic path in a routine executes. Instrumented programs run with overhead comparable to the best previous profiling techniques. On the SPEC95 benchmarks, path profiling overhead averaged 31%, as compared to 16% for efficient edge profiling. Path profiling also identifies longer paths than a previous technique, which predicted paths from edge profiles (average of 88, versus 34 instructions). Moreover, profiling shows that the SPEC95 train input…
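To make the idea of counting acyclic paths concrete, here is a minimal Python sketch of the edge-numbering scheme usually associated with this paper: each edge of an acyclic control-flow graph gets an integer increment so that the increments along every entry-to-exit path sum to a distinct id, and one counter per id then records how often that path ran. The abstract above does not spell out the algorithm, so the graph `CFG` and the helper names `number_paths` and `path_id` are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter

# Hypothetical acyclic control-flow graph: node -> list of successors.
CFG = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["D"],
    "D": ["E", "F"],
    "E": ["F"],
    "F": [],
}


def number_paths(dag, entry):
    """Return (num_paths, edge_val) for a DAG.

    num_paths[v] is the number of paths from v to an exit node.
    edge_val[(v, w)] is the increment added when edge v -> w is taken.
    """
    num_paths = {}
    edge_val = {}

    def visit(v):
        if v in num_paths:          # already processed (memoized)
            return num_paths[v]
        successors = dag.get(v, [])
        if not successors:          # exit node: exactly one (empty) path
            num_paths[v] = 1
            return 1
        total = 0
        for w in successors:
            # Paths through earlier successors occupy ids [0, total).
            edge_val[(v, w)] = total
            total += visit(w)
        num_paths[v] = total
        return total

    visit(entry)
    return num_paths, edge_val


def path_id(path, edge_val):
    """Sum the edge increments along a path given as a list of nodes."""
    return sum(edge_val[(a, b)] for a, b in zip(path, path[1:]))


num_paths, edge_val = number_paths(CFG, "A")

# Simulated executions (assumed data): each run is one entry-to-exit path.
runs = [list("ABCDEF"), list("ACDF"), list("ABCDEF")]
profile = Counter(path_id(p, edge_val) for p in runs)
print(num_paths["A"], dict(profile))  # prints: 6 {0: 2, 5: 1}
```

In a real instrumented program the sum would be kept in a register and updated on the fly at each numbered edge, with the counter bumped at the routine's exit. The sketch only replays recorded paths to show that the ids are distinct and compact (0 through 5 for the six paths here).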

Profiling All Paths

A new profiling technique called PAP (profiling all paths), which can profile finite-length paths inside a procedure, is presented, along with a discussion of how to use PAP to profile executed sequences at the method level.

Practical path profiling for dynamic optimizers

The branch-flow metric is introduced to measure path flow as a function of branch decisions, rather than weighting all paths equally as in prior work, to make PPP appealing for use by dynamic compilers.

Representing, Detecting, and Profiling Paths in Hardware

A low-overhead, programmable hardware path profiling scheme that can be configured to detect a variety of paths including acyclic, intraprocedural paths, extended paths and sub-paths for the Whole Program Path and track one of the many architectural metrics along paths.

Targeted path profiling: lower overhead path profiling for staged dynamic optimization systems

The results suggest that on average the overhead of profile collection can be reduced by half (SPEC95) to almost two-thirds (SPEC2000) relative to the Ball-Larus algorithm with minimal impact on the information collected.

Profiling all paths: A new profiling technique for both cyclic and acyclic paths

A programmable hardware path profiler

A low-overhead, non-intrusive hardware path profiling scheme that can be programmed to detect several types of paths including acyclic, intra-procedural paths, paths for a whole program path and extended paths, enabling context-sensitive performance monitoring and bottleneck analysis.

An efficient online path profiling framework for Java just-in-time compilers

An efficient online path profiling technique, called structural path profiling (SPP), suitable for JIT compilers, to partition the target method into a hierarchy of the nested graphs based on the loop structure, and then to profile each graph independently.

Adaptive Path Profiling Using Arithmetic Coding

  • Gonglong Chen, Wei Dong
  • Computer Science
    2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS)
  • 2015
AdapTracer is a path profiling approach based on arithmetic coding that, compared to PAP, reduces the trace size by 44% on average and incurs at most 10% execution overhead; it is adaptive in that it explicitly considers the execution frequency of each edge.

Continuous path and edge profiling

PEP is presented, a hybrid instrumentation and sampling approach for continuous path and edge profiling that is efficient, accurate, and portable, and reduces overhead by using profiling to guide instrumentation placement.

Extending path profiling across loop backedges and procedure boundaries

This work extends Ball-Larus paths to create slightly longer overlapping paths and develops an instrumentation algorithm to collect their frequencies, which enable very precise estimation of the frequencies of potentially much longer paths of interest.
...

References

Optimally profiling and tracing programs

Algorithms for inserting monitoring code to profile and trace programs that greatly reduce the cost of measuring programs and reduce the file size and overhead of an already highly optimized tracing system are presented.

Branch prediction for free

This work presents a program-based branch predictor that performs well for a large and diverse set of programs written in C and Fortran and focuses on heuristics for predicting non-loop branches, which dominate the dynamic branch count of many programs.

Trace Selection For Compiling Large C Application Programs To Microcode

  • P. Chang, W. Hwu
  • Computer Science
    [1988] Proceedings of the 21st Annual Workshop on Microprogramming and Microarchitecture - MICRO '21
  • 1988
This work reports the distribution of control transfers categorized according to their potential impact on the microcode optimizations using the IMPACT C compiler, which contains integrated profiling and analysis tools.

Improving the accuracy of static branch prediction using branch correlation

A profile-based code transformation that exploits branch correlation to improve the accuracy of static branch prediction schemes and encodes branch history information in the program counter through the duplication and placement of program basic blocks.

Efficiently counting program events with support for on-line queries

This paper presents an instrumentation method for efficiently counting events in a program's execution, with support for on-line queries of the event count, and guarantees that accurate event counts can be obtained efficiently at every point in the execution.

Predicting conditional branch directions from previous runs of a program

It is suggested that even code with a complex flow of control, including systems utilities and language processors written in C, is dominated by branches that go one way, and that this direction usually varies little when one changes the data used as the predictor and target.

Bulldog: A Compiler for VLIW Architectures

The Bulldog compiler described here uses several new compilation techniques: trace scheduling to find more parallelism, memory-reference and memory-bank disambiguation to increase memory bandwidth, and new code-generation algorithms.

Trace Scheduling: A Technique for Global Microcode Compaction

  • J. A. Fisher
  • Computer Science
    IEEE Transactions on Computers
  • 1981
Compilation of high-level microcode languages into efficient horizontal microcode and good hand coding probably both require effective global compaction techniques.

EEL: machine-independent executable editing

EEL supports a machine- and system-independent editing model that enables tool builders to modify an executable without being aware of the details of the underlying architecture or operating system or being concerned with the consequences of deleting instructions or adding foreign code.

Larus and Eric Schnarr. EEL: Machine-independent executable editing

  • 1995