Redundant Loads: A Software Inefficiency Indicator
@article{Su2019RedundantLA, title={Redundant Loads: A Software Inefficiency Indicator}, author={Pengfei Su and Shasha Wen and Hailong Yang and Milind Chabbi and X. Liu}, journal={2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE)}, year={2019}, pages={982-993} }
Modern software packages have become increasingly complex with millions of lines of code and references to many external libraries. Redundant operations are a common performance limiter in these code bases. Missed compiler optimization opportunities, inappropriate data structure and algorithm choices, and developers' inattention to performance are some common reasons for the existence of redundant operations. Developers mainly depend on compilers to eliminate redundant operations. However…
15 Citations
What every scientific programmer should know about compiler optimizations?
- Computer Science
- ICS
- 2020
This paper investigates an important compiler optimization, dead and redundant operation elimination, and shows that modern compilers miss several optimization opportunities and even introduce some inefficiencies, which require programmers to refactor the source code.
Pinpointing performance inefficiencies in Java
- Computer Science
- ESEC/SIGSOFT FSE
- 2019
This paper presents JXPerf, a lightweight performance analysis tool for pinpointing wasteful memory operations in Java programs. Guided by JXPerf, several Java applications are optimized by improving code generation and choosing superior data structures and algorithms, yielding significant speedups.
ZeroSpy: Exploring Software Inefficiency with Redundant Zeros
- Computer Science
- SC20: International Conference for High Performance Computing, Networking, Storage and Analysis
- 2020
This paper proposes ZeroSpy, a fine-grained profiler that identifies redundant zeros caused by both inappropriate use of data structures and useless computation, and provides intuitive optimization guidance by revealing the locations where the redundant zeros occur in source lines and calling contexts.
BinGo: Pinpointing Concurrency Bugs in Go via Binary Analysis
- Computer Science
- ArXiv
- 2022
BinGo is the first tool to identify concurrency bugs in Go applications via dynamic binary analysis. It is an end-to-end tool that is ready for deployment in production environments and requires no modification to source code, compilers, or runtimes in the Go ecosystem.
Analyzing memory accesses with modern processors
- Computer Science
- DaMoN
- 2020
This work leverages a mechanism available in modern processors to collect memory traces via hardware-based sampling and illustrates how memory traces uncover new insights into the memory access characteristics of database systems.
DRCCTPROF: A Fine-Grained Call Path Profiler for ARM-Based Clusters
- Computer Science
- SC20: International Conference for High Performance Computing, Networking, Storage and Analysis
- 2020
The unique ability of DRCCTPROF is to obtain full calling context at any and every machine instruction that executes, which provides more detailed diagnostic feedback for performance optimization and correctness tools.
Toward efficient interactions between Python and native libraries
- Computer Science
- ESEC/SIGSOFT FSE
- 2021
PieProf, a lightweight profiler, pinpoints interaction inefficiencies in Python applications and associates them with high-level Python code to provide a holistic view; it guides the optimization of 17 real-world applications.
CP-Detector: Using Configuration-related Performance Properties to Expose Performance Bugs
- Computer Science
- 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)
- 2020
This paper argues that the performance expectation of a configuration can serve as a strong oracle candidate for performance bug detection, and it designs and evaluates an automated performance testing framework, CP-DETECTOR, for detecting real-world configuration-related performance bugs.
GRAPHSPY: Fused Program Semantic-Level Embedding via Graph Neural Networks for Dead Store Detection
- Computer Science
- ArXiv
- 2020
This work presents a novel, hybrid program embedding approach for identifying unnecessary memory operations through the embedding, which achieves 90% accuracy and incurs only around half the time overhead of the state-of-the-art tool.
Can we trust profiling results?: understanding and fixing the inaccuracy in modern profilers
- Computer Science
- ICS
- 2019
This paper studies performance monitoring units (PMU) based statistical sampling, one of the profiling techniques widely adopted by many state-of-the-art profilers, and proposes a novel 3-step approach to understand and fix the instruction profiling inaccuracy.
References
Runtime Value Numbering: A Profiling Technique to Pinpoint Redundant Computations
- Computer Science
- 2015 International Conference on Parallel Architecture and Compilation (PACT)
- 2015
Redundant computations can severely degrade performance in HPC applications. Redundant computations arise due to various causes such as developers' inattention to performance, inappropriate choice of…
REDSPY: Exploring Value Locality in Software
- Computer Science
- ASPLOS 2017
- 2017
REDSPY pinpointed a dramatically high volume of redundancies in programs that were optimization targets for decades, such as the SPEC CPU2006 suite, the Rodinia benchmark, and NWChem (a production computational chemistry code), and was able to eliminate redundancies, resulting in significant speedups.
DeadSpy: a tool to pinpoint program inefficiencies
- Computer Science
- CGO '12
- 2012
This paper describes DeadSpy, a tool that dynamically detects every dead write to memory in a given execution and provides actionable feedback to the programmer, offering a methodical way to identify dead writes, a common symptom of performance inefficiencies.
Barrier elision for production parallel programs
- Computer Science
- PPoPP 2015
- 2015
This paper presents context-sensitive dynamic optimizations that elide barriers that are redundant during program execution, demonstrating the value of holistic context-sensitive analyses that consider the domain science in conjunction with the associated runtime software stack.
Pinpointing and Exploiting Opportunities for Enhancing Data Reuse
- Computer Science
- ISPASS 2008 - IEEE International Symposium on Performance Analysis of Systems and software
- 2008
An approach that uses memory reuse distance to identify an application's most significant memory access patterns causing cache misses and provide insight into ways of improving data reuse is described.
Performance problems you can fix: a dynamic analysis of memoization opportunities
- Computer Science
- OOPSLA 2015
- 2015
This paper presents MemoizeIt, a dynamic analysis that identifies methods that repeatedly perform the same computation. Applying memoization to these methods leads to statistically significant speedups by factors between 1.04x and 12.93x.
Pin: building customized program analysis tools with dynamic instrumentation
- Computer Science
- PLDI '05
- 2005
Pin's goals are to provide easy-to-use, portable, transparent, and efficient instrumentation. To illustrate Pin's versatility, two Pintools in daily use to analyze production software are described.
Performance Diagnosis for Inefficient Loops
- Computer Science
- 2017 IEEE/ACM 39th International Conference on Software Engineering (ICSE)
- 2017
This paper presents LDoctor, a static-dynamic hybrid analysis tool that can provide better coverage and accuracy than existing techniques with low overhead, and that uses sampling techniques to lower the run-time overhead without degrading the accuracy or latency of LDoctor diagnosis.
Continuous profiling: where have all the cycles gone?
- Computer Science
- TOCS
- 1997
The Digital Continuous Profiling Infrastructure is a sampling-based profiling system designed to run continuously on production systems. It supports multiprocessors, works on unmodified executables, and collects profiles for entire systems, including user programs, shared libraries, and the operating system kernel.