Towards the Use of the Readily Available Tests from the Release Pipeline as Performance Tests. Are We There Yet?

@inproceedings{Ding2020TowardsTU,
  title={Towards the Use of the Readily Available Tests from the Release Pipeline as Performance Tests. Are We There Yet?},
  author={Zishuo Ding and Jinfu Chen and Weiyi Shang},
  booktitle={2020 IEEE/ACM 42nd International Conference on Software Engineering (ICSE)},
  year={2020},
  pages={1435-1446}
}
Performance is an important aspect of software quality. Performance issues exist widely in software systems, and fixing them is an essential step in the release cycle. Although performance testing is widely adopted in practice, it remains expensive and time-consuming. In particular, performance testing is usually conducted in a dedicated testing environment after the system is built. The challenges of performance testing make it…
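As a rough illustration of the paper's premise (repurposing readily available functional tests as performance probes), the Java sketch below times repeated executions of an existing test body. The class, method, and constants are hypothetical, not taken from the paper.

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: wrap an existing functional test body in a timing
// loop so its execution time can be tracked across releases.
public class TestAsPerformanceProbe {

    // Placeholder for an existing test from the release pipeline.
    static void existingFunctionalTest() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 10_000; i++) sb.append(i);
    }

    public static void main(String[] args) {
        int warmup = 20, measured = 100;
        List<Long> samplesNs = new ArrayList<>();

        for (int i = 0; i < warmup; i++) existingFunctionalTest(); // warm up the JIT

        for (int i = 0; i < measured; i++) {
            long start = System.nanoTime();
            existingFunctionalTest();
            samplesNs.add(System.nanoTime() - start);
        }

        Collections.sort(samplesNs);
        long median = samplesNs.get(samplesNs.size() / 2);
        System.out.printf("median: %.3f ms%n", median / 1e6);
    }
}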

Citations

Locating Performance Regression Root Causes in the Field Operations of Web-based Systems: An Experience Report
TLDR
This work designs and adopts an approach that automatically locates the root causes of performance regressions while the software systems are deployed and running in the field; the approach has been adopted by an industrial partner and used in practice on a daily basis over a 12-month period.
CP-Detector: Using Configuration-related Performance Properties to Expose Performance Bugs
TLDR
This paper argues that the performance expectations of configurations can serve as a strong oracle for performance bug detection, and designs and evaluates an automated performance testing framework, CP-Detector, for detecting real-world configuration-related performance bugs.
Applying test case prioritization to software microbenchmarks
TLDR
This paper empirically studies coverage-based test case prioritization (TCP) techniques, employing total and additional greedy strategies, applied to software microbenchmarks along multiple parameterization dimensions, yielding 54 unique technique instantiations; the results demonstrate that the total strategy is superior to the additional strategy.
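For readers unfamiliar with the two greedy strategies, the following sketch contrasts them; the coverage sets are invented for illustration and this is not the paper's implementation.

import java.util.*;

// Illustrative sketch of total vs. additional greedy test-case
// prioritization over per-benchmark coverage sets (invented data).
public class GreedyTcp {

    // Total strategy: order by absolute coverage size, descending.
    static List<String> total(Map<String, Set<String>> cov) {
        List<String> order = new ArrayList<>(cov.keySet());
        order.sort((a, b) -> cov.get(b).size() - cov.get(a).size());
        return order;
    }

    // Additional strategy: repeatedly pick the benchmark covering the
    // most elements not yet covered by already-selected benchmarks.
    static List<String> additional(Map<String, Set<String>> cov) {
        Set<String> covered = new HashSet<>();
        Set<String> remaining = new HashSet<>(cov.keySet());
        List<String> order = new ArrayList<>();
        while (!remaining.isEmpty()) {
            String best = null;
            int bestGain = -1;
            for (String t : remaining) {
                Set<String> gain = new HashSet<>(cov.get(t));
                gain.removeAll(covered);
                if (gain.size() > bestGain) { bestGain = gain.size(); best = t; }
            }
            covered.addAll(cov.get(best));
            remaining.remove(best);
            order.add(best);
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> cov = Map.of(
            "benchA", Set.of("f1", "f2", "f3"),
            "benchB", Set.of("f1", "f2"),
            "benchC", Set.of("f4"));
        System.out.println("total:      " + total(cov));
        System.out.println("additional: " + additional(cov));
    }
}

The additional strategy reacts to what is already covered (benchC jumps ahead of benchB here), which is the usual trade-off against the cheaper total strategy.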
How Software Refactoring Impacts Execution Time
TLDR
This study mined the change history of 20 systems that define performance benchmarks in their repositories, aiming to identify commits in which developers implemented refactoring operations impacting code components exercised by those benchmarks; it shows that refactoring can significantly impact execution time.
PerfJIT: Test-Level Just-in-Time Prediction for Performance Regression Introducing Commits
TLDR
This paper proposes an approach that automatically predicts whether a test would manifest a performance regression given a code commit, which can drastically reduce the testing time needed to detect performance regressions.
Dynamically reconfiguring software microbenchmarks: reducing execution time without sacrificing result quality
TLDR
This work proposes the first technique to dynamically stop software microbenchmark executions when their results are sufficiently stable; the technique reduces Java Microbenchmark Harness (JMH) suite execution times by 48.4% to 86.0%.
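One plausible instantiation of such a stopping criterion (a sketch under assumed constants, not the authors' actual algorithm) halts once the coefficient of variation over a sliding window of samples falls below a threshold:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Random;

// Sketch of a dynamic stopping rule: stop measuring once the coefficient
// of variation (CV) of a sliding window of samples drops below a
// threshold. Window size and threshold are invented.
public class DynamicStop {
    public static void main(String[] args) {
        final int window = 30;
        final double cvThreshold = 0.02; // 2% relative variability
        Deque<Double> recent = new ArrayDeque<>();
        Random rnd = new Random(42);

        for (int iter = 0; iter < 10_000; iter++) {
            double sample = 100 + rnd.nextGaussian(); // stand-in for one invocation
            recent.addLast(sample);
            if (recent.size() > window) recent.removeFirst();

            if (recent.size() == window && cv(recent) < cvThreshold) {
                System.out.println("stable after " + (iter + 1) + " invocations");
                return;
            }
        }
        System.out.println("never stabilized");
    }

    static double cv(Deque<Double> xs) {
        double mean = xs.stream().mapToDouble(Double::doubleValue).average().orElse(0);
        double var = xs.stream().mapToDouble(x -> (x - mean) * (x - mean)).sum() / xs.size();
        return Math.sqrt(var) / mean;
    }
}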
Predicting unstable software benchmarks using static source code features
TLDR
This paper presents a machine-learning-based approach to predict a benchmark's stability without having to execute it, showing that although benchmark stability is affected by more than just the source code, machine learning models can effectively predict ahead of execution whether a benchmark will be stable.
Evaluating the impact of falsely detected performance bug-inducing changes in JIT models
TLDR
An empirical study on JIT defect prediction for performance bugs focuses on SZZ's ability to identify the bug-inducing commits of performance bugs in two open-source projects, Cassandra and Hadoop, and finds that manually correcting errors in the training data only slightly improves the models.
Can We Spot Energy Regressions using Developers Tests?
TLDR
This study investigates whether CI can leverage developers' tests to perform a new class of test, energy regression testing, which is similar to performance regression testing but focuses on the energy consumption of the program instead of standard performance indicators such as execution time or memory consumption.
Using application benchmark call graphs to quantify and improve the practical relevance of microbenchmark suites
TLDR
This paper shows how the practical relevance of microbenchmark suites can be improved and verified based on the application flow during an application benchmark run; it proposes an approach to determine the overlap of function calls between the application benchmark and the microbenchmarks, and presents a recommendation algorithm that reveals relevant functions not yet covered by microbenchmarks.
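The overlap the summary alludes to can be pictured as set arithmetic over function names; the sketch below uses invented call sets and is not the paper's implementation:

import java.util.HashSet;
import java.util.Set;

// Sketch: overlap between functions reached during an application
// benchmark and functions covered by the microbenchmark suite, plus
// the uncovered functions a recommender would surface.
public class CallGraphOverlap {
    public static void main(String[] args) {
        Set<String> appCalls = Set.of("parse", "validate", "serialize", "compress");
        Set<String> microCalls = Set.of("parse", "serialize");

        Set<String> covered = new HashSet<>(appCalls);
        covered.retainAll(microCalls); // intersection

        Set<String> uncovered = new HashSet<>(appCalls);
        uncovered.removeAll(microCalls); // difference

        System.out.printf("relevance: %.0f%% of application calls covered%n",
                100.0 * covered.size() / appCalls.size());
        System.out.println("candidates for new microbenchmarks: " + uncovered);
    }
}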

References

Showing 1-10 of 67 references
An Exploratory Study of the State of Practice of Performance Testing in Java-Based Open Source Projects
TLDR
It is argued that future performance testing frameworks should provide better support for low-friction testing, for instance via non-parameterized methods or performance test generation, and focus on tight integration with standard continuous integration tooling.
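For context, this is roughly what a microbenchmark in one such framework (JMH) looks like; the workload and parameter values are invented for illustration:

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

// Illustrative JMH microbenchmark; the workload and @Param values
// are invented.
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@State(Scope.Benchmark)
public class StringBuildBenchmark {

    @Param({"100", "1000"}) // JMH runs the benchmark once per parameter value
    public int size;

    @Benchmark
    public String build() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < size; i++) {
            sb.append(i);
        }
        return sb.toString(); // returning the result prevents dead-code elimination
    }
}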
Continuous validation of performance test workloads
TLDR
An automated approach validates whether a performance test resembles the field workload and, if not, determines how they differ, so that practitioners can update their tests to eliminate such differences and create more realistic tests.
How is Performance Addressed in DevOps?
TLDR
The survey responses indicate that the complexity of performance engineering approaches and tools is a barrier to widespread adoption of performance analysis in DevOps, and that performance analysis tools need a short learning curve and easy integration into the DevOps pipeline in order to be adopted by practitioners.
An Exploratory Study of Performance Regression Introducing Code Changes
Jinfu Chen, Weiyi Shang. 2017 IEEE International Conference on Software Maintenance and Evolution (ICSME), 2017.
TLDR
It is found that performance regressions widely exist during the development of both Hadoop and RxJava, and that the majority of performance regressions are introduced while fixing other bugs.
An Automated Approach for Recommending When to Stop Performance Tests
TLDR
An automated approach is proposed that measures how much of the data generated during a performance test is repetitive, and recommends stopping the test when the data becomes highly repetitive and the repetitiveness has stabilized.
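A toy version of such a repetitiveness check (purely illustrative, with invented constants, not the proposed approach) could bucket incoming measurements and recommend stopping once several consecutive windows add almost no new buckets:

import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Toy repetitiveness check: bucket metric values and recommend stopping
// once several consecutive windows contribute almost no previously
// unseen buckets. All constants are invented.
public class RepetitivenessStop {
    public static void main(String[] args) {
        final int windowSize = 100;
        final int stableWindowsNeeded = 3;
        Set<Long> seenBuckets = new HashSet<>();
        Random rnd = new Random(7);
        int stableWindows = 0;

        for (int w = 1; w <= 100; w++) {
            int novel = 0;
            for (int i = 0; i < windowSize; i++) {
                long bucket = Math.round(50 + 5 * rnd.nextGaussian()); // stand-in metric
                if (seenBuckets.add(bucket)) novel++;
            }
            double novelty = (double) novel / windowSize;
            stableWindows = (novelty < 0.01) ? stableWindows + 1 : 0;
            if (stableWindows >= stableWindowsNeeded) {
                System.out.println("recommend stopping after window " + w);
                return;
            }
        }
        System.out.println("no stop recommendation");
    }
}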
Utilizing Performance Unit Tests To Increase Performance Awareness
TLDR
The goal is to provide developers with information that helps them form informed opinions, preventing performance loss due to the accumulated effect of many poor decisions; the approach turns performance unit tests into recipes for generating performance documentation.
Software microbenchmarking in the cloud. How bad is it really?
TLDR
This paper explores the effects of cloud environments on the variability of performance test results and the extent to which slowdowns can still be reliably detected even in a public cloud, finding that the Wilcoxon rank-sum test manages to detect smaller slowdowns in cloud environments.
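The test named in the summary can be applied to two samples of execution times as follows; this sketch uses Apache Commons Math (commons-math3), and the sample data and significance threshold are invented:

import org.apache.commons.math3.stat.inference.MannWhitneyUTest;

// Sketch: flag a potential slowdown by comparing execution-time samples
// from an old and a new version with the Wilcoxon rank-sum
// (Mann-Whitney U) test. Sample values are invented.
public class SlowdownCheck {
    public static void main(String[] args) {
        double[] oldVersionMs = {10.1, 10.3, 9.9, 10.2, 10.0, 10.4, 10.1, 10.2};
        double[] newVersionMs = {10.9, 11.1, 10.8, 11.0, 11.2, 10.9, 11.1, 11.0};

        double p = new MannWhitneyUTest().mannWhitneyUTest(oldVersionMs, newVersionMs);
        if (p < 0.05) {
            System.out.printf("significant difference (p = %.4f); inspect for slowdown%n", p);
        } else {
            System.out.printf("no significant difference (p = %.4f)%n", p);
        }
    }
}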
Performance regression testing target prioritization via performance risk analysis
TLDR
A new lightweight and white-box approach, performance risk analysis (PRA), is proposed to improve performance regression testing efficiency via testing target prioritization; developers can leverage the analysis result to test high-risk commits first while delaying or skipping testing of low-risk commits.
Performance Regression Unit Testing: A Case Study
TLDR
The Stochastic Performance Logic is evaluated in the context of performance unit testing of JDOM, an open-source project for working with XML data, focusing on the ability to capture and test developer assumptions and on the practical behavior of the built-in hypothesis testing when the formal assumptions of the tests are not met.
Unit Testing Performance in Java Projects: Are We There Yet?
TLDR
A study of GitHub projects written in Java looks for occurrences of performance evaluation code in common performance testing frameworks; it quantifies the use of such frameworks, identifies the most relevant performance testing approaches, and describes how the design of the SPL performance testing framework was adjusted in response.