An approach and benchmark to detect behavioral changes of commits in continuous integration

@article{Danglot2020AnAA,
  title={An approach and benchmark to detect behavioral changes of commits in continuous integration},
  author={Benjamin Danglot and Martin Monperrus and Walter Rudametkin and Beno{\^i}t Baudry},
  journal={Empirical Software Engineering},
  year={2020},
  volume={25},
  pages={2379--2415}
}
When a developer pushes a change to an application’s codebase, a good practice is to have a test case specifying this behavioral change. Thanks to continuous integration (CI), the test is run on subsequent commits to check that they do not introduce a regression for that behavior. In this paper, we propose an approach that detects behavioral changes in commits. As input, it takes a program, its test suite, and a commit. Its output is a set of test methods that capture the behavioral difference…
Analysis of the behavioral impact of code modifications
Code reviewing is the process to review code modifications by a peer in order to reduce the risk of regression (also known as a "pull request" in GitHub). Historically, the review process is based on…
Can We Trust Tests To Automate Dependency Updates? A Case Study of Java Projects
TLDR
The combination of static and dynamic analysis should be a requirement for future dependency updating systems because of the prevalence of tests exercising dependencies and the effectiveness of test suites in detecting semantic faults in dependencies.
Developer-Centric Test Amplification: The Interplay Between Automatic Generation and Human Exploration
TLDR
This paper conducts 16 semi-structured interviews with software developers supported by the prototypical designs of a developer-centric test amplification approach and a corresponding test exploration tool, and extends the test amplification tool DSpot, generating test cases that are easier to understand.
Can We Spot Energy Regressions using Developers Tests?
TLDR
This study investigates whether CI can leverage developers’ tests to perform a new class of testing: energy regression testing, which is similar to performance regression testing but focuses on the energy consumption of the program instead of standard performance indicators such as execution time or memory consumption.
Automatic Unit Test Amplification For DevOps
TLDR
This thesis aims at addressing the lack of a tool that assists developers in regression testing by using test suite amplification, and proposes a new approach based on both test inputs transformation and assertions generation to amplify the test suite.
Amplification Automatique de Tests Unitaires pour DevOps
In recent years, unit tests have become an essential part of any serious software project, used to verify that it works correctly. However, tests are tedious and…
A Differential Testing Approach for Evaluating Abstract Syntax Tree Mapping Algorithms
TLDR
A hierarchical approach is proposed to automatically compare the similarity of mapped statements and tokens by different algorithms to determine if each of the compared algorithms generates inaccurate mappings for a statement or its tokens.
Production Monitoring to Improve Test Suites
TLDR
An approach called PANKTI is devised which monitors applications as they execute in production, and then automatically generates unit tests from the collected production data, and shows that the generated tests indeed improve the quality of the test suite of the application under consideration.

References

Automated Behavioral Regression Testing
  • Wei Jin, A. Orso, Tao Xie
  • Computer Science
  • 2010 Third International Conference on Software Testing, Verification and Validation
  • 2010
TLDR
Behavioral Regression Testing, a novel approach to regression testing that focuses on a subset of the code and leverages differential behavior, can provide developers with more (and more detailed) information than traditional regression testing techniques.
Shadow of a Doubt: Testing for Divergences between Software Versions
TLDR
A symbolic execution-based technique that is designed to generate test inputs that cover the new program behaviours introduced by a patch and evaluated on the Coreutils patches from the CoREBench suite of regression bugs shows that it is able to generate test inputs that exercise newly added behaviours and expose some of the regression bugs.
Shadow Symbolic Execution for Testing Software Patches
TLDR
A symbolic execution-based technique that is designed to generate test inputs that cover the new program behaviours introduced by a patch and evaluated on the Coreutils patches from the CoREBench suite of regression bugs shows that it is able to generate test inputs that exercise newly added behaviours and expose some of the regression bugs.
BEARS: An Extensible Java Bug Benchmark for Automatic Program Repair Studies
TLDR
BEARS, a project for collecting and storing bugs into an extensible bug benchmark for automatic repair studies in Java, is presented, and version 1.0 of BEARS is delivered, which contains 251 reproducible bugs collected from 72 projects that use the Travis CI and Maven build environment.
KATCH: high-coverage testing of software patches
TLDR
The results show that KATCH can automatically synthesise inputs that significantly increase the patch coverage achieved by the existing manual test suites, and find bugs at the moment they are introduced.
Semantics-assisted code review: An efficient tool chain and a user study
TLDR
An invariant-mining tool chain, Getty, is created and it is demonstrated that semantically-assisted code review is feasible, effective, and that real programmers can leverage it to improve the quality of their reviews.
DiffGen: Automated Regression Unit-Test Generation
  • Kunal Taneja, Tao Xie
  • Computer Science
  • 2008 23rd IEEE/ACM International Conference on Automated Software Engineering
  • 2008
TLDR
Experimental results show that the approach can effectively expose many behavioral differences that cannot be exposed by state-of-the-art techniques.
Automatic test improvement with DSpot: a study with ten mature open-source projects
TLDR
This paper presents the concept, design and implementation of a system that takes developer-written test cases as input (JUnit tests in Java) and synthesizes improved versions of them as output, and shows that DSpot is capable of automatically improving unit tests in real-world, large-scale Java software.
How Open Source Projects Use Static Code Analysis Tools in Continuous Integration Pipelines
TLDR
A study of the usage of static analysis tools in 20 Java open source projects hosted on GitHub and using Travis CI as continuous integration infrastructure reveals that build breakages are quickly fixed by actually solving the problem, rather than by disabling the warning, and are often properly documented.
Differential symbolic execution
TLDR
A novel extension and application of symbolic execution techniques is introduced that computes a precise behavioral characterization of a program change and exploits the fact that program versions are largely similar to reduce cost and improve the quality of analysis results.