Toward understanding compiler bugs in GCC and LLVM

@inproceedings{Sun2016TowardUC,
  title={Toward understanding compiler bugs in GCC and LLVM},
  author={Chengnian Sun and Vu Le and Qirun Zhang and Zhendong Su},
  booktitle={Proceedings of the 25th International Symposium on Software Testing and Analysis},
  year={2016}
}
  • Chengnian Sun, Vu Le, Qirun Zhang, Zhendong Su
  • Published 18 July 2016
  • Computer Science
  • Proceedings of the 25th International Symposium on Software Testing and Analysis
Compilers are critical, widely used complex software. Bugs in them have significant impact, and can cause serious damage when they silently miscompile a safety-critical application. An in-depth understanding of compiler bugs can help detect and fix them. To this end, we conduct the first empirical study on the characteristics of the bugs in two mainstream compilers, GCC and LLVM. Our study is significant in scale: it exhaustively examines about 50K bugs and 30K bug fix revisions over more…
Towards Understanding Tool-chain Bugs in the LLVM Compiler Infrastructure
This paper conducts an empirical study of the LLVM tool-chain bugs, aiming to provide the first comprehensive understanding of these bugs.
Compiler fuzzing: how much does it matter?
The first quantitative and qualitative study of the tangible impact of miscompilation bugs in a mature compiler is presented, and a selection of the syntactic changes caused by some of the bugs (fuzzer-found and non-fuzzer-found) in package assembly code shows that either these changes have no semantic impact or that they would require very specific runtime circumstances to trigger execution divergence.
An empirical study of optimization bugs in GCC and LLVM
An empirical study investigates the characteristics of optimization bugs in two mainstream compilers, GCC and LLVM, and reveals that optimizations are the buggiest component in both compilers except for the C++ component.
How a simple bug in ML compiler could be exploited for backdoors?
This study aims to show how a compiler bug can be audited and possibly corrected, and shows that even old and mature compilers can contain bugs.
Well-typed programs can go wrong: a study of typing-related bugs in JVM compilers
This paper conducts the first empirical study to understand and characterize typing-related compiler bugs; the authors believe it opens up a new research direction by driving future researchers to build appropriate methods and techniques for more holistic testing of compilers.
DeepFuzz: Automatic Generation of Syntax Valid C Programs for Fuzz Testing
This paper proposes a grammar-based fuzzing tool called DEEPFUZZ, based on a generative Sequence-to-Sequence model, which automatically and continuously generates well-formed C programs and improves testing efficacy with regard to line, function, and branch coverage.
Coverage Prediction for Accelerating Compiler Testing
The novel approach to accelerating compiler testing through coverage prediction is called COP (short for COverage Prediction), and it is demonstrated that COP significantly accelerates compiler testing, achieving an average of 51.01 percent speedup in test execution time on an existing dataset including three old release versions of the compilers and a new dataset including 12 latest release versions.
Finding Missed Compiler Optimizations by Differential Testing
  • Gergö Barany
Randomized differential testing of compilers has had great success in finding compiler crashes and silent miscompilations. In this paper we investigate whether we can use similar techniques to…
Finding missed compiler optimizations by differential testing
This paper investigates whether the quality of generated code can be improved by comparing the code generated by different compilers to find optimizations performed by one but missed by another, and develops a set of tools for running tests.
Finding compiler bugs via live code mutation
A novel EMI technique that allows mutation in the entire program (i.e., both live and dead regions) and significantly increases the EMI variant space by removing the restriction of mutating only the dead regions.

References

SHOWING 1-10 OF 43 REFERENCES
Finding and understanding bugs in C compilers
Csmith, a randomized test-case generation tool, was created and used for three years to find compiler bugs; a collection of qualitative and quantitative results about the bugs it found is presented.
How do fixes become bugs?
A comprehensive characteristic study on incorrect bug-fixes from large operating system code bases including Linux, OpenSolaris, FreeBSD and also a mature commercial OS developed and evolved over the last 12 years, investigating not only the mistake patterns during bug-fixing but also the possible human reasons in the development process when these incorrect bugs were introduced.
Finding and Analyzing Compiler Warning Defects
  • Chengnian Sun, Vu Le, Z. Su
  • Computer Science
  • 2016 IEEE/ACM 38th International Conference on Software Engineering (ICSE)
  • 2016
This paper proposes the first randomized differential testing technique to detect compiler warning defects and describes its extensive evaluation in finding warning defects in widely used C compilers.
Test-case reduction for C compiler bugs
It is concluded that effective program reduction requires more than straightforward delta debugging, so three new, domain-specific test-case reducers are designed and implemented based on a novel framework in which a generic fixpoint computation invokes modular transformations that perform reduction operations.
Characterizing and predicting which bugs get reopened
This paper characterizes when bug reports are reopened by using the Microsoft Windows operating system project as an empirical case study, and builds statistical models to describe the impact of various metrics on reopening bugs ranging from the reputation of the opener to how the bug was found.
Compiler validation via equivalence modulo inputs
This work introduces equivalence modulo inputs (EMI), a simple, widely applicable methodology for validating optimizing compilers; its practical implementation profiles a program's test executions and stochastically prunes its unexecuted code.
Randomized stress-testing of link-time optimizers
This work presents the first extensive effort to stress-test the LTO components of GCC and LLVM, the two most widely used production C compilers, and develops a practical mechanism to reduce LTO bugs involving multiple files.
Have things changed now?: an empirical study of bug characteristics in modern open source software
This study analyzes bug characteristics by first sampling hundreds of real-world bugs in two large, representative open-source projects and finds several new interesting characteristics, including that memory-related bugs have decreased and security bugs are increasing.
An empirical study of operating systems errors
A study of operating system errors found by automatic, static compiler analysis applied to the Linux and OpenBSD kernels found that device drivers have error rates up to three to seven times higher than the rest of the kernel.
Learning from mistakes: a comprehensive study on real world concurrency bug characteristics
This study carefully examined concurrency bug patterns, manifestation, and fix strategies of 105 randomly selected real-world concurrency bugs from 4 representative server and client open-source applications and reveals several interesting findings that provide useful guidance for concurrency bug detection, testing, and concurrent programming language design.