NEUZZ: Efficient Fuzzing with Neural Program Smoothing

@article{She2019NEUZZEF,
  title={NEUZZ: Efficient Fuzzing with Neural Program Smoothing},
  author={Dongdong She and Kexin Pei and Dave Epstein and Junfeng Yang and Baishakhi Ray and Suman Sekhar Jana},
  journal={2019 IEEE Symposium on Security and Privacy (SP)},
  year={2019},
  pages={803--817}
}
Fuzzing has become the de facto standard technique for finding software vulnerabilities. [...] We further demonstrate that such neural network models can be used together with gradient-guided input generation schemes to significantly increase the efficiency of the fuzzing process. Our extensive evaluations demonstrate that NEUZZ significantly outperforms 10 state-of-the-art graybox fuzzers on 10 popular real-world programs both at finding new bugs and achieving higher edge coverage. NEUZZ found 31…
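The gradient-guided input generation mentioned in the abstract can be illustrated with a minimal sketch. This is not the NEUZZ implementation: it assumes a hypothetical one-layer sigmoid surrogate with hand-picked weights that maps normalized input bytes to predicted edge probabilities, and the names `edge_gradient` and `gradient_guided_mutants`, the `top_k` and `step` parameters, and the scaling schedule are illustrative assumptions. The sketch only shows the general idea of using a model's gradient to decide which bytes to mutate and in which direction.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def edge_gradient(weights, bias, x, edge):
    """Gradient of one predicted edge probability w.r.t. the normalized input bytes.

    Assumed surrogate: p = sigmoid(weights @ x + bias), one output per edge.
    """
    p = sigmoid(weights @ x + bias)[edge]
    return p * (1.0 - p) * weights[edge]


def gradient_guided_mutants(seed, weights, bias, edge, top_k=2, step=32):
    """Mutate the top_k bytes with the largest gradient magnitude for `edge`."""
    x = np.frombuffer(seed, dtype=np.uint8).astype(np.float64) / 255.0
    grad = edge_gradient(weights, bias, x, edge)
    # Byte offsets the surrogate considers most influential for this edge.
    hot = np.argsort(-np.abs(grad))[:top_k]
    mutants = []
    for scale in (1, 2, 4):  # illustrative schedule of growing step sizes
        data = np.frombuffer(seed, dtype=np.uint8).astype(np.int64)  # astype copies
        # Move each hot byte in the direction that raises the predicted probability.
        data[hot] += (np.sign(grad[hot]) * step * scale).astype(np.int64)
        mutants.append(np.clip(data, 0, 255).astype(np.uint8).tobytes())
    return hot, mutants


# Toy usage: a 4-byte seed and a single-edge surrogate with made-up weights.
seed = bytes([10, 200, 30, 40])
W = np.array([[0.0, 5.0, -3.0, 0.1]])
b = np.zeros(1)
hot, mutants = gradient_guided_mutants(seed, W, b, edge=0)
```

In a real fuzzer the surrogate would be trained on observed (input, coverage) pairs and the mutants would be executed against the target program; both steps are omitted here.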
Evaluating and Improving Neural Program-Smoothing-based Fuzzing
TLDR
A simplistic technique, PreFuzz, is proposed, which improves neural program-smoothing-based fuzzers with a resource-efficient edge selection mechanism to enhance their gradient guidance and a probabilistic byte selection mechanism to further boost mutation effectiveness, and can significantly increase the edge coverage of Neuzz/MTFuzz.
LAFuzz: Neural Network for Efficient Fuzzing
TLDR
LAFuzz is presented, an automated fuzzer that generates high-quality seed inputs and utilizes a variety of deep neural network models with different setups to efficiently fuzz programs that expect structured or unstructured inputs.
MTFuzz: fuzzing with a multi-task neural network
TLDR
A Multi-Task Neural Network is used that can learn a compact embedding of the input space based on diverse training samples for multiple related tasks that can guide the mutation process by focusing most of the mutations on the parts of the embedding where the gradient is high.
MaxAFL: Maximizing Code Coverage with a Gradient-Based Optimization Technique
TLDR
This paper proposes a new type of gradient-based fuzzer, MaxAFL, to overcome the limitations of existing gradient-based fuzzers, and introduces an adaptive objective function which aims to explore various paths in the program.
A First Look at the Effect of Deep Learning in Coverage-guided Fuzzing
  • Siqi Li, Yun Lin, J. Dong
  • Computer Science
    2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE)
  • 2021
TLDR
The empirical results reveal that the deep learning models can only be effective in very limited scenarios, which is largely restrained by training data imbalance, dependent labels, model over-generalization, and the insufficient expressiveness of the state-of-the-art models.
EMS: History-Driven Mutation for Coverage-based Fuzzing
  • Chenyang Lyu, S. Ji, R. Beyah
  • Computer Science
    Proceedings 2022 Network and Distributed System Security Symposium
  • 2022
TLDR
A novel history-driven mutation framework named EMS is presented that employs PBOM as one of the mutation operators to probabilistically provide desired mutation byte values according to the input ones and discovers up to 4.91× more unique vulnerabilities than the baseline, and more line coverage than other fuzzers on most programs.
ParmeSan: Sanitizer-guided Greybox Fuzzing
TLDR
This paper presents sanitizer-guided fuzzing, a new design point in this space that specifically optimizes for bug coverage, and shows that ParmeSan greatly reduces the TTE of real-world bugs, and finds bugs 37% faster than existing state-of-the-art coverage-based fuzzers (Angora) and 288% faster than directed fuzzing (AFLGo), while still covering the same set of bugs.
Improving the Effectiveness of Grey-box Fuzzing By Extracting Program Information
TLDR
A greybox fuzzer called MuFuzzer based on AFL is proposed, which incorporates two heuristics that optimize seed selection and automatically extract input formatting information from the PUT to increase the chance of generating valid test inputs, respectively.
CMFuzz: context-aware adaptive mutation for fuzzers
TLDR
CMFuzz, a novel context-aware adaptive mutation scheme, is proposed, which utilizes the contextual bandit algorithm LinUCB to effectively choose optimal mutation operators for various seed files and achieves higher code coverage and finds more crashes at a faster rate than its counterparts in most cases.
EPfuzzer: Improving Hybrid Fuzzing with Hardest-to-reach Branch Prioritization
TLDR
EPfuzzer implements two key ideas: only the hardest-to-reach branch will be prioritized for concolic execution to avoid generating uninteresting inputs and only input bytes relevant to the target branch to be flipped will be symbolized to reduce the overhead of the symbolic emulation.
...

References

SHOWING 1-10 OF 91 REFERENCES
Angora: Efficient Fuzzing by Principled Search
  • Peng Chen, Hao Chen
  • Computer Science
    2018 IEEE Symposium on Security and Privacy (SP)
  • 2018
TLDR
This work proposes Angora, a new mutation-based fuzzer that outperforms the state-of-the-art fuzzers by a wide margin, and introduces several key techniques: scalable byte-level taint tracking, context-sensitive branch count, search based on gradient descent, and input length exploration.
Not all bytes are equal: Neural byte sieve for fuzzing
TLDR
This work implements several neural models including LSTMs and sequence-to-sequence models that can encode variable length input files and incorporates them in the state-of-the-art AFL (American Fuzzy Lop) fuzzer and shows significant improvements in terms of code coverage, unique code paths, and crashes.
Steelix: program-state based binary fuzzing
TLDR
Steelix, a program-state based binary fuzzing approach, is presented, which improves the penetration power of a fuzzer at the cost of an acceptable slowdown of the execution speed and has better code coverage and bug detection capability than the state-of-the-art fuzzers.
Automated Whitebox Fuzz Testing
TLDR
This work presents an alternative whitebox fuzz testing approach inspired by recent advances in symbolic execution and dynamic test generation, and implemented this algorithm in SAGE (Scalable, Automated, Guided Execution), a new tool employing x86 instruction-level tracing and emulation for whitebox fuzzing of arbitrary file-reading Windows applications.
T-Fuzz: Fuzzing by Program Transformation
TLDR
T-Fuzz leverages a symbolic execution-based approach to filter out false positives and reproduce true bugs in the original program by transforming the program as well as mutating the input, and covers more code and finds more true bugs than any existing technique.
Learn&Fuzz: Machine learning for input fuzzing
TLDR
This paper shows how to automate the generation of an input grammar suitable for input fuzzing using sample inputs and neural-network-based statistical machine-learning techniques and presents a new algorithm for this learn&fuzz challenge which uses a learnt input probability distribution to intelligently guide where to fuzz inputs.
Driller: Augmenting Fuzzing Through Selective Symbolic Execution
TLDR
Driller is presented, a hybrid vulnerability excavation tool which leverages fuzzing and selective concolic execution in a complementary manner, to find deeper bugs and mitigate their weaknesses, avoiding the path explosion inherent in concolic analysis and the incompleteness of fuzzing.
KameleonFuzz: evolutionary fuzzing for black-box XSS detection
TLDR
KameleonFuzz is proposed, a black-box Cross Site Scripting (XSS) fuzzer for web applications that can not only generate malicious inputs to exploit XSS, but also detect how close it is to revealing a vulnerability.
Deep Reinforcement Fuzzing
TLDR
This paper formalizes fuzzing as a reinforcement learning problem using the concept of Markov decision processes, which allows for state-of-the-art deep Q-learning algorithms that optimize rewards, which are defined from runtime properties of the program under test.
Program-Adaptive Mutational Fuzzing
TLDR
The design of an algorithm to maximize the number of bugs found for black-box mutational fuzzing given a program and a seed input is presented, and the result is promising: it finds an average of 38.6% more bugs than three previous fuzzers over 8 applications using the same amount of fuzzing time.
...