VulDeeLocator: A Deep Learning-Based Fine-Grained Vulnerability Detector

@article{Li2022VulDeeLocatorAD,
  title={VulDeeLocator: A Deep Learning-Based Fine-Grained Vulnerability Detector},
  author={Zhuguo Li and Deqing Zou and Shouhuai Xu and Zhaoxuan Chen and Yawei Zhu and Hai Jin},
  journal={IEEE Transactions on Dependable and Secure Computing},
  year={2022},
  volume={19},
  pages={2821--2837}
}
  • Zhuguo Li, Deqing Zou, Hai Jin
  • Published 8 January 2020
  • Computer Science
  • IEEE Transactions on Dependable and Secure Computing
Automatically detecting software vulnerabilities is an important problem that has attracted much attention from the academic research community. However, existing vulnerability detectors still cannot achieve the vulnerability detection capability and the locating precision that would warrant their adoption for real-world use. In this article, we present a vulnerability detector that can simultaneously achieve a high detection capability and a high locating precision, dubbed Vulnerability Deep… 

An Information-Theoretic and Contrastive Learning-based Approach for Identifying Code Statements Causing Software Vulnerability

TLDR
A novel end-to-end deep learning-based approach to identify the vulnerability-relevant code statements of a specific function that obtains a higher performance in VCP, VCA, and Top-10 ACC measures when running on real-world datasets in an unsupervised setting.

Investigating the impact of vulnerability datasets on deep learning-based vulnerability detectors

TLDR
This work uses sample granularity, sample similarity, and code features to characterize vulnerability datasets, and analyzes the correlation between the characteristics of vulnerability datasets and the results of DL-based vulnerability detectors.

Path-sensitive code embedding via contrastive learning for software vulnerability detection

TLDR
The novelty of ContraFlow lies in selecting and preserving feasible value-flow paths through a pretrained path embedding model using self-supervised contrastive learning, thus significantly reducing the amount of labeled data required for training expensive downstream models for path-based vulnerability detection.

A Vulnerability Detection System Based on Fusion of Assembly Code and Source Code

TLDR
This paper implements a vulnerability detection system by combining source code and assembly code models, and shows that the system presents a stable and convergent detection performance.

Software Vulnerability Analysis and Discovery Using Deep Learning Techniques: A Survey

TLDR
This paper identifies four game changers that significantly impact the domain of deep learning-based vulnerability detection and provides detailed reviews of the insights, ideas, and concepts that these game changers have brought to the field.

Open Science in Software Engineering: A Study on Deep Learning-Based Vulnerability Detection

TLDR
This study exhaustively searched the literature in this area and identified 55 relevant works that propose a DL-based vulnerability detection approach, then comprehensively investigated four integral aspects of open science: availability, executability, reproducibility, and replicability.

MANDO: Multi-Level Heterogeneous Graph Embeddings for Fine-Grained Detection of Smart Contract Vulnerabilities

TLDR
MANDO is the first learning-based approach capable of identifying vulnerabilities at the fine-grained line level, and it improves on traditional code analysis-based vulnerability detection approaches by 11.35% to 70.81% in terms of F1-score.

BlockScope: Detecting and Investigating Propagated Vulnerabilities in Forked Blockchain Projects

TLDR
This paper proposes BlockScope, a novel tool that can effectively and efficiently detect multiple types of cloned vulnerabilities given an input of existing Bitcoin/Ethereum security patches, and reveals three types of vulnerability propagation from source to forked projects.

No More Fine-Tuning? An Experimental Evaluation of Prompt Tuning in Code Intelligence

Pre-trained models have been shown effective in many code intelligence tasks. These models are pre-trained on large-scale unlabeled corpus and then fine-tuned in downstream tasks. However, as the

References

Saluki: Finding Taint-style Vulnerabilities with Static Property Checking

TLDR
Saluki provides a domain-specific language for expressing taint-based policies, which capture security properties as formal specifications, and includes a decidable solver procedure to prove whether a set of data-dependence facts satisfies a security property.

Modeling and Discovering Vulnerabilities with Code Property Graphs

TLDR
This paper introduces a novel representation of source code called a code property graph, which merges concepts of classic program analysis, namely abstract syntax trees, control flow graphs, and program dependence graphs, into a joint data structure. This structure makes it possible to elegantly model templates for common vulnerabilities as graph traversals that can identify buffer overflows, integer overflows, format string vulnerabilities, or memory disclosures.

$\mu$VulDeePecker: A Deep Learning-Based System for Multiclass Vulnerability Detection

TLDR
Experimental results show that $\mu$VulDeePecker is effective for multiclass vulnerability detection and that accommodating control-dependence (in addition to data-dependence) can lead to higher detection capabilities.

SySeVR: A Framework for Using Deep Learning to Detect Software Vulnerabilities

TLDR
This work proposes the first systematic framework for using deep learning to detect vulnerabilities in C/C++ programs with source code, and focuses on obtaining program representations that can accommodate syntax and semantic information pertinent to vulnerabilities.

VulDeePecker: A Deep Learning-Based System for Vulnerability Detection

TLDR
This paper studies the use of deep learning-based vulnerability detection to relieve human experts of the tedious and subjective task of manually defining features. Experimental results show that VulDeePecker achieves far fewer false negatives, with reasonable false positives, than other approaches.

On the Properties of Neural Machine Translation: Encoder–Decoder Approaches

TLDR
It is shown that neural machine translation performs relatively well on short sentences without unknown words, but its performance degrades rapidly as the length of the sentence and the number of unknown words increase.

Long Short-Term Memory

TLDR
A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

Instruction2vec: Efficient Preprocessor of Assembly Code to Detect Software Weakness with CNN

TLDR
This paper proposes Instruction2vec, an improved static binary analysis technique using machine learning. Experimental results show that the proposed scheme can detect software weaknesses in assembly code with an accuracy of 91%.

Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks

TLDR
Inspired by work on manually defined vulnerability patterns over various code representation graphs and by recent advances in graph neural networks, this paper proposes Devign, a general graph neural network-based model for graph-level classification that learns from a rich set of code semantic representations.

VulSniper: Focus Your Attention to Shoot Fine-Grained Vulnerabilities

TLDR
This paper proposes VulSniper, which is designed to detect fine-grained vulnerabilities more effectively and achieves F1-scores of 80.6% and 73.3% on the two benchmark datasets respectively, which are significantly higher than those of the state-of-the-art methods.