DeepCVA: Automated Commit-level Vulnerability Assessment with Deep Multi-task Learning

@inproceedings{Le2021DeepCVAAC,
  title={DeepCVA: Automated Commit-level Vulnerability Assessment with Deep Multi-task Learning},
  author={Triet Huynh Minh Le and David Hin and Roland Croft and Muhammad Ali Babar},
  booktitle={2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE)},
  year={2021},
  pages={717--729}
}
  • T. H. Le, David Hin, M. Babar
  • Published 18 August 2021
  • Computer Science
  • 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE)
It is increasingly suggested to identify Software Vulnerabilities (SVs) in code commits to give early warnings about potential security risks. However, there is a lack of effort to assess vulnerability-contributing commits right after they are detected to provide timely information about the exploitability, impact and severity of SVs. Such information is important to plan and prioritize the mitigation for the identified SVs. We propose a novel Deep multi-task learning model, DeepCVA, to… 

On the Use of Fine-grained Vulnerable Code Statements for Software Vulnerability Assessment Models

  • T. H. Le, M. Babar
  • Computer Science
    2022 IEEE/ACM 19th International Conference on Mining Software Repositories (MSR)
  • 2022
This work investigates ML models for automating function-level SV assessment tasks, i.e., predicting seven Common Vulnerability Scoring System (CVSS) metrics. It also studies the value and use of vulnerable statements as inputs for developing the assessment models, because SVs in functions originate in these statements.

A Survey on Data-driven Software Vulnerability Assessment and Prioritization

This survey provides a taxonomy of past research efforts, highlights best practices for data-driven SV assessment and prioritization, discusses current limitations, and proposes potential solutions to address them.

VulCurator: a vulnerability-fixing commit detector

VulCurator is a tool that leverages deep learning on richer sources of information, including commit messages, code changes and issue reports, for vulnerability-fixing commit classification. Experimental results show that VulCurator outperforms the state-of-the-art baselines by up to 16.1% in terms of F1-score.

Noisy Label Learning for Security Defects

A two-stage learning method based on noise cleaning is used to identify and remediate noisy samples. It improves the AUC and recall of baselines by up to 8.9% and 23.4%, respectively, and shows that learning from noisy labels can be effective for data-driven software and security analytics.

HERMES: Using Commit-Issue Linking to Detect Vulnerability-Fixing Commits

This work proposes a machine learning approach to automatically identify vulnerability-fixing commits, which incorporates the rich source of information from issue trackers and uses a commit-issue link recovery technique to infer the potential missing link.

An Investigation into Inconsistency of Software Vulnerability Severity across Data Sources

  • Roland Croft, M. Babar, Li Li
  • Computer Science, Psychology
    2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)
  • 2022
This study investigates severity ranking inconsistencies over the SV reporting lifecycle and identifies six potential attributes that are correlated with this misjudgment. It shows that inconsistency in severity reporting schemes can severely degrade the performance of downstream severity prediction, by up to 77%.

Nimbus: Toward Speed Up Function Signature Recovery via Input Resizing and Multi-Task Learning

This paper proposes Nimbus, a method for function signature recovery that reduces whole-process resource consumption as far as possible without performance loss. It uses selective inputs and a multi-task learning (MTL) structure to reduce computational resource consumption and fully leverage mutual information.

Automated Security Assessment for the Internet of Things

This work proposes an automated security assessment framework for IoT networks. The framework leverages machine learning and natural language processing to analyze vulnerability descriptions, predict vulnerability metrics, and identify the most vulnerable attack paths within an IoT network.

References

Joint Prediction of Multiple Vulnerability Characteristics Through Multi-Task Learning

A multi-task machine learning approach for the joint prediction of multiple vulnerability characteristics based on vulnerability descriptions. The approach removes the requirement of balanced data and relies on neural networks that learn to extract features from training data.

Learning to Predict Severity of Software Vulnerability Using Only Vulnerability Description

This paper proposes a deep learning approach to predict multi-class severity level of software vulnerability using only vulnerability description, and uses word embeddings and a one-layer shallow Convolutional Neural Network to automatically capture discriminative word and sentence features of vulnerability descriptions for predicting vulnerability severity.

A Comparative Study of Deep Learning-Based Vulnerability Detection System

This paper collects two datasets from the programs involving 126 types of vulnerabilities and conducts the first comparative study to quantitatively evaluate the impact of different factors on the effectiveness of vulnerability detection.

Cross-Project Transfer Representation Learning for Vulnerable Function Discovery

Compared with the traditional code metrics, the transfer-learned representations are more effective for predicting vulnerable functions, both within a project and across multiple projects.

Large-Scale Empirical Studies on Effort-Aware Security Vulnerability Prediction Methods

Empirical results show that two unsupervised methods [i.e., lines of code (LOC) and Halstead's volume (HV)] and four recently proposed state-of-the-art supervised methods can achieve better performance than the other methods in terms of effort-aware performance measures.

A Survey on Data-driven Software Vulnerability Assessment and Prioritization

This survey provides a taxonomy of past research efforts, highlights best practices for data-driven SV assessment and prioritization, discusses current limitations, and proposes potential solutions to address them.

Automated Software Vulnerability Assessment with Concept Drift

The proposed systematic approach can effectively tackle the concept drift issue of the SVs' descriptions reported from 2000 to 2018 in NVD even without retraining the model and performs competitively compared to the existing word-only method.

VulDigger: A Just-in-Time and Cost-Aware Tool for Digging Vulnerability-Contributing Changes

This work focuses on a cost-aware vulnerability prediction model and presents VulDigger, a just-in-time change-level code review tool that digs suspicious changes out of a sea of code changes, serving as an early step of continuous security inspections.

DeepJIT: An End-to-End Deep Learning Framework for Just-in-Time Defect Prediction

This paper proposes an end-to-end deep learning framework, named DeepJIT, that automatically extracts features from commit messages and code changes and uses them to identify defects.

Predicting Exploitation of Disclosed Software Vulnerabilities Using Open-source Data

This study replicates key portions of the prior work, compares their approaches, and shows how the selection of training and test data critically affects the estimated performance of predictive models.