Static Malware Detection & Subterfuge: Quantifying the Robustness of Machine Learning and Current Anti-Virus

@article{Fleshman2018StaticMD,
  title={Static Malware Detection \& Subterfuge: Quantifying the Robustness of Machine Learning and Current Anti-Virus},
  author={William Fleshman and Edward Raff and Richard Zak and Mark McLean and Charles K. Nicholas},
  journal={2018 13th International Conference on Malicious and Unwanted Software (MALWARE)},
  year={2018},
  pages={1--10}
}
As machine-learning (ML) based systems for malware detection become more prevalent, it becomes necessary to quantify their benefits compared to the more traditional anti-virus (AV) systems widely used today. It is not practical to build an agreed-upon test set to benchmark malware detection systems on pure classification performance. Instead, we tackle the problem by creating a new testing methodology, where we evaluate the change in performance on a set of known benign and malicious files as…

Adversarial Attacks against Windows PE Malware Detection: A Survey of the State-of-the-Art
TLDR
A comprehensive and systematic review is conducted to categorize the state-of-the-art adversarial attacks against PE malware detection, as well as corresponding defenses to increase the robustness of Windows PE malware detection.
Automatic Generation of Adversarial Examples for Interpreting Malware Classifiers
TLDR
This paper proposes new adversarial attacks against real-world antivirus systems based on code randomization and binary manipulation, and uses this framework to perform the attacks on 1000 malware samples and test four commercial antivirus software and two open-source classifiers.
A Comparison of State-of-the-Art Techniques for Generating Adversarial Malware Binaries
TLDR
This work evaluated three recent adversarial malware generation techniques using binary malware samples drawn from a single, publicly available malware data set and compared their performances for evading a machine-learning based malware classifier called MalConv.
Poster: Feasibility of Malware Visualization Techniques against Adversarial Machine Learning Attacks
TLDR
The motivation of this work is to develop a practical method to quickly and efficiently detect malware in a way that is robust against adversarial ML attacks and does not require costly adversarial defense mechanisms.
MAB-Malware: A Reinforcement Learning Framework for Blackbox Generation of Adversarial Malware
TLDR
A black-box Reinforcement Learning (RL) based framework to generate AEs for PE malware classifiers and AV engines that regards the adversarial attack problem as a multi-armed bandit problem, which finds an optimal balance between exploiting the successful patterns and exploring more varieties.
Evading API Call Sequence Based Malware Classifiers
TLDR
A mimicry attack that transforms a malware binary so that it evades detection by API call sequence based malware classifiers is presented, and it is shown that adversarial retraining can make malware classifiers robust against such evasion attacks.
Evading Malware Classifiers via Monte Carlo Mutant Feature Discovery
TLDR
This experiment stages a grey box setup to analyze a scenario where the malware author does not know the target classifier algorithm, and does not have access to decisions made by the classifier, but knows the features used in training.
Stealing Malware Classifiers and AVs at Low False Positive Conditions
TLDR
A new neural network architecture for surrogate models is proposed that outperforms the existing state of the art on low FPR conditions and shows that surrogate models could generate adversarial samples that evade the targets but are less successful than the targets themselves.
An Efficient Approach For Malware Detection Using PE Header Specifications
TLDR
To identify malware programs, features extracted based on the header and PE file structure are used to train several machine learning models and the proposed method identifies malware programs with 95.59% accuracy using only nine features.
A Survey on Adversarial Attacks for Malware Analysis
TLDR
This survey aims to provide an encyclopedic introduction to adversarial attacks carried out against malware detection systems and to analyze the current research challenges in adversarial generation, concluding by pinpointing open future research directions.

References

Evading Machine Learning Malware Detection
TLDR
A more general framework for attacking static PE anti-malware engines based on reinforcement learning is investigated, which models more realistic attacker conditions, and subsequently provides much more modest evasion rates.
Adversarial Malware Binaries: Evading Deep Learning for Malware Detection in Executables
TLDR
This work proposes a gradient-based attack that is capable of evading a recently proposed deep network suited to this purpose by changing only a few specific bytes at the end of each malware sample, while preserving its intrusive functionality.
Evasion Attacks against Machine Learning at Test Time
TLDR
This work presents a simple but effective gradient-based approach that can be exploited to systematically assess the security of several, widely-used classification algorithms against evasion attacks.
Testing malware detectors
TLDR
A technique based on program obfuscation is presented, geared towards evaluating the resilience of malware detectors to various obfuscation transformations commonly used by hackers to disguise malware, and it is discovered that these scanners are very poor.
A survey on automated dynamic malware-analysis techniques and tools
TLDR
An overview of techniques based on dynamic analysis that are used to analyze potentially malicious samples and analysis programs that employ these techniques to assist human analysts in assessing whether a given sample deserves closer manual inspection due to its unknown malicious behavior is provided.
From Malware Signatures to Anti-Virus Assisted Attacks
TLDR
A novel method for automatically deriving signatures from anti-virus software is presented, and it is demonstrated how the extracted signatures can be used to attack sensitive data with the aid of the virus scanner itself.
Secure Kernel Machines against Evasion Attacks
TLDR
This work aims to develop secure kernel machines against evasion attacks that are not computationally more demanding than their non-secure counterparts, and discusses the security of nonlinear kernel machines, and shows that a proper choice of the kernel function is crucial.
Transcend: Detecting Concept Drift in Malware Classification Models
TLDR
This work proposes Transcend, a framework to identify aging classification models in vivo during deployment, much before the machine learning model’s performance starts to degrade, a significant departure from conventional approaches that retrain aging models retrospectively when poor performance is observed.
Prescience: Probabilistic Guidance on the Retraining Conundrum for Malware Detection
TLDR
Results obtained show that the proposed Venn-Abers predictors framework can identify when models tend to become obsolete, and can be used for building better retraining strategies in the presence of concept drift.
Deep neural network based malware detection using two dimensional binary program features
TLDR
A deep neural network based malware detection system that Invincea has developed is introduced, which achieves a usable detection rate at an extremely low false positive rate and scales to real world training example volumes on commodity hardware.