Malware Detection in PDF and Office Documents: A survey
@article{Singh2020MalwareDI,
  title={Malware Detection in PDF and Office Documents: A survey},
  author={Priyanshi Singh and Shashikala Tapaswi and Sanchit Gupta},
  journal={Information Security Journal: A Global Perspective},
  year={2020},
  volume={29},
  pages={134--153}
}
ABSTRACT In 2018, with the internet being treated as a utility on equal grounds as clean water or air, the underground malicious software economy is flourishing with an influx of growth and sophistication in the attacks. The use of malicious documents has increased rapidly in the last five years along with a spectrum of attacks. They offer flexibility in document structure with numerous features for attackers to exploit. Despite efforts from industry and research communities, this remains a…
9 Citations
Detection of macro-based attacks in office documents using Machine Learning
- Computer Science
- 2021
A broad classification of macro-based malicious document attacks is provided, along with a detailed description of the attack opportunities available using office documents, and a hybrid malware analysis technique is proposed that thoroughly analyzes the file for any macro attacks.
Analysis and Correlation of Visual Evidence in Campaigns of Malicious Office Documents
- Computer Science
- Digital Threats: Research and Practice
- 2022
This article proposes a mechanism to extract and analyse the different components of the files, including these visual elements, and to construct lightweight signatures based on them. It tests and validates the approach using an extensive database of malware samples, obtaining accuracy above 99% in the task of distinguishing between benign and malicious files.
HAPSSA: Holistic Approach to PDF malware detection using Signal and Statistical Analysis
- Computer Science
- MILCOM 2021 - 2021 IEEE Military Communications Conference (MILCOM)
- 2021
This paper derives a simple yet effective holistic approach to PDF malware detection that leverages signal and statistical analysis of malware binaries and shows that this holistic approach maintains a high detection rate of PDF malware and even detects new malicious files created by simple methods.
Toward Robust Classifiers for PDF Malware Detection
- Computer Science
- Computers, Materials & Continua
- 2021
This study proposes two models for PDF malware detection that can distinguish the different vulnerabilities exploited in malicious files and achieve excellent performance in terms of generalization ability, accuracy, and robustness.
Detecting malicious PDF using CNN
- Computer Science
- ArXiv
- 2020
This work proposes a novel algorithm that uses an ensemble of Convolutional Neural Networks on the byte level of the file, without any handcrafted features, to maintain a high detection rate of PDF malware; it even detects new malicious files still undetected by most antiviruses.
Invasive weed optimization with stacked long short term memory for PDF malware detection and classification
- Computer Science
- International journal of health sciences
- 2022
An Invasive Weed Optimization with Stacked Long Short Term Memory (IWO-S-LSTM) technique is presented for PDF malware detection and classification, and the experimental outcomes showed the promising performance of the IWO-S-LSTM technique over the other approaches.
An Improved Method of Detecting Macro Malware on an Imbalanced Dataset
- Computer Science
- IEEE Access
- 2020
This paper proposes an improved method of detecting macro malware on an imbalanced dataset that mitigates the class imbalance problem and could detect completely new malware regardless of the family type, and reveals that LSI is more robust than Doc2vec to the class imbalance problem.
Invoice #31415 attached: Automated analysis of malicious Microsoft Office documents
- Computer Science
- Comput. Secur.
- 2022
Design of a Fused Triple Convolutional Neural Network for Malware Detection: A Visual Classification Approach
- Computer Science
- Communications in Computer and Information Science
- 2021
References
SHOWING 1-10 OF 94 REFERENCES
Identifying Drawbacks in Malicious PDF Detectors
- Computer Science
- FNSS
- 2018
A survey of all recent malicious PDF detectors, followed by a comparative evaluation of the available tools, shows that concept drift is a major drawback of the detectors, even though many of them use machine learning approaches.
BISSAM: Automatic Vulnerability Identification of Office Documents
- Computer Science
- DIMVA
- 2012
This paper presents a novel approach to detect and identify the actual vulnerability exploited by a malicious document and extract the exploit code itself from a security patch.
PDF Scrutinizer: Detecting JavaScript-based attacks in PDF documents
- Computer Science
- 2012 Tenth Annual International Conference on Privacy, Security and Trust
- 2012
This paper uses static as well as dynamic techniques to detect malicious behavior in an emulated environment, and shows that PDF Scrutinizer reliably detects current malicious documents while keeping a low false-positive rate and reasonable runtime performance.
Detection of malicious PDF files and directions for enhancements: A state-of-the art survey
- Computer Science
- Comput. Secur.
- 2015
A survey on malware propagation, analysis, and detection
- Computer Science
- 2013
A detailed review has been conducted on the current situation of malware infection and the work done to improve anti-malware or malware detection systems and provides an up-to-date comparative reference for developers of malware detection systems.
Static detection of malicious JavaScript-bearing PDF documents
- Computer Science
- ACSAC '11
- 2011
This contribution presents a technique for detection of JavaScript-bearing malicious PDF documents based on static analysis of extracted JavaScript code that has proved to be effective against both known and unknown malware and suitable for large-scale batch processing.
A Survey on Malware and Malware Detection Systems
- Computer Science
- 2013
A detailed review has been conducted on the current situation of malware infection and the work done to improve anti-malware or malware detection systems and provides an up-to-date comparative reference for developers of malware detection systems.
Hidost: a static machine-learning-based detector of malicious files
- Computer Science
- EURASIP J. Inf. Secur.
- 2016
Hidost is introduced, the first static machine-learning-based malware detection system designed to operate on multiple file formats; it outperformed all antivirus engines deployed by the website VirusTotal, detecting the highest number of malicious PDF files, and ranked among the best on SWF malware.
SFEM: Structural feature extraction methodology for the detection of malicious office documents using machine learning methods
- Computer Science
- Expert Syst. Appl.
- 2016
A Pattern Recognition System for Malicious PDF Files Detection
- Computer Science
- MLDM
- 2012
An innovative technique, which combines a feature extractor module strongly related to the structure of PDF files and an effective classifier, is presented, which has proven to be more effective than other state-of-the-art research tools for malicious PDF detection.