A survey of data mining techniques for malware detection using file features

@inproceedings{Siddiqui2008ASO,
  title={A survey of data mining techniques for malware detection using file features},
  author={Muazzam Ahmed Siddiqui and Morgan C. Wang and Joohan Lee},
  booktitle={ACM-SE 46},
  year={2008}
}
This paper presents a survey of data mining techniques for malware detection using file features. The techniques are categorized according to a three-tier hierarchy of file features, analysis type and detection type. File features are the features extracted from binary programs, the analysis type is either static or dynamic, and the detection type is borrowed from intrusion detection as either misuse or anomaly detection. It provides the reader with the major advancement in the malware…
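The three-tier categorization described in the abstract can be pictured as a small data structure. The following is a minimal sketch, not the survey's own method: the class names, the `categorize` helper, the technique labels and the example feature names are all hypothetical placeholders chosen for illustration, and the survey's actual assignments are not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class AnalysisType(Enum):
    """Second tier: how the program is examined."""
    STATIC = "static"
    DYNAMIC = "dynamic"


class DetectionType(Enum):
    """Third tier: borrowed from intrusion detection."""
    MISUSE = "misuse"
    ANOMALY = "anomaly"


@dataclass(frozen=True)
class Technique:
    name: str            # hypothetical label, not a technique from the survey
    file_feature: str    # first tier: feature extracted from the binary
    analysis: AnalysisType
    detection: DetectionType


def categorize(techniques):
    """Group technique names by (file feature, analysis type, detection type)."""
    tree = defaultdict(list)
    for t in techniques:
        tree[(t.file_feature, t.analysis, t.detection)].append(t.name)
    return dict(tree)


# Illustrative entries only; the feature names are assumptions.
examples = [
    Technique("technique-A", "byte n-grams", AnalysisType.STATIC, DetectionType.MISUSE),
    Technique("technique-B", "system calls", AnalysisType.DYNAMIC, DetectionType.ANOMALY),
    Technique("technique-C", "byte n-grams", AnalysisType.STATIC, DetectionType.MISUSE),
]
hierarchy = categorize(examples)
```

Keying on the full triple keeps each tier explicit, so techniques that share a file feature but differ in analysis or detection type land in separate groups.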

A state-of-the-art survey of malware detection approaches using data mining techniques
TLDR
A systematic and detailed survey of malware detection mechanisms using data mining techniques, which classifies malware detection approaches into two main categories: signature-based and behavior-based detection.
Various Data Mining Techniques to Detect the Android Malware Applications: A Case Study
  • Rincy Raphael
  • Computer Science
    International Journal of New Technology and Research
  • 2019
TLDR
A survey of various data mining techniques used to analyse and detect Android malware applications is conducted, analysing the classification algorithm used, dataset size and accuracy of each system.
Survey on Representation Techniques for Malware Detection System
TLDR
This review paper provides a detailed discussion and review of various types of malware, malware detection techniques, research on them, malware analysis methods, and different dynamic programming-based tools that could be used to represent the malware samples.
On the Comparison of Malware Detection Methods Using Data Mining with Two Feature Sets
TLDR
From the comparison experiments, it is found that the approach that considers the instruction set feature performs better and the test with the application set can give up to 100% correctness using the instructions.
Comparative and Analysis Study for Malicious Executable by Using Various Classification Algorithms
TLDR
This research presents a data mining classification procedure that applies machine learning algorithms to detect malicious executable files, and investigates classification with algorithms such as Support Vector Machine, Random Forest, k-Nearest Neighbors (KNN), and the Hoeffding Tree.
A Survey on Mining Program-Graph Features for Malware Analysis
TLDR
This paper presents a state-of-the-art survey on mining program-graph features for malware detection and outlines the challenges that malware detection based on mining program-graph features faces for successful deployment, along with opportunities that can be explored in the future.
Malware detection based on hybrid signature behavior application programming interface call graph
TLDR
A new malware detection framework is proposed that combines signature-based and behaviour-based detection using an API call graph system, and aims to improve accuracy and scan time for malware detection.
A Survey on Malware Detection Using Data Mining Techniques
TLDR
There is an urgent need to develop intelligent methods for effective and efficient malware detection from the real and large daily sample collection and a comprehensive investigation on both the feature extraction and the classification/clustering techniques is provided.
Enhancing Malware Detection Accuracy through Graph Based Model
TLDR
This paper suggests a novel malware detection model based on a graph model that captures system calls during the execution of a suspected executable; it has a better detection accuracy rate and solves the scalability problem when compared to existing methods.
Heuristic malware detection mechanism based on executable files static analysis
TLDR
The hypothesis about the possibility of constructing a heuristic malware analyzer on the basis of features distinguished during the static analysis of executable files is considered; however, the approach based on the decision tree composition makes it possible to obtain a significantly lower false negative rate probability with the specified initial data and classifier parameter values relating to neural networks.

References

SHOWING 1-10 OF 24 REFERENCES
Data mining methods for malware detection using instruction sequences
TLDR
A novel idea of automatically identifying critical instruction sequences that can distinguish between malicious and clean programs using data mining techniques is presented; the problem is formulated as binary classification, and logistic regression, neural network and decision tree models are built.
Data mining methods for malware detection
TLDR
This research investigates the use of data mining methods for malware (malicious program) detection and proposes a framework as an alternative to the traditional signature detection methods, using a vector space model to represent the programs in the collection.
Detection of New Malicious Code Using N-grams Signatures
TLDR
This work employs n-gram analysis to automatically generate signatures from malicious and benign software collections, capable of classifying unseen benign and malicious code.
IMDS: intelligent malware detection system
TLDR
Promising experimental results demonstrate that the accuracy and efficiency of the IMDS system outperform popular anti-virus software such as Norton AntiVirus and McAfee VirusScan, as well as previous data mining based detection systems which employed Naive Bayes, Support Vector Machine and Decision Tree techniques.
Data mining methods for detection of new malicious executables
TLDR
This work presents a data mining framework that detects new, previously unseen malicious executables accurately and automatically, and more than doubles the current detection rates for new malicious executables.
A Feature Selection and Evaluation Scheme for Computer Virus Detection
TLDR
This paper presents a data mining approach that conducts an exhaustive feature search on a set of computer viruses and strives to obviate over-fitting, and evaluates the predictive power of a classifier by taking into account dependence relationships that exist between viruses.
N-gram-based detection of new malicious code
TLDR
This work explores the idea of automatically detecting new malicious code using a collected dataset of benign and malicious code, and obtains an accuracy of 100% on the training data and 98% in 3-fold cross-validation.
A scalable multi-level feature extraction technique to detect malicious executables
TLDR
This work proposes a novel combination of three different kinds of features at different levels of abstraction: binary n-grams, assembly instruction sequences, and Dynamic Link Library (DLL) function calls, extracted from binary executables, disassembled executables, and executable headers, respectively.
Static analyzer of vicious executables (SAVE)
TLDR
This paper presents a robust signature-based malware detection technique, with emphasis on detecting obfuscated malware and mutated (or metamorphic) malware.
A toolkit for detecting and analyzing malicious software
TLDR
This work presents the investigation into structural feature analysis, the development of these ideas into the PEAT prototype, and results that illustrate PEAT's practical effectiveness.