Data mining methods for detection of new malicious executables

@article{Schultz2001DataMM,
  title={Data mining methods for detection of new malicious executables},
  author={Matthew G. Schultz and Eleazar Eskin and Erez Zadok and S. Stolfo},
  journal={Proceedings 2001 IEEE Symposium on Security and Privacy. S\&P 2001},
  year={2001},
  pages={38--49}
}
  • M. Schultz, E. Eskin, E. Zadok, S. Stolfo
  • Published 14 May 2001
  • Computer Science
  • Proceedings 2001 IEEE Symposium on Security and Privacy. S&P 2001
A serious security threat today is malicious executables, especially new, unseen malicious executables often arriving as email attachments. [...] The data mining framework automatically found patterns in our data set and used these patterns to detect a set of new malicious binaries. Compared with a traditional signature-based method, our method more than doubles the current detection rates for new malicious executables.

Security Applications for Malicious Code Detection Using Data Mining
Data mining is the process of posing queries and extracting patterns, often previously unknown, from large quantities of data using pattern matching or other reasoning techniques. Data mining has many
Data mining methods for malware detection using instruction sequences
TLDR
A novel idea is presented for automatically identifying critical instruction sequences that can distinguish malicious from clean programs using data mining techniques; the task is formulated as a binary classification problem, and logistic regression, neural network, and decision tree models are built.
Unknown Malicious Executable Defection
  • Yingxu Lai
  • Computer Science
    2008 Eighth International Conference on Intelligent Systems Design and Applications
  • 2008
TLDR
To improve the performance of the Bayesian classifier, a novel algorithm called half increment naive Bayes (HIB) is presented, and it is shown that the classifier yields high detection rates and works at a high learning speed.
Unknown Malicious Executables Detection Based on Immune Principles
TLDR
An immune-based approach for detection of unknown malicious executables, referred to as MEDMI, is proposed in this paper; it uses benign executables as the training set for building the profile of the system and then generates detectors to detect malicious executable instances.
New Approach for Detecting Unknown Malicious Executables
TLDR
The preliminary results are promising and justify the use of system calls sequences for the purpose of detection of new malicious executables.
A Feature Selection for Malicious Detection
  • Yingxu Lai
  • Computer Science
    2008 Ninth ACIS International Conference on Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing
  • 2008
TLDR
It is shown that the proposed method to extract features which are most representative of viral properties, based on strings, achieves high detection rates and can be expected to perform as well in real-world conditions.
Dynamic Detection of Unknown Malicious Executables Base on API Interception
  • Fei Chen, Yan Fu
  • Computer Science
    2009 First International Workshop on Database Technology and Applications
  • 2009
TLDR
This approach extracts signatures of malicious executables' behaviors using an API (Application Program Interface) interception technique, which makes the detection of unknown malicious executables possible.
Unknown Malicious Executables Detection Based on Run-Time Behavior
TLDR
This paper proposes a method for detecting malicious executables using a 35-dimension feature vector; the results suggest that the method is efficient in detecting previously unknown malicious executables that have more than two behavior features captured.
A Method for Detecting Unknown Malicious Executables
TLDR
The preliminary results are promising and justify the use of system calls sequences for the purpose of detection of new malicious executables.
...

References

A data mining framework for building intrusion detection models
  • Wenke Lee, S. Stolfo, K. Mok
  • Computer Science
    Proceedings of the 1999 IEEE Symposium on Security and Privacy (Cat. No.99CB36344)
  • 1999
TLDR
A data mining framework for adaptively building Intrusion Detection (ID) models is described. The central idea is to utilize auditing programs to extract an extensive set of features that describe each network connection or host session, and to apply data mining programs to learn rules that accurately capture the behavior of intrusions and normal activities.
Learning Patterns from Unix Process Execution Traces for Intrusion Detection
TLDR
Preliminary experiments extend the work pioneered by Forrest on learning the (normal and abnormal) patterns of Unix processes, which can be used to identify misuses of and intrusions in Unix systems. The results indicate that machine learning can play an important role by generalizing stored sequence information to perhaps provide broader intrusion detection services.
Automatically Generated Win32 Heuristic Virus Detection
TLDR
This work automatically constructs multiple neural network classifiers that can detect unknown Win32 viruses, following a technique described in previous work on boot virus heuristics, and combines the individual classifier outputs using a voting procedure.
Automated assistance for detecting malicious code
TLDR
The MCT is a semi-automated tool that is capable of detecting many types of malicious code, such as viruses, Trojan horses, and time/logic bombs and allows security analysts to check a program before installation, thereby avoiding any damage a malicious program might inflict.
MCF: a malicious code filter
Neural networks for computer virus recognition
TLDR
The article discusses the methods for handling several challenges in taking the neural network from a research idea to a commercial product, including designing an appropriate input representation scheme; dealing with the scarcity of available training data; and making the software conform to strict constraints on memory and speed of computation needed to run on PCs.
Anatomy of a Commercial-Grade Immune System
TLDR
The first commercial-grade immune system that can find, analyze and cure previously unknown viruses faster than the viruses themselves can spread is built, and end-to-end security of the system allows the safe submission of virus samples and ensures authentication of new virus definitions.
The internet worm program: an analysis
TLDR
The paper contains a review of the security flaws exploited by the worm program, and gives some recommendations on how to eliminate or mitigate their future use.
Open Problems in Computer Virus Research
TLDR
This paper examines several open research problems in the area of protection from computer viruses, and suggests possible approaches to deal with these problems.
Learning to Classify Text from Labeled and Unlabeled Documents
TLDR
It is shown that the accuracy of text classifiers trained with a small number of labeled documents can be improved by augmenting this small training set with a large pool of unlabeled documents, and an algorithm is introduced based on the combination of Expectation-Maximization with a naive Bayes classifier.
...