• Corpus ID: 8212980

Selecting Features for Intrusion Detection: A Feature Relevance Analysis on KDD 99

@inproceedings{Kayacik2005SelectingFF,
  title={Selecting Features for Intrusion Detection: A Feature Relevance Analysis on KDD 99},
  author={Hilmi G{\"u}nes Kayacik and Ayse Nur Zincir-Heywood and Malcolm I. Heywood},
  booktitle={PST},
  year={2005}
}
The KDD 99 intrusion detection datasets, which are based on the DARPA 98 dataset, provide labeled data for researchers working in the field of intrusion detection and are the only labeled datasets publicly available. Numerous researchers have employed the datasets from the KDD 99 intrusion detection competition to study the use of machine learning for intrusion detection and have reported detection rates up to 91% with false positive rates less than 1%. To substantiate the performance of machine learning based…

Intrusion Detection System Classification Using Different Machine Learning Algorithms on KDD-99 and NSL-KDD Datasets - A Review Paper
TLDR
An experiment is carried out to evaluate the performance of different machine learning algorithms on the KDD-99 Cup and NSL-KDD datasets, and the results show which approach performs better in terms of accuracy and detection rate at a reasonable false alarm rate.
Relevance Features Selection for Intrusion Detection
TLDR
Rough set degree of dependency and dependency ratio of each class were employed to determine the most discriminating features for each class and empirical results show that seven features were not relevant in the detection of any class.
Analysis of KDD '99 Intrusion Detection Dataset for Selection of Relevance Features
TLDR
Rough set degree of dependency and dependency ratio of each class were employed to determine the most discriminating features for each class and empirical results show that seven features were not relevant in the detection of any class.
Feature Grouping for Intrusion Detection Based on Mutual Information
TLDR
A feature grouping method for selecting features for intrusion detection, based on mutual information theory, is tested against the KDD CUP 99 dataset, and the results show that the selected features yield better classification performance.
Intrusion Detection System on KDDCup 99 Dataset : A Survey
TLDR
A survey of the techniques implemented for the discovery and categorization of intrusions on the KDDCup 99 dataset is presented, so that, by identifying their various issues, a new and efficient technique can be implemented that classifies and detects intrusions in the KDDCup 99 dataset.
Identifying important characteristics in the KDD99 intrusion detection dataset by feature selection using a hybrid approach
TLDR
This paper proposes a technique for selecting relevant features out of KDD99 using a hybrid approach toward an optimal subset of features that efficiently identifies which sort of attack each register in the dataset refers to.
Discriminant Analysis based Feature Selection in KDD Intrusion Dataset
TLDR
Important features of the KDD Cup '99 attack dataset are obtained using the discriminant analysis method and used for classification of attacks, and the results show that classification is done with a minimum error rate using the reduced feature set.
An Effective Feature Selection Approach for Network Intrusion Detection
  • Fengli Zhang, Dan Wang
  • Computer Science
    2013 IEEE Eighth International Conference on Networking, Architecture and Storage
  • 2013
TLDR
An effective feature selection approach based on a Bayesian Network classifier is proposed, and its performance is evaluated on the NSL-KDD intrusion detection benchmark dataset and compared with other commonly used feature selection methods.
Effective Value of Decision Tree with KDD 99 Intrusion Detection Datasets for Intrusion Detection System
TLDR
The effective value of the decision tree as the data mining method for the IDSs, and the DARPA Set as the learning data set for the decision trees are evaluated.
An Approach for Automatic Selection of Relevance Features in Intrusion Detection Systems
TLDR
This paper proposes a probability distribution based approach that extracts appropriate information from the intrusion data and supplies that information to the RST implementation so that the relevance features can be selected automatically.

References

On the capability of an SOM based intrusion detection system
TLDR
An approach to network intrusion detection is investigated, based purely on a hierarchy of Self-Organizing Feature Maps, which is capable of detection (false positive) rates of 89% and is at least as good as the alternative data-mining approaches that require all 41 features.
Why machine learning algorithms fail in misuse detection on KDD intrusion detection data set
TLDR
Analysis results clearly suggest that no pattern classification or machine learning algorithm can be trained successfully with the KDD data set to perform misuse detection for user-to-root or remote-to-local attack categories.
Analysis of Three Intrusion Detection System Benchmark Datasets Using Machine Learning Algorithms
TLDR
The main objective of the analysis is to determine the differences between synthetic and real-world traffic; however, the analysis methodology detailed in this paper can be employed for general network analysis purposes.
Testing Intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory
TLDR
The purpose of this article is to attempt to identify the shortcomings of the Lincoln Lab effort in the hope that future efforts of this kind will be placed on a sounder footing.
KDD-99 classifier learning contest LLSoft's results overview
TLDR
The Kernel Miner's approach and method used for solving the contest task is described and the received results are analyzed and explained.
Bro: a system for detecting network intruders in real-time
  • V. Paxson
  • Computer Science
    Comput. Networks
  • 1998
Data Mining: Concepts and Techniques
TLDR
This book presents dozens of algorithms and implementation examples, all in pseudo-code and suitable for use in real-world, large-scale data mining projects, and provides a comprehensive, practical look at the concepts and techniques you need to get the most out of real business data.
Winning the KDD99 classification cup: bagged boosting
TLDR
The standard sampling with replacement methodology of bagging was modified to put a specific focus on the smaller but expensive-if-predicted-wrongly classes.
Synthesizing Statistical Knowledge from Incomplete Mixed-Mode Data
  • A. Wong, D. Chiu
  • Computer Science
    IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 1987
TLDR
The proposed method adopts an event-covering approach which covers a subset of statistically relevant outcomes in the outcome space of variable-pairs and can acquire statistical knowledge from incomplete mixed-mode data.
Task Description
  • Thomas Lindner
  • Computer Science
    Formal Development of Reactive Systems
  • 1995