Charles Nicholas

Overview Feature selection is a basic step in the construction of a vector space or bag of words model [BB99]. In particular, when the processing task is to partition a given document collection into clusters of similar documents, a choice of good features along with good clustering algorithms is of paramount importance. This chapter suggests two techniques(More)
Distributed Intrusion Detection Systems (DIDS) offer an alternative to centralized intrusion detection. Current research indicates that a distributed intrusion detection paradigm may afford greater coverage, consequently providing an increase in security. In some cases, DIDS offer an alternative to centralized analysis, consequently improving scalability.(More)
We describe an implementation and experiments with a low-distortion randomized projection algorithm [LINI94] that can reduce the number of dimensions in the data by a considerable amount. The performance of the randomized algorithm is compared with that of a popular technique, Principal Component Analysis (PCA). The experiments show that the randomized(More)
Malware classification using machine learning algorithms is a difficult task, in part due to the absence of strong natural features in raw executable binary files. Byte n-grams previously have been used as features, but little work has been done to explain their performance or to understand what concepts are actually being learned. In contrast to other work(More)
In the past few years, the explosive growth of the Internet has allowed the construction of "virtual" systems containing hundreds or thousands of individual, relatively inexpensive computers. The agent paradigm is well-suited for this environment because it is based on distributed autonomous computation. Although the definition of a software agent v(More)