We propose a method for categorizing files by type using efficient 1-gram analysis of their binary contents. Our aim is to accurately identify the true type of an arbitrary file through statistical analysis of its binary contents, without parsing. Consequently, we may determine the type of a file if its name does not announce its true …
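As a rough illustration of what 1-gram analysis of binary contents can look like, the sketch below counts how often each byte value occurs in a file and compares that distribution against per-type averages. The abstract does not specify the model or the distance measure, so the averaged "model" and the Manhattan distance used here are assumptions, and the names `byte_distribution`, `build_model`, and `classify` are hypothetical.

```python
from collections import Counter


def byte_distribution(data):
    """Return the normalized 1-gram (single-byte) frequency distribution."""
    counts = Counter(data)  # iterating over bytes yields ints in 0..255
    total = len(data) or 1  # avoid division by zero on empty input
    return [counts.get(value, 0) / total for value in range(256)]


def manhattan(p, q):
    """Manhattan (L1) distance between two distributions (assumed metric)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def build_model(samples):
    """Average the byte distributions of example files of one type."""
    dists = [byte_distribution(sample) for sample in samples]
    return [sum(column) / len(dists) for column in zip(*dists)]


def classify(data, models):
    """Return the name of the model closest to the file's byte distribution."""
    dist = byte_distribution(data)
    return min(models, key=lambda name: manhattan(dist, models[name]))


# Toy training data standing in for real example files of each type.
models = {
    "text": build_model([b"hello world, plain ascii text",
                         b"another line of prose"]),
    "binary": build_model([bytes(range(256)),
                           bytes(range(128, 256)) * 2]),
}

print(classify(b"some readable words here", models))
```

The classifier never inspects the file name or parses the format; it relies only on byte statistics, which is the property the abstract emphasizes. A real system would need many training files per type and likely a more robust distance measure.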
The Email Mining Toolkit (EMT) is a data mining system that computes behavior profiles or models of user email accounts. These models may be used for a multitude of tasks, including forensic analyses and detection tasks of value to law enforcement and intelligence agencies, as well as for other typical tasks such as virus and spam detection. To …
The Email Mining Toolkit (EMT) is a data mining system that computes behavior profiles or models of user email accounts. These models may be used for a variety of forensic analyses and detection tasks. In this paper we focus on the application of these models to detect the early onset of a viral propagation without "content-based" (or signature-based) …
By exploiting the object-oriented dynamic composability of modern document applications and formats, malcode hidden in otherwise inconspicuous documents can reach third-party applications that may harbor exploitable vulnerabilities that are otherwise unreachable by network-level service attacks. Such attacks can be very selective and difficult to detect …
The analysis of the vast storehouse of email content accumulated or produced by individual users has received relatively little attention other than for specific tasks such as spam and virus filtering. Current email analysis in standard client applications consists of keyword-based matching techniques for filtering and expert-driven manual exploration of …
We introduce the Email Mining Toolkit (EMT), a system that implements behavior-based methods to improve the security of email systems. Behavior models of email flows and email account usage may be used for a variety of detection tasks. Behavior-based models are quite different from the "content-based" models in common use today, such as virus scanners. We evaluate …