Visualization of shared system call sequence relationships in large malware corpora

Joshua Saxe, David Mentis, Christopher Greamo. VizSec '12.
We present a novel system for automatically discovering and interactively visualizing shared system call sequence relationships within large malware datasets. [...] Then, based on the occurrence of these semantic sequences, we construct a Boolean vector representation of the malware sample corpus. Finally, we compute Jaccard indices pairwise over sample vectors to obtain a sample similarity matrix. Our graphical user interface links two visualizations within an interactive display.
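The last two steps of the abstract (Boolean vectors, then pairwise Jaccard indices) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sample names, the sequence identifiers, and the function names are hypothetical, the extraction of semantic sequences from traces is not shown, and returning 0.0 for two all-False vectors is an assumed convention.

```python
def boolean_vectors(corpus):
    """Map each sample to a Boolean vector over the corpus-wide sequence vocabulary.

    corpus: dict of sample name -> set of sequence identifiers (hypothetical input).
    """
    vocab = sorted(set().union(*corpus.values()))
    vectors = {
        name: [seq in seqs for seq in vocab]
        for name, seqs in corpus.items()
    }
    return vocab, vectors


def jaccard(u, v):
    """Jaccard index of two equal-length Boolean vectors: |u AND v| / |u OR v|."""
    both = sum(1 for x, y in zip(u, v) if x and y)
    either = sum(1 for x, y in zip(u, v) if x or y)
    # Assumed convention: two empty vectors have similarity 0.0.
    return both / either if either else 0.0


def similarity_matrix(vectors):
    """Pairwise Jaccard indices over all samples, in sorted sample-name order."""
    names = sorted(vectors)
    matrix = [[jaccard(vectors[a], vectors[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical corpus: each sample is described by the sequences observed in it.
corpus = {
    "sample_a": {"seq1", "seq2", "seq3"},
    "sample_b": {"seq2", "seq3", "seq4"},
    "sample_c": {"seq5"},
}
vocab, vectors = boolean_vectors(corpus)
names, matrix = similarity_matrix(vectors)
```

Here `sample_a` and `sample_b` share two of the four sequences that occur in either of them, so their Jaccard index is 2/4. The nested-list matrix is quadratic in the number of samples, which is fine for a sketch but would need a sparse or batched representation for a large corpus.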
Malware Similarity Identification Using Call Graph Based System Call Subsequence Features
A novel code-sharing analysis technique that can complement existing methods. It incorporates sequence information into the features it uses for similarity analysis but, unlike previously proposed longest common substring methods, runs in linear time.
SEEM: a scalable visualization for comparing multiple large sets of attributes for malware analysis
This paper develops the Similarity Evidence Explorer for Malware (SEEM), a scalable visualization tool for simultaneously comparing a large corpus of malware across multiple sets of attributes (such as the sets of printable strings and function calls).
Detecting malware samples with similar image sets
This paper gives a scalable and intuitive method for computing similarity measurements between malware samples based on the visual similarity of their sets of images, along with a visualization method that combines a force-directed graph layout with a set visualization technique to highlight visual similarity relationships in malware corpora.
A Survey of Visualization Systems for Malware Analysis
A systematic overview and categorization of malware visualization systems from the perspective of visual analytics is provided, which helps to reveal data types that are currently underrepresented, enabling new research opportunities in the visualization community.
Mining Web Technical Discussions to Identify Malware Capabilities
Preliminary results demonstrate the viability of a new research direction for malware capability identification: mining web technical documentation to identify malware capabilities automatically. The authors argue that these early findings show the promise of a web technical document based approach to automating malware capability identification.
Malware Analysis Using Visualized Image Matrices
A novel malware visual analysis method that includes both a visualization method for converting binary files into images and a method for calculating similarity between those images. It generates RGB-colored pixels on image matrices from the opcode sequences extracted from malware samples and calculates similarities between the image matrices.
Malware Visualization for Fine-Grained Classification
This paper takes a new approach to visualize malware as RGB-colored images and extract global features from the images to achieve fast and effective fine-grained classification of malware.
Literature Review in Visual Analytics for Malware Pattern Analysis
It is shown that the scope of malware analysis in combination with VA is still not very well explored, and many of the described approaches use only few interaction techniques and leave a lot of room for future research activities.
Malware visualization methods based on deep convolution neural networks
Two visualization methods for malware analysis based on n-gram features of byte sequences are proposed and their visualized results are learned by the deep convolution networks to extract image features used for classification by SVM (support vector machine).
  • 2019
Recently, there has been a massive increase in the number of malware types, which poses a severe threat to smart devices and to internet security. Thus, different techniques have been applied to detect,


Visual analysis of malware behavior using treemaps and thread graphs
We study techniques to visualize the behavior of malicious software (malware). Our aim is to help human analysts to quickly assess and classify the nature of a new malware sample. Our techniques are
Visual Reverse Engineering of Binary and Data Files
Design principles for file analysis are presented which support meaningful investigation when there is little or no knowledge of the underlying file format, but are flexible enough to allow integration of additional semantic information, when available.
Visualizing compiled executables for malware analysis
A method using dynamic analysis of program execution to visually represent the overall flow of a program to reduce the amount of time needed to extract key features of an executable, improving productivity.
Introduction to information retrieval
This textbook, written by three leading experts in the field, teaches web-era information retrieval, including web search and the related areas of text classification and text clustering, starting from basic concepts and taking a computer science perspective.
AV-Test Statistics Report
  • 2012
Microsoft TechNet
  • Microsoft, 1 6 2012. [Online]. Available: [Accessed 1 6 2012].
  • 2012
Stochastic Identification and Clustering of Malware with Dynamic Traces. Malware Technical Exchange Meeting
  • 2012
Introduction to Information Retrieval
  • R. Larson
  • Computer Science
    J. Assoc. Inf. Sci. Technol.
  • 2010
Microsoft TechNet Microsoft, 1 6 2012. [Online]. Available: us/sysinternals/bb896645