Dynamic Android Malware Classification Using Graph-Based Representations

Abstract

Malware classification for the Android ecosystem can be performed using a range of techniques. One major technique that has been gaining ground recently is dynamic analysis based on system call invocations recorded during the execution of Android applications. Dynamic analysis has traditionally been based on converting system calls into flat feature vectors and feeding the vectors into machine learning algorithms for classification. In this paper, we implement three traditional feature-vector-based representations for Android system calls. For each feature vector representation, we also propose a novel graph-based representation. We then use graph kernels to compute pairwise similarities and feed these similarity measures into a Support Vector Machine (SVM) for classification. To speed up the graph kernel computation, we compress the graphs using the Compressed Row Storage format and apply OpenMP to parallelize the computation. Experiments show that the graph-based representations improve classification accuracy over the corresponding feature-vector-based representations built from the same input. Finally, we show that different representations can be combined to further improve classification accuracy.
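
The pipeline summarized in the abstract (system call trace, graph, Compressed Row Storage, graph kernel, similarity matrix) can be illustrated with a minimal sketch. The abstract does not specify the paper's graph representations or kernels, so everything concrete here is an assumption made for illustration: the graph is a transition-count graph over consecutive system calls, the kernel is a simple shared-edge dot product with cosine normalization, and the names `VOCAB`, `build_csr`, `edge_kernel`, and `gram_matrix` are hypothetical. The sketch runs serially; the paper parallelizes its kernel computation with OpenMP.

```python
from math import sqrt

# Hypothetical system call vocabulary and traces (illustrative only).
VOCAB = ["open", "read", "write", "close"]
trace_a = ["open", "read", "read", "write", "close"]
trace_b = ["open", "read", "write", "close"]
trace_c = ["open", "write", "write", "close"]


def build_csr(trace, vocab):
    """Build a transition-count graph from a trace, stored in CSR form.

    Returns (values, col_idx, row_ptr): edge weights, their destination
    node indices, and the offset at which each source node's edges start.
    """
    index = {name: i for i, name in enumerate(vocab)}
    counts = {}
    for src, dst in zip(trace, trace[1:]):
        key = (index[src], index[dst])
        counts[key] = counts.get(key, 0) + 1

    values, col_idx, row_ptr = [], [], [0]
    for row in range(len(vocab)):
        # Keep columns sorted within each row so rows can be merged later.
        for col in sorted(c for (r, c) in counts if r == row):
            values.append(counts[(row, col)])
            col_idx.append(col)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def edge_kernel(g1, g2):
    """Sum of weight products over edges that appear in both graphs."""
    v1, c1, p1 = g1
    v2, c2, p2 = g2
    total = 0
    for row in range(len(p1) - 1):
        i, j = p1[row], p2[row]
        # Two-pointer merge over the sorted column indices of this row.
        while i < p1[row + 1] and j < p2[row + 1]:
            if c1[i] == c2[j]:
                total += v1[i] * v2[j]
                i += 1
                j += 1
            elif c1[i] < c2[j]:
                i += 1
            else:
                j += 1
    return total


def gram_matrix(graphs):
    """Cosine-normalized pairwise similarity matrix for a list of graphs."""
    n = len(graphs)
    raw = [[edge_kernel(graphs[a], graphs[b]) for b in range(n)]
           for a in range(n)]
    gram = [[0.0] * n for _ in range(n)]
    for a in range(n):
        for b in range(n):
            denom = sqrt(raw[a][a] * raw[b][b])
            # Graphs without edges get similarity 0 to avoid dividing by zero.
            gram[a][b] = raw[a][b] / denom if denom else 0.0
    return gram


graphs = [build_csr(t, VOCAB) for t in (trace_a, trace_b, trace_c)]
similarities = gram_matrix(graphs)
```

Each entry of the similarity matrix depends on only one pair of graphs, so the pairs can be computed independently, which is the property that makes this kind of computation amenable to parallelization. A matrix of this shape could then be passed to an SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.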

DOI: 10.1109/CSCloud.2016.27

Cite this paper

@article{Xu2016DynamicAM, title={Dynamic Android Malware Classification Using Graph-Based Representations}, author={Lifan Xu and Dong Ping Zhang and Marco A. Alvarez and Jose Andre Morales and Xudong Ma and John Cavazos}, journal={2016 IEEE 3rd International Conference on Cyber Security and Cloud Computing (CSCloud)}, year={2016}, pages={220-231} }