1. AndroDFA: Android Malware Classification Based on Resource Consumption
- Author
Daniele Ucci, Leonardo Querzoni, Leonardo Aniello, Luca Massarelli, Roberto Baldoni, and Claudio Ciccotelli
- Subjects
Computer science, Mobile malware, Malware, Malware analysis, Machine learning, Android (operating system), procfs, Support vector machine, Detrended fluctuation analysis, Data mining, Software, Information systems, Information technology
- Abstract
The vast majority of today's mobile malware targets Android devices. An important task of malware analysis is the classification of malicious samples into known families. In this paper, we propose AndroDFA (DFA: detrended fluctuation analysis), an approach to Android malware family classification based on dynamic analysis of resource consumption metrics available from the proc file system. These metrics can be easily measured during sample execution. From each malware sample, we extract features through detrended fluctuation analysis (DFA) and Pearson's correlation; a support vector machine is then employed to classify the samples into families. We provide an experimental evaluation based on malware samples from two datasets, namely Drebin and AMD. With the Drebin dataset, we obtained a classification accuracy of 82%, comparable with state-of-the-art approaches such as DroidScribe. However, compared to DroidScribe, our approach is easier to reproduce because it is based only on publicly available tools, does not require any modification to the emulated environment or the Android OS, and, by design, can also be used on physical devices rather than exclusively on emulators. The last point is a key factor because modern mobile malware can detect an emulated environment and hide its malicious behavior. The experiments on the AMD dataset gave similar results, with an overall mean accuracy of 78%. Furthermore, we made the software we developed publicly available to ease the reproducibility of our results.
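The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' released software: the function names, the window sizes, and the layout of the feature vector (one DFA exponent per metric, followed by one Pearson coefficient per pair of metrics) are assumptions made for illustration, and the input is assumed to be a dictionary mapping a metric name to its sampled time series.

```python
import math


def linear_fit(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_exponent(series, window_sizes=(4, 8, 16, 32, 64)):
    """Estimate the DFA scaling exponent of one time series.

    window_sizes is an illustrative choice, not a value from the paper.
    """
    mean = sum(series) / len(series)
    # Integrated profile: cumulative sum of deviations from the mean.
    profile, total = [], 0.0
    for value in series:
        total += value - mean
        profile.append(total)

    log_n, log_f = [], []
    for n in window_sizes:
        n_windows = len(profile) // n
        if n_windows < 2:
            continue
        xs = list(range(n))
        sq_residuals = 0.0
        for w in range(n_windows):
            segment = profile[w * n:(w + 1) * n]
            # Remove the local linear trend inside each window.
            slope, intercept = linear_fit(xs, segment)
            sq_residuals += sum(
                (y - (slope * x + intercept)) ** 2 for x, y in zip(xs, segment)
            )
        fluctuation = math.sqrt(sq_residuals / (n_windows * n))
        if fluctuation > 0:
            log_n.append(math.log(n))
            log_f.append(math.log(fluctuation))
    if len(log_n) < 2:
        raise ValueError("series too short for the chosen window sizes")
    # The exponent is the slope of log F(n) against log n.
    exponent, _ = linear_fit(log_n, log_f)
    return exponent


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def feature_vector(metrics):
    """Build features from a dict {metric name: equally long time series}."""
    names = sorted(metrics)
    features = [dfa_exponent(metrics[name]) for name in names]
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            features.append(pearson(metrics[first], metrics[second]))
    return features
```

One such vector per executed sample could then be passed, together with the family label, to any support vector machine implementation (for example scikit-learn's `SVC`). The kernel and hyperparameters are not stated in the abstract and would have to be chosen separately.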
- Published
- 2020