1. Malware detection in mobile environments based on Autoencoders and API-images
- Author
- Francesco Palmieri, Massimo Ficco, Gianni D'Angelo
- Subjects
Computer Networks and Communications; Computer science; Autoencoders; 02 engineering and technology; Machine learning; Malware; Theoretical Computer Science; Android; Artificial Intelligence; Dynamic analysis; 0202 electrical engineering, electronic engineering, information engineering; Android (operating system); Artificial neural network; Deep learning; 020206 networking & telecommunications; Autoencoder; Hardware and Architecture; 020201 artificial intelligence & image processing; Classifier (UML); Software
- Abstract
Due to their open nature and popularity, Android-based devices are one of the main targets for malware attacks that may adversely affect the privacy of their users. Considering the huge Android market share, it is necessary to build effective tools able to reliably detect zero-day malware on these platforms. Several static and dynamic analysis methods based on Neural Networks and Deep Learning have therefore been proposed in the literature. Although machine learning can be considered the most promising approach for classifying applications as malware or legitimate, its success strongly depends on choosing the right features for building the detection model. This is not an easy task, and it requires a systematic solution. Accordingly, this work represents the sequences of API calls invoked by apps during their execution as sparse matrices that look like images (API-images), which can be used as fingerprints of the apps' behavior over time. Autoencoders are then used to autonomously extract the most representative and discriminating features from these matrices. Once provided to an artificial neural network-based classifier, these features have proven effective in detecting malware, even when the network is trained on a reduced number of samples. Experimental results show that the resulting framework is able to outperform more complex and sophisticated machine learning approaches in malware classification.
- Published
- 2020
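
The abstract describes API-images only in general terms, so the following is a minimal sketch of one plausible encoding, not the authors' actual construction. It assumes that rows stand for API calls, columns for positions in the execution trace, and that a cell is set to 1 when that API is invoked at that step. The function name `build_api_image`, the API vocabulary, and the sample trace are all hypothetical. The autoencoder and classifier stages are not shown.

```python
from typing import Dict, List, Sequence


def build_api_image(
    calls: Sequence[str], api_index: Dict[str, int], width: int
) -> List[List[int]]:
    """Encode an API-call trace as a sparse binary matrix.

    Assumed layout (not taken from the paper): one row per API in the
    vocabulary, one column per time step, 1 where the API was called.
    """
    image = [[0] * width for _ in range(len(api_index))]
    # Truncate traces longer than the image width.
    for step, name in enumerate(calls[:width]):
        row = api_index.get(name)
        if row is not None:  # ignore calls outside the vocabulary
            image[row][step] = 1
    return image


# Hypothetical vocabulary and trace, for illustration only.
api_index = {"open": 0, "read": 1, "sendTextMessage": 2, "getDeviceId": 3}
trace = ["getDeviceId", "open", "read", "sendTextMessage", "read"]

image = build_api_image(trace, api_index, width=6)

# Render the matrix as a small "image": '#' marks a call, '.' an empty cell.
for row in image:
    print("".join("#" if cell else "." for cell in row))
```

Because each column holds at most one non-zero entry, the matrix stays sparse however large the API vocabulary grows, which is consistent with the "sparse matrices" wording in the abstract. In a pipeline like the one described, such matrices would be flattened or fed directly to an autoencoder for feature extraction.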