A Novel Machine Learning Approach for Android Malware Detection Based on the Co-Existence of Features

Authors :
Esraa Odat
Qussai M. Yaseen
Source :
IEEE Access, Vol 11, Pp 15471-15484 (2023)
Publication Year :
2023
Publisher :
IEEE, 2023.

Abstract

This paper proposes a machine learning model for Android malware detection based on the co-existence of static features. The model assumes that Android malware requests an abnormal set of co-existing permissions and APIs compared with those requested by benign applications. To test this assumption, the authors created new datasets of co-existing permissions and API calls at four combination levels: the second, third, fourth, and fifth. The co-existence datasets at each level were built for four feature types: permissions only, APIs only, permissions and APIs, and APIs and API frequencies. To extract the most relevant co-existing features, the frequent pattern growth (FP-growth) algorithm, an association rule mining technique, was used. The new datasets were extracted from Android APK samples in the Drebin, Malgenome and MalDroid2020 datasets. Several conventional machine learning algorithms were used to evaluate the model. The results show that the model can classify Android malware with high accuracy using machine learning algorithms and the co-existence of features, and that the achieved accuracy depends on the classifier and the type of co-existing features. The maximum accuracy, 98%, was achieved using the Random Forest algorithm with co-existing permission features at the second combination level. Furthermore, the results show that the proposed approach outperforms the state-of-the-art model. On the Malgenome dataset, the proposed approach achieved an accuracy of about 98%, while the state-of-the-art model achieved about 87%. On the Drebin dataset, the proposed approach achieved about 95%, while the state-of-the-art model achieved about 93%.
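
The idea of co-existence features at a given combination level can be illustrated with a minimal sketch. This is not the authors' implementation: it counts combinations by brute force rather than with FP-growth, and the function names (`frequent_coexistence`, `encode`), the support threshold, and the four toy apps are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_coexistence(apps, level, min_support):
    """Return feature combinations of size `level` that co-occur in at
    least `min_support` (a fraction) of the apps.

    Brute-force counting is used here as a stand-in for FP-growth; it
    finds the same frequent itemsets of that size but scales far worse.
    """
    counts = Counter()
    for features in apps:
        # Sort so each combination has one canonical tuple form.
        for combo in combinations(sorted(features), level):
            counts[combo] += 1
    threshold = min_support * len(apps)
    return sorted(c for c, n in counts.items() if n >= threshold)


def encode(apps, combos):
    """Build a binary matrix: 1 if every feature in the combination is
    requested by the app, else 0."""
    return [[1 if set(c) <= features else 0 for c in combos]
            for features in apps]


# Hypothetical toy data: each app is the set of permissions it requests.
apps = [
    {"SEND_SMS", "READ_CONTACTS", "INTERNET"},
    {"SEND_SMS", "READ_CONTACTS"},
    {"INTERNET", "CAMERA"},
    {"INTERNET", "SEND_SMS", "READ_CONTACTS"},
]

# Second combination level: pairs of permissions that co-exist.
pairs = frequent_coexistence(apps, level=2, min_support=0.5)
matrix = encode(apps, pairs)

for combo in pairs:
    print(combo)
for row in matrix:
    print(row)
```

Each row of `matrix` is a feature vector that could then be passed, together with a benign or malware label, to a conventional classifier such as Random Forest. Raising `level` to 3, 4, or 5 yields the higher combination levels described in the abstract.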

Details

Language :
English
ISSN :
2169-3536
Volume :
11
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.29f7debbd43b4736aa9cfd68010abe73
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2023.3244656