
Study on Malware Classification Based on N-Gram Static Analysis Technology

Authors :
ZHANG Guang-hua, GAO Tian-jiao, CHEN Zhen-guo, YU Nai-wen
Source :
Jisuanji kexue, Vol 49, Iss 8, Pp 336-343 (2022)
Publication Year :
2022
Publisher :
Editorial office of Computer Science, 2022.

Abstract

To address the low accuracy of malware classification, this paper proposes a malware classification method based on N-Gram static analysis. First, the N-Gram method is used to extract byte sequences of length 2 from the malware samples. Second, using the extracted features, KNN, logistic regression, random forest, and XGBoost are each used to train a machine-learning malware classification model. Third, the confusion matrix and the logarithmic loss function are used to evaluate the models. Finally, the models are trained and tested on the Kaggle malware data set. Experimental results show that the XGBoost and random forest models reach accuracy rates of 98.43% and 97.93%, with Log Loss values of 0.022240 and 0.026946, respectively. Compared with existing methods, the proposed method classifies malware more accurately and helps protect computer systems from malware attacks.
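
The feature extraction and evaluation steps named in the abstract can be sketched in a few lines of Python. This is only an illustration under assumptions, not the authors' implementation: the function names (byte_bigram_counts, bigram_feature_vector, multiclass_log_loss) are hypothetical, the abstract does not say whether raw counts or relative frequencies are used as features (relative frequencies are assumed here), and the log loss shown is the standard multi-class definition.

```python
import math
from collections import Counter


def byte_bigram_counts(data):
    """Count overlapping byte 2-grams (pairs of adjacent bytes) in `data`."""
    # Iterating over bytes yields ints, so each key is a tuple (b_i, b_{i+1}).
    return Counter(zip(data, data[1:]))


def bigram_feature_vector(data):
    """Map 2-gram counts to a fixed 256*256 relative-frequency vector.

    Assumption: normalising by the number of 2-grams; the abstract does
    not specify the exact feature scaling.
    """
    counts = byte_bigram_counts(data)
    total = max(len(data) - 1, 1)  # number of overlapping 2-grams
    vec = [0.0] * (256 * 256)
    for (first, second), count in counts.items():
        vec[first * 256 + second] = count / total
    return vec


def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Standard multi-class log loss: mean of -log p(true class).

    `y_true` holds integer class indices; `probs` holds one list of
    class probabilities per sample.
    """
    total = 0.0
    for label, row in zip(y_true, probs):
        p = min(max(row[label], eps), 1.0 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(y_true)
```

A vector produced by bigram_feature_vector could then be passed to any of the classifiers listed in the abstract; a lower value of multiclass_log_loss indicates better-calibrated class probabilities.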

Details

Language :
Chinese
ISSN :
1002137X
Volume :
49
Issue :
8
Database :
Directory of Open Access Journals
Journal :
Jisuanji kexue
Publication Type :
Academic Journal
Accession number :
edsdoj.2fbe8852f24e2a9f64ff2eb9304d61
Document Type :
article
Full Text :
https://doi.org/10.11896/jsjkx.210900203