Fast reused code tracing method based on simhash and inverted index

Authors :
Yan-chen QIAO
Xiao-chun YUN
Yu-peng TUO
Yong-zheng ZHANG
Source :
Tongxin xuebao, Vol 37, Pp 104-113 (2016)
Publication Year :
2016
Publisher :
Editorial Department of Journal on Communications, 2016.

Abstract

A novel method for tracing reused code quickly and accurately is proposed. Based on simhash and an inverted index, the method quickly traces similar functions in massive code bases. First, a code database with a three-level inverted index structure is constructed. For a function to be traced, similar code blocks are found quickly from the simhash values of the code blocks in that function. Potentially similar functions are then retrieved quickly through the inverted index. Finally, truly similar functions are identified by comparing the jump relationships among the similar code blocks. Malware samples containing those similar functions can then be traced as well. Experimental results show that, using the code database, the method quickly identifies compiler-inserted functions and reused functions while maintaining high accuracy and recall.
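
The lookup pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the three index levels, the block features, or the similarity threshold. The sketch assumes the levels are code block, function, and sample; that a code block is a list of instruction mnemonics; that fingerprints are 64 bits; and that two blocks are "similar" when their fingerprints differ by at most 3 bits. The names `CodeIndex`, `add_function`, and `candidates` are hypothetical. The final step of the method, comparing jump relationships between matched blocks, is omitted.

```python
import hashlib
from collections import defaultdict

BITS = 64  # assumed fingerprint width


def simhash(tokens):
    """64-bit simhash of a token list, every token weighted equally."""
    counts = [0] * BITS
    for tok in tokens:
        h = int.from_bytes(hashlib.md5(tok.encode()).digest()[:8], "big")
        for i in range(BITS):
            counts[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(BITS) if counts[i] > 0)


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def bands(fp, n=4):
    """Split a fingerprint into n bands. Two fingerprints that differ in
    fewer than n bits must agree on at least one band (pigeonhole)."""
    width = BITS // n
    mask = (1 << width) - 1
    return [(i, (fp >> (i * width)) & mask) for i in range(n)]


class CodeIndex:
    """Hypothetical three-level inverted index: block -> function -> sample."""

    def __init__(self):
        self.band_to_blocks = defaultdict(set)   # (band no, value) -> fingerprints
        self.block_to_funcs = defaultdict(set)   # fingerprint -> function names
        self.func_to_samples = defaultdict(set)  # function name -> sample ids

    def add_function(self, sample, func, blocks):
        self.func_to_samples[func].add(sample)
        for tokens in blocks:
            fp = simhash(tokens)
            self.block_to_funcs[fp].add(func)
            for key in bands(fp):
                self.band_to_blocks[key].add(fp)

    def candidates(self, blocks, max_dist=3):
        """Map each indexed function to the number of query blocks that
        have a near-duplicate block in it."""
        hits = defaultdict(int)
        for tokens in blocks:
            fp = simhash(tokens)
            near = {c for key in bands(fp)
                    for c in self.band_to_blocks.get(key, ())
                    if hamming(fp, c) <= max_dist}
            funcs = set()
            for c in near:
                funcs |= self.block_to_funcs[c]
            for f in funcs:
                hits[f] += 1
        return dict(hits)
```

In this sketch, functions returned by `candidates` with a high hit count would be the "potentially similar functions"; `func_to_samples` then gives the samples that contain them.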

Details

Language :
Chinese
ISSN :
1000436X
Volume :
37
Database :
Directory of Open Access Journals
Journal :
Tongxin xuebao
Publication Type :
Academic Journal
Accession number :
edsdoj.7c58aba5ead7432997a82eee93414e1c
Document Type :
article
Full Text :
https://doi.org/10.11959/j.issn.1000-436x.2016225