
Compiler and optimization level recognition using graph neural networks

Authors :
Bardin, Sébastien
Benoit, Tristan
Marion, Jean-Yves
CEA-Saclay (CEA)
Commissariat à l'énergie atomique et aux énergies alternatives (CEA)
Carbone (CARBONE)
Department of Formal Methods (LORIA - FM)
Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)
Université de Lorraine (UL)
Centre National de la Recherche Scientifique (CNRS)
This work is supported by a public grant overseen by the French National Research Agency (ANR) as part of the 'Investissements d’Avenir' French PIA project 'Lorraine Université d’Excellence', reference ANR-15-IDEX-04-LUE. Experiments presented in this paper were carried out using the Grid’5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations (see https://www.grid5000.fr).
GRID5000
IMPACT-DIGITRUST
ANR-15-IDEX-0004,LUE,Isite LUE(2015)
Source :
MLPA 2020 - Machine Learning for Program Analysis, Jan 2021, Yokohama / Virtual, Japan
Publication Year :
2021
Publisher :
HAL CCSD, 2021.

Abstract

We consider the problem of recovering the compilation toolchain used to generate a given bare binary code. We present a first attempt to devise a Graph Neural Network framework for this problem, in order to take into account the shallow semantics provided by the binary code's structured control flow graph (CFG). We introduce a Graph Neural Network, called Site Neural Network (SNN), dedicated to this problem. Feature extraction is simplified by discarding almost everything in a CFG except control-transfer instructions. Although the work is at an early stage, our experiments show that the method already recovers the compiler and the optimization level with very high accuracy. We believe these are promising results that may offer new, more robust leads for compilation toolchain identification.
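The feature simplification mentioned in the abstract can be pictured with a minimal sketch. It is an illustration under stated assumptions, not the SNN feature extraction from the paper: the CFG representation (a dict from block id to a pair of mnemonics and successor ids), the function name `simplify_cfg`, and the small, incomplete mnemonic set `CONTROL_TRANSFER` are all hypothetical.

```python
# Hypothetical illustration only; not the feature extraction used by the paper.
# Assumption: a CFG is a dict mapping a block id to (mnemonics, successor ids).
# Assumption: CONTROL_TRANSFER is a small, incomplete set of x86 mnemonics.
CONTROL_TRANSFER = {"jmp", "je", "jne", "jg", "jl", "call", "ret"}


def simplify_cfg(cfg):
    """Keep only control-transfer mnemonics in each block; edges are unchanged."""
    simplified = {}
    for block_id, (mnemonics, successors) in cfg.items():
        kept = [m for m in mnemonics if m in CONTROL_TRANSFER]
        simplified[block_id] = (kept, list(successors))
    return simplified


# Toy three-block CFG: block 0 branches to 1 or 2, block 1 falls into 2.
example_cfg = {
    0: (["push", "mov", "cmp", "je"], [1, 2]),
    1: (["add", "call", "jmp"], [2]),
    2: (["pop", "ret"], []),
}
simplified = simplify_cfg(example_cfg)
```

In this toy input, the graph structure survives intact while each block keeps only its branch, call, and return mnemonics, which is the kind of reduced view a graph model could then consume.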

Details

Language :
English
Database :
OpenAIRE
Journal :
MLPA 2020 - Machine Learning for Program Analysis, Jan 2021, Yokohama / Virtual, Japan
Accession number :
edsair.dedup.wf.001..287b3df6949f2f6dbfe78a42dc53e1f8