Combining genetic algorithm and compressed sensing for features and operators selection in symbolic regression

Authors :
Mazheika, Aliaksei
Levchenko, Sergey V.
Ghiringhelli, Luca M.
Publication Year :
2024

Abstract

Symbolic-inference methods have recently found broad application in materials science. In particular, the Sure-Independence Screening and Sparsifying Operator (SISSO) performs symbolic regression and classification by adopting compressed sensing to select an optimized subset of features and mathematical operators from a given set of candidates. However, SISSO becomes computationally impractical when the set of candidate features and operators exceeds a few tens. In the present work, we combine SISSO with a genetic algorithm (GA) for the global search of the optimal subset of features and operators. We demonstrate that GA-SISSO efficiently finds more accurate predictive models than the original SISSO because it can access a larger input feature and operator space. GA-SISSO was applied to the search for a model predicting carbon-dioxide adsorption energies on semiconductor oxides. The model obtained with GA-SISSO has much higher accuracy than models previously discussed in the literature (based solely on the O 2p-band center). The analysis of feature importance shows that, besides the O 2p-band center, the electrostatic potential above adsorption sites and the surface formation energies also make important contributions.
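
The general idea of a genetic algorithm searching over subsets of candidate features and operators can be illustrated with a minimal sketch. It is not the authors' implementation: the function `ga_select`, its parameters, and the toy objective `toy_fitness` are all hypothetical names introduced here, and the toy objective merely stands in for what would presumably be an expensive SISSO run returning a model error for the selected subset.

```python
import random

def ga_select(n_candidates, fitness, pop_size=30, generations=40,
              mutation_rate=0.05, seed=0):
    """Search binary masks over candidates; a lower fitness value is better.

    Hypothetical sketch: in a GA-SISSO-like setting, `fitness(mask)` would
    evaluate a model built from the candidates switched on in `mask`.
    """
    rng = random.Random(seed)
    # Each individual is a tuple of booleans: True means "candidate included".
    pop = [tuple(rng.random() < 0.5 for _ in range(n_candidates))
           for _ in range(pop_size)]
    best = min(pop, key=fitness)

    def tournament():
        # Pick two distinct individuals and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) <= fitness(b) else b

    for _ in range(generations):
        children = [best]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_candidates)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = tuple((not g) if rng.random() < mutation_rate else g
                          for g in child)
            children.append(child)
        pop = children
        best = min(pop, key=fitness)
    return best

# Toy objective (an assumption, not SISSO): three candidates are "relevant";
# missing one costs 10, including an irrelevant one costs 1.
RELEVANT = {2, 5, 11}

def toy_fitness(mask):
    missing = sum(1 for i in RELEVANT if not mask[i])
    extra = sum(1 for i, g in enumerate(mask) if g and i not in RELEVANT)
    return 10.0 * missing + 1.0 * extra

best = ga_select(20, toy_fitness)
selected = [i for i, g in enumerate(best) if g]
```

Because the best mask is carried over unchanged into each new generation, the best fitness in this sketch can never get worse as generations proceed. In a real setting the fitness values would be cached, since each evaluation would be far more costly than the toy objective used here.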

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2403.15816
Document Type :
Working Paper