Dielectric Ceramics Database Automatically Constructed by Data Mining in the Literature.

Authors :
Wang X
Zhang W
Zhang W
Source :
Journal of chemical information and modeling [J Chem Inf Model] 2024 Aug 12; Vol. 64 (15), pp. 5931-5943. Date of Electronic Publication: 2024 Jul 23.
Publication Year :
2024

Abstract

The vast published literature on dielectric ceramics is a natural database for big-data analysis, the discovery of structure-property relationships, and property prediction. We constructed a data-mining pipeline based on natural language processing (NLP) to extract property information from about 12,900 published dielectric ceramics articles and normalized more than 20 properties. The micro-F1 scores for sentence classification, named entity recognition, relation extraction (related), and relation extraction (same) are 91.6, 82.4, 91.4, and 88.3%, respectively. We show how some essential properties are distributed over publication years to reveal trends. To test the reliability of the data extraction, we trained an XGBoost model to predict the dielectric constant and used the SHAP module to interpret the contribution of each feature, identifying some of the factors that determine the dielectric properties. The results show that including Q × f in the model can increase the dielectric constant prediction accuracy. Our work can give experimentalists some hints for improving the performance of cutting-edge materials.
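
The abstract reports micro-F1 scores but does not say how they were computed. As a rough illustration only, the standard micro-averaged F1 pools true positives, false positives, and false negatives over all examples before computing precision and recall. The sketch below assumes each example carries a set of labels; the function name and the toy labels are hypothetical and are not taken from the paper.

```python
def micro_f1(gold, pred):
    """Micro-averaged F1 for multi-label predictions.

    gold, pred: equal-length lists of label sets, one set per example.
    Counts are pooled over all examples before precision/recall are taken.
    """
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in gold
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0         # avoids division by zero when nothing matches
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only (not data from the paper).
gold = [{"permittivity"}, {"Qxf", "tcf"}, set(), {"permittivity"}]
pred = [{"permittivity"}, {"Qxf"}, {"tcf"}, {"Qxf"}]
score = micro_f1(gold, pred)
```

Because counts are pooled, frequent labels weigh more heavily in micro-F1 than rare ones, which is worth keeping in mind when comparing the four reported scores across tasks with different label distributions.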

Details

Language :
English
ISSN :
1549-960X
Volume :
64
Issue :
15
Database :
MEDLINE
Journal :
Journal of chemical information and modeling
Publication Type :
Academic Journal
Accession number :
39042485
Full Text :
https://doi.org/10.1021/acs.jcim.4c00282