Chun-lei SHANG, Chuan-jun WANG, Wen-yue LIU, De-xin ZHU, Shui-ze WANG, Lin-shuo DONG, Gui-lin WU, Jun-heng GAO, Hai-tao ZHAO, Chao-lei ZHANG, and Hong-hui WU
Pipeline transportation is the most economical means of moving oil, natural gas, and other energy sources over long distances. As the service environments of pipelines become increasingly harsh, the requirements placed on pipeline steel in terms of strength, hydrogen-induced fracture resistance, and corrosion resistance have increased. In areas such as plateaus or deep seas, excellent low-temperature toughness is important to ensure safe pipeline operation. Drop weight tear testing is one of the most effective methods for measuring the low-temperature toughness of pipeline steel. The test uses large specimens with full wall thickness. Characterizing the ductile–brittle shear area and the ligament width of the specimen gives a better reflection of the toughness and tear resistance of pipeline steel. However, the drop weight tear test is difficult, time-consuming, and laborious, and it consumes a large amount of experimental resources. In this work, a machine learning-based model for predicting the drop weight tear test-derived shear area was established using production line datasets provided by steel mills and pipeline steel datasets collected from the literature. Several machine learning algorithms were tested on the two datasets, and random forest models performed best. Strategy I used only the production line data, and the resulting model achieved a Pearson correlation coefficient (PCC), the performance index used here, of 0.64. Strategy II combined literature data with production line data, and the resulting model achieved a PCC of 0.92. Including the literature data therefore effectively improved the prediction accuracy for the drop weight tear test shear area. Moreover, in strategy II, a feature screening technique was adopted to avoid overfitting of the machine learning model.
Finally, a genetic programming-based symbolic regression approach was developed to establish a formula describing the relationship between the selected features and the target shear area data. The formula achieved a PCC of 0.83, which indicates that it can be used to estimate the drop weight tear test-derived parameters of pipeline steel. Machine learning thus provides a new method for predicting and optimizing the drop weight tear test-derived shear area of pipeline steel. Moreover, combining production line data with literature data remarkably improved the accuracy of the machine learning model, which suggests that the same approach could be applied to predicting production line data for other materials.
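For readers unfamiliar with the performance index, the PCC quoted above can be computed as in the following minimal sketch. It assumes the PCC is taken between measured and model-predicted shear-area values; the function name `pearson_cc` and the sample values are hypothetical and are not taken from the study's datasets.

```python
import math


def pearson_cc(measured, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_m = sum(measured) / len(measured)
    mean_p = sum(predicted) / len(predicted)
    # Sum of products of deviations (proportional to the covariance)
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    # Sums of squared deviations (proportional to the variances)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("PCC is undefined for a constant sequence")
    return cov / math.sqrt(var_m * var_p)


# Hypothetical shear-area values (%), for illustration only; not data from the study.
measured_sa = [95.0, 88.0, 72.0, 60.0, 100.0]
predicted_sa = [92.0, 85.0, 75.0, 66.0, 97.0]
pcc = pearson_cc(measured_sa, predicted_sa)
print(f"PCC = {pcc:.2f}")
```

A PCC close to 1 indicates that predicted and measured values vary together almost linearly. It does not by itself measure absolute error, so it is usually read alongside an error metric.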