1. A two-dimensional sample screening method based on data quality and variable correlation.
- Author
- Li, Gang; Wang, Dan; Wang, Kang; Lin, Ling
- Subjects
- *DATA quality, *SAMPLING methods, *SPECTRUM analysis, *DATA integrity, *REGRESSION analysis
- Abstract
Selecting the training set is key to the quality of a model. In spectral analysis, various interference factors cause the spectral data collected from some samples to deviate seriously in quality. Used directly in modeling, such data introduce bias into the model. Therefore, to obtain the most representative samples, samples must be selected before the model is established. This paper proposes a two-dimensional sample selection (TDSS) method, which selects samples from two angles: spectral data quality and variable correlation. This method and the Mahalanobis distance method were each applied to dynamic spectrum (DS) data to screen samples, and the samples screened by each method were used for modeling. Finally, a partial least squares (PLS) linear regression model with a quadratic nonlinear correction method was established to predict the target components. The experimental results show that the proposed sample screening method significantly improved the accuracy and prediction performance of the model and outperformed the Mahalanobis distance method. In the prediction of triglyceride and total cholesterol, the correlation coefficient reached above 0.82. The results demonstrate the effectiveness of the proposed sample selection method, which markedly improves the accuracy and robustness of the model. This paper provides a new approach to selecting modeling-set samples in the spectral analysis of complex solutions.
• Proposes a new training set sample screening method.
• Considers two dimensions: sample data quality and variable correlation.
• Based on statistical and dynamic spectrum data.
• Improves the accuracy of non-invasive blood component detection.
• The method is a universal approach to sample screening in spectral analysis.
[ABSTRACT FROM AUTHOR]
- Published
- 2022