1. Data pre-processing for paper-based colorimetric sensor arrays.
- Author
- Hemmateenejad, Bahram and Baumann, Knut
- Subjects
- *SENSOR arrays, *IMAGE sensors, *VOLATILE organic compounds, *DISCRIMINANT analysis, *REGRESSION analysis
- Abstract
The responses of paper-based colorimetric sensor arrays (CSAs) are typically recorded by an imaging device. The color values of the images are then subjected to chemometric data analysis to extract the relevant information. As with data from other analytical instruments, these data must be pre-processed before further analysis. This study is the first comprehensive and systematic investigation into the impact of data pre-processing techniques on the quality of subsequent data analysis methods applied to imaging data collected from paper-based colorimetric sensor arrays. With color difference data (calculated by subtracting the images of the sensors before exposure from those after exposure), pre-treatment of the data was not a critical factor, although it could reduce the complexity of the model. For example, the number of principal components in the principal component-linear discriminant analysis model was reduced from eight (for data that had not been pre-processed) to three (for pre-processed data) to achieve the same level of accuracy (92 %). The pivotal role of data pre-processing became clear, however, when only the data sets collected immediately after exposure to the samples' vapor were examined. An appropriate pre-processing method eliminates or significantly reduces between-sensor variations, obviating the need to include data from images taken prior to exposure. For classification, the object pre-processing methods that showed particular promise were mean (or median) centering, Pareto scaling and standard normal variate.
To illustrate, in the analysis of volatile organic compounds by an array of metallic nanoparticles, the cross-validation classification accuracy rose from 70 % for the unprocessed data to 95 % when unit variance scaling was applied to objects and range scaling to variables. In the calibration phase, the majority of pre-processing methods enhanced the quality of the regression models. Using suitable pre-processing methods for both objects and variables eliminated the need for the before-exposure image of the CSAs.
• Pre-processing can eliminate or significantly decrease between-sensor variations.
• Mean-centering, Pareto scaling and standard normal variate proved more promising.
• Choosing a suitable pre-processing method eliminates the need to capture a before-exposure image.
• Object scaling played the more significant role for classification, and variable scaling for regression analysis.
[ABSTRACT FROM AUTHOR]
- Published
- 2024
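The object-wise and variable-wise pre-processing steps named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the random demo matrix, and the exact definitions are assumptions. The data matrix is assumed to hold one object (sample image) per row and one color variable per column, so `axis=1` means pre-processing objects and `axis=0` means pre-processing variables. Range scaling is taken here as min-max scaling to [0, 1], which is only one of several definitions in use, and standard normal variate is taken as row-wise centering followed by division by the row standard deviation.

```python
import numpy as np

def center(X, axis, use_median=False):
    """Subtract the mean (or median) along the given axis."""
    if use_median:
        loc = np.median(X, axis=axis, keepdims=True)
    else:
        loc = X.mean(axis=axis, keepdims=True)
    return X - loc

def unit_variance(X, axis):
    """Center, then divide by the sample standard deviation."""
    return center(X, axis) / X.std(axis=axis, ddof=1, keepdims=True)

def pareto(X, axis):
    """Center, then divide by the square root of the standard deviation."""
    return center(X, axis) / np.sqrt(X.std(axis=axis, ddof=1, keepdims=True))

def range_scale(X, axis):
    """Min-max scaling to [0, 1] (assumed definition of range scaling)."""
    lo = X.min(axis=axis, keepdims=True)
    hi = X.max(axis=axis, keepdims=True)
    return (X - lo) / (hi - lo)

def snv(X):
    """Standard normal variate: unit variance scaling of each object (row)."""
    return unit_variance(X, axis=1)

# Hypothetical demo data: 6 objects x 12 color variables (e.g. RGB of 4 spots).
rng = np.random.default_rng(0)
X = rng.uniform(0, 255, size=(6, 12))

X_obj = snv(X)                       # pre-process objects (rows)
X_both = range_scale(X_obj, axis=0)  # then pre-process variables (columns)
```

Applying an object-wise step before a variable-wise step mirrors the combination mentioned in the abstract (unit variance scaling on objects, range scaling on variables); the order used here is itself an assumption.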