1. Optimized Preprocessing and Machine Learning for Quantitative Raman Spectroscopy in Biology
- Author
- Storey, Emily E and Helmy, Amr S.
- Abstract
Raman spectroscopy's capability to provide meaningful composition predictions is heavily reliant on a pre-processing step to remove insignificant spectral variation. This is crucial in biofluid analysis. Widespread adoption of diagnostics using Raman requires a robust model which can withstand routine spectral discrepancies due to unavoidable variations such as age, diet, and medical background. A wealth of pre-processing methods is available, and selecting the method which gives the best results is often left to trial and error or user experience. This process can be incredibly time-consuming and inconsistent across operators. In this study we detail a method to analyze the statistical variability within a set of training spectra and determine its suitability to form a robust model. This allows us to selectively qualify or exclude a pre-processing method, predetermine robustness, and simultaneously identify the number of components which will form the best predictive model. We demonstrate the ability of this technique to improve predictive models of two artificial biological fluids. Raman spectroscopy is ideal for noninvasive, nondestructive analysis. Routine health monitoring which maximizes comfort is increasingly crucial, particularly in epidemic-level diabetes diagnoses. High variability in spectra of biological samples can hinder Raman's adoption for these methods. Our technique allows the optimal pre-treatment method to be determined for the operator; model performance is no longer a function of user experience. We foresee this statistical technique being instrumental in widening the adoption of Raman as a monitoring tool in the field of biofluid analysis.
- Published
- 2019