A novel two‐phase near‐infrared and midinfrared wavelength selection framework for sample classification.

Authors :
Fontes, Juliana
Anzanello, Michel J.
Brito, João B. G.
Bucco, Guilherme B.
Source :
Journal of Chemometrics. Mar 2024, Vol. 38 Issue 3, p1-17. 17p.
Publication Year :
2024

Abstract

Spectral data describing product samples are typically composed of a large number of noisy and irrelevant wavelengths that tend to undermine the performance of multivariate predictive techniques. This paper proposes a two‐phase framework that integrates a wavelength preselection step guided by wavelength clustering with a wrapper‐based strategy. The first phase prunes the data, removing the less informative wavelengths by means of spectral clustering, a technique deemed suitable for the Fourier transform infrared (FTIR) spectroscopy and near‐infrared (NIR) spectroscopy data at hand. The preselected wavelengths then undergo a second phase of selection based on combinations of different wavelength importance indices (i.e., Bhattacharyya distance, Chi‐square, ReliefF, and Gini) and classification techniques (i.e., support vector machine, k‐nearest neighbors, and random forest). When applied to 11 FTIR datasets from different domains, the recommended combination of importance index and classifier increased the average accuracy by 6.37% (from 0.863 to 0.918), while retaining on average 3.84% of the original spectra. The framework also reduced the computational time of the selection process.

This paper introduces a two‐phase framework merging wavelength preselection through clustering and a wrapper strategy. Initial spectral clustering eliminates less informative wavelengths. Subsequently, diverse wavelength importance indices and classification methods are integrated. Applied to 11 spectral datasets, the proposed combination (spectral clustering [SC]‐random forest [RF]+Gini [GI]) enhances average accuracy by 6.37%, retaining 3.84% of the original spectra, and reduces computational time. [ABSTRACT FROM AUTHOR]
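
The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state how clusters are pruned or how the wrapper search proceeds, so several choices here are assumptions made only for illustration. These are the absolute-correlation affinity between wavelengths, the rule of keeping the clusters with the highest mean Gini importance, the forward search over Gini-ranked wavelengths, and the synthetic data. The function names `preselect_by_clustering` and `wrapper_select` are hypothetical. The sketch uses NumPy and scikit-learn and covers only the SC, RF, and Gini combination.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

SEED = 0


def preselect_by_clustering(X, y, n_clusters=4, keep_clusters=2):
    """Phase 1: cluster wavelengths (columns) and keep the most informative clusters."""
    # ASSUMPTION: affinity between wavelengths is the absolute Pearson correlation.
    affinity = np.abs(np.corrcoef(X, rowvar=False))
    labels = SpectralClustering(
        n_clusters=n_clusters, affinity="precomputed", random_state=SEED
    ).fit_predict(affinity)
    # ASSUMPTION: a cluster is scored by the mean Gini importance of its wavelengths.
    forest = RandomForestClassifier(n_estimators=50, random_state=SEED).fit(X, y)
    cluster_ids = np.unique(labels)
    scores = np.array(
        [forest.feature_importances_[labels == c].mean() for c in cluster_ids]
    )
    kept = cluster_ids[np.argsort(scores)[::-1][:keep_clusters]]
    # Return the column indices of wavelengths that belong to the kept clusters.
    return np.flatnonzero(np.isin(labels, kept))


def wrapper_select(X, y, candidates, cv=5):
    """Phase 2: rank candidates by Gini importance, then pick the best-scoring prefix."""
    forest = RandomForestClassifier(n_estimators=50, random_state=SEED)
    forest.fit(X[:, candidates], y)
    order = candidates[np.argsort(forest.feature_importances_)[::-1]]
    best_acc, best_subset = -1.0, order[:1]
    # ASSUMPTION: forward search that adds one ranked wavelength at a time.
    for k in range(1, len(order) + 1):
        subset = order[:k]
        model = RandomForestClassifier(n_estimators=50, random_state=SEED)
        acc = cross_val_score(model, X[:, subset], y, cv=cv).mean()
        if acc > best_acc:
            best_acc, best_subset = acc, subset
    return best_subset, best_acc


# Synthetic stand-in for spectra: 80 samples, 30 wavelengths, 2 classes.
# Only wavelengths 10 to 14 carry class information; the rest are noise.
rng = np.random.default_rng(SEED)
y = np.repeat([0, 1], 40)
X = rng.normal(size=(80, 30))
X[:, 10:15] += 2.0 * y[:, None]

preselected = preselect_by_clustering(X, y)
selected, acc = wrapper_select(X, y, preselected)
print("preselected wavelengths:", preselected.tolist())
print("selected wavelengths:", selected.tolist())
print("cross-validated accuracy:", round(float(acc), 3))
```

Because the wrapper in this sketch only evaluates wavelengths that survive the clustering step, the number of classifier fits shrinks with the size of the preselected set, which is one plausible reading of the reduction in computational time reported in the abstract.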

Details

Language :
English
ISSN :
0886-9383
Volume :
38
Issue :
3
Database :
Academic Search Index
Journal :
Journal of Chemometrics
Publication Type :
Academic Journal
Accession number :
175945834
Full Text :
https://doi.org/10.1002/cem.3536