A Robust TabNet-Based Multi-Classification Algorithm for Infrared Spectral Data of Chinese Herbal Medicine with High-Dimensional Small Samples.

Authors :
Wang Y
Jin C
Ma L
Liu X
Source :
Journal of pharmaceutical and biomedical analysis [J Pharm Biomed Anal] 2024 May 15; Vol. 242, pp. 116031. Date of Electronic Publication: 2024 Feb 14.
Publication Year :
2024

Abstract

Robust classification algorithms for high-dimensional, small-sample datasets are valuable in practical applications. To identify the origins of Chinese medicinal materials from an infrared spectroscopic dataset with 568 samples and 3448 wavelengths (features), this paper proposes a novel embedded multi-classification algorithm, ITabNet, derived from the TabNet framework. First, a refined data pre-processing (DP) mechanism was designed to efficiently select the best-suited of 50 DP methods with the help of a Support Vector Machine (SVM). Next, a new focal loss function was designed and combined with a cross-validation strategy to mitigate the impact of sample imbalance on the algorithm. ITabNet was investigated in detail, including comparisons of ITabNet with SVM under DP and non-DP conditions and under GPU and CPU settings, as well as a comparison of ITabNet with XGBT (Extreme Gradient Boosting). The numerical results demonstrate that ITabNet can significantly improve prediction performance: the best accuracy score is 1.0000 and the best Area Under the Curve (AUC) score is 1.0000. Suggestions on how to use the models effectively are given. Furthermore, ITabNet shows potential for application to the analysis of the medicinal efficacy and chemical composition of medicinal materials. The paper also offers ideas for multi-classification modeling of data with small sample sizes and high-dimensional features.

Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Copyright © 2024 Elsevier B.V. All rights reserved.
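The abstract does not give the form of ITabNet's focal loss. As a rough illustration of the general idea only, the sketch below implements the standard multi-class focal loss, FL = -alpha_t (1 - p_t)^gamma log(p_t), which down-weights well-classified samples so that hard or minority-class samples contribute more. The function names, the default gamma = 2.0, and the optional per-class weights alpha are assumptions made for this example and are not taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Standard multi-class focal loss for a single sample.

    logits: raw class scores; target: index of the true class.
    gamma:  focusing parameter (gamma = 0 reduces to cross-entropy).
    alpha:  optional per-class weights (assumed here, e.g. inverse
            class frequencies); None means every class has weight 1.
    """
    probs = softmax(logits)
    # Clamp to avoid log(0) if the true-class probability underflows.
    p_t = max(probs[target], 1e-12)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(batch_logits, targets, gamma=2.0, alpha=None):
    """Average focal loss over a batch of samples."""
    losses = [focal_loss(z, t, gamma, alpha)
              for z, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)
```

With gamma = 0 and no class weights this reduces to ordinary cross-entropy, which gives a convenient sanity check before tuning gamma on imbalanced classes.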

Details

Language :
English
ISSN :
1873-264X
Volume :
242
Database :
MEDLINE
Journal :
Journal of pharmaceutical and biomedical analysis
Publication Type :
Academic Journal
Accession number :
38382317
Full Text :
https://doi.org/10.1016/j.jpba.2024.116031