1. Breast cancer prediction based on gene expression data using interpretable machine learning techniques.
- Author
- Kallah-Dagadu G, Mohammed M, Nasejje JB, Mchunu NN, Twabi HS, Batidzirai JM, Singini GC, Nevhungoni P, and Maposa I
- Subjects
- Humans; Female; Gene Expression Profiling (methods); Gene Expression Regulation, Neoplastic; Biomarkers, Tumor (genetics); Breast Neoplasms (genetics); Machine Learning; Support Vector Machine
- Abstract
Breast cancer remains a global health burden, and deaths from the disease are increasing. Accurate prediction and diagnosis of breast cancer are important for treatment development and patient survival. This study aimed to predict breast cancer accurately using a dataset comprising 1208 observations and 3602 genes. Feature selection techniques were used to identify the most influential predictive genes, and three machine learning (ML) models were fitted: k-nearest neighbors (KNN), random forests (RF), and a support vector machine (SVM). The study also employed feature- and model-based importance measures and explainable ML methods, including Shapley values, partial dependence plots (PDPs), and accumulated local effects (ALE) plots, to explain the gene importance rankings produced by the ML models. Shapley values highlighted the significance of some of the genes in predicting cancer presence. Model-based feature ranking techniques, particularly the Leave-One-Covariate-In (LOCI) method, identified the ten most critical genes for predicting tumor cases. The LOCI rankings from the SVM and RF models agreed. Additionally, PDPs and ALE plots showed how changes in individual features affect predictions and revealed interactions with other genes. By combining feature selection techniques and explainable ML methods, this study demonstrated the interpretability and reliability of ML models for breast cancer prediction, emphasizing the importance of incorporating explainable ML approaches into medical decision-making.
Competing interests: The authors declare that there are no competing interests.
Ethics approval: No ethical approval was required.
© 2025. The Author(s).
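The LOCI ranking mentioned in the abstract can be illustrated with a minimal sketch: fit a model on one covariate at a time and rank covariates by the held-out accuracy each achieves alone. Everything below is an assumption made for illustration and is not the authors' code or data: the data are synthetic, the classifier is a small from-scratch k-NN (one of the three model families the study names) chosen to keep the sketch dependency-free, and the names `knn_predict` and `loci_scores` are hypothetical.

```python
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=5):
    """Predict a label by majority vote among the k nearest 1-D training values."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def loci_scores(X_train, y_train, X_test, y_test, k=5):
    """Held-out accuracy of a k-NN model fit on each feature alone (one covariate in)."""
    n_features = len(X_train[0])
    scores = []
    for j in range(n_features):
        col_train = [row[j] for row in X_train]
        hits = sum(
            knn_predict(col_train, y_train, row[j], k) == label
            for row, label in zip(X_test, y_test)
        )
        scores.append(hits / len(y_test))
    return scores


# Synthetic stand-in for an expression matrix (assumption, not the study's data):
# feature 0 carries class signal, the remaining features are pure noise.
rng = random.Random(0)
n_samples, n_features = 200, 6
y = [rng.randint(0, 1) for _ in range(n_samples)]
X = [
    [rng.gauss(3.0 * label, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(n_features - 1)]
    for label in y
]

# Simple holdout split: first 150 samples for fitting, last 50 for scoring.
split = 150
scores = loci_scores(X[:split], y[:split], X[split:], y[split:])
ranking = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
print("LOCI accuracy per feature:", [round(s, 2) for s in scores])
print("Ranking (best first):", ranking)
```

Repeating the same loop with a second model family and comparing the top-ranked features is one way to check whether rankings agree across models, which is the kind of comparison the abstract reports for SVM and RF.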
- Published
- 2025