
Prediction of a Multi-Gene Assay (Oncotype DX and Mammaprint) Recurrence Risk Group Using Machine Learning in Estrogen Receptor-Positive, HER2-Negative Breast Cancer—The BRAIN Study.

Authors :
Ji, Jung-Hwan
Ahn, Sung Gwe
Yoo, Youngbum
Park, Shin-Young
Kim, Joo-Heung
Jeong, Ji-Yeong
Park, Seho
Lee, Ilkyun
Source :
Cancers. Feb2024, Vol. 16 Issue 4, p774. 14p.
Publication Year :
2024

Abstract

Simple Summary: Multi-gene assays (MGAs), such as Oncotype DX and MammaPrint, provide predictive and prognostic information in the treatment of ER+/HER2− breast cancer. However, their high cost restricts access in some countries. For this reason, many studies have sought to develop tests that can replace MGAs, but their practicality remains insufficient. The aim of our study is to develop a highly accessible machine learning-based model for predicting the result of an MGA. Our accurate and affordable machine learning-based predictive model may serve as a cost-effective alternative to the expensive multi-gene assays.

This study aimed to develop a machine learning-based model for predicting multi-gene assay (MGA) risk categories. Data from patients with estrogen receptor-positive (ER+)/HER2− breast cancer who had undergone Oncotype DX (ODX) or MammaPrint (MMP) testing were used to develop the prediction model. The development cohort consisted of 2565 patients: 2039 tested with ODX and 526 tested with MMP. The MMP risk prediction model used a single XGBoost model, and the ODX risk prediction model combined LightGBM, CatBoost, and XGBoost models through soft voting. The ensemble (MMP + ODX) model, covering both assays, combined CatBoost and XGBoost through soft voting. Ten random samples, corresponding to 10% of the modeling dataset, were extracted, and cross-validation was performed to evaluate the accuracy on each validation set. The accuracy of our predictive models was 84.8% for MMP, 87.9% for ODX, and 86.8% for the ensemble model. In the ensemble cohort, the sensitivity, specificity, and precision for predicting the low-risk category were 0.91, 0.66, and 0.92, respectively. The prediction accuracy exceeded 90% in several subgroups, with the highest accuracy, 95.7%, in the subgroup with Ki-67 <20, HG 1–2, and premenopausal status. Our machine learning-based predictive model has the potential to complement existing MGAs in ER+/HER2− breast cancer. [ABSTRACT FROM AUTHOR]
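
The abstract names soft voting and reports sensitivity, specificity, and precision for the low-risk category, but gives no implementation details. The sketch below only illustrates those two ideas and is not the authors' pipeline. Everything in it is an assumption made for illustration: the three probability lists stand in for the outputs of fitted classifiers (they are hard-coded, hypothetical numbers, not study data), the models are weighted equally, the decision threshold is 0.5, and class 1 denotes "low risk". The names `soft_vote` and `binary_metrics` are hypothetical helpers.

```python
from typing import Dict, List, Sequence


def soft_vote(prob_by_model: Sequence[Sequence[float]],
              threshold: float = 0.5) -> List[int]:
    """Average each model's predicted probability of class 1, then threshold.

    Equal model weights and a 0.5 threshold are assumptions of this sketch.
    """
    n_models = len(prob_by_model)
    n_samples = len(prob_by_model[0])
    averaged = [
        sum(model[i] for model in prob_by_model) / n_models
        for i in range(n_samples)
    ]
    return [1 if p >= threshold else 0 for p in averaged]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, sensitivity, specificity, and precision for class 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true low-risk cases correctly found
        "specificity": tn / (tn + fp),  # true high-risk cases correctly found
        "precision": tp / (tp + fp),    # predicted low-risk cases that are correct
    }


# Hypothetical P(low risk) from three stand-in models for six patients.
model_a = [0.9, 0.8, 0.4, 0.3, 0.7, 0.2]
model_b = [0.8, 0.6, 0.6, 0.2, 0.4, 0.1]
model_c = [0.7, 0.9, 0.7, 0.4, 0.3, 0.6]

y_true = [1, 1, 0, 0, 1, 0]  # hypothetical assay results (1 = low risk)
y_pred = soft_vote([model_a, model_b, model_c])
metrics = binary_metrics(y_true, y_pred)

print(y_pred)
print(metrics)
```

Soft voting averages probabilities rather than counting class votes, so a confident model can outweigh two marginal ones. In this toy data the third patient is called low risk because two models lean that way strongly enough to pull the mean above 0.5.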

Details

Language :
English
ISSN :
2072-6694
Volume :
16
Issue :
4
Database :
Academic Search Index
Journal :
Cancers
Publication Type :
Academic Journal
Accession number :
175650768
Full Text :
https://doi.org/10.3390/cancers16040774