Shareable artificial intelligence to extract cancer outcomes from electronic health records for precision oncology research.

Authors :
Kehl KL
Jee J
Pichotta K
Paul MA
Trukhanov P
Fong C
Waters M
Bakouny Z
Xu W
Choueiri TK
Nichols C
Schrag D
Schultz N
Source :
Nature communications [Nat Commun] 2024 Nov 12; Vol. 15 (1), pp. 9787. Date of Electronic Publication: 2024 Nov 12.
Publication Year :
2024

Abstract

Databases that link molecular data to clinical outcomes can inform precision cancer research into novel prognostic and predictive biomarkers. However, outside of clinical trials, cancer outcomes are typically recorded only in text form within electronic health records (EHRs). Artificial intelligence (AI) models have been trained to extract outcomes from individual EHRs. However, patient privacy restrictions have historically precluded dissemination of these models beyond the centers at which they were trained. In this study, the vulnerability of text classification models trained directly on protected health information to membership inference attacks is confirmed. A teacher-student distillation approach is applied to develop shareable models for annotating outcomes from imaging reports and medical oncologist notes. 'Teacher' models trained on EHR data from Dana-Farber Cancer Institute (DFCI) are used to label imaging reports and discharge summaries from the Medical Information Mart for Intensive Care (MIMIC)-IV dataset. 'Student' models are trained to use these MIMIC documents to predict the labels assigned by teacher models and sent to Memorial Sloan Kettering (MSK) for evaluation. The student models exhibit high discrimination across outcomes in both the DFCI and MSK test sets. Leveraging private labeling of public datasets to distill publishable clinical AI models from academic centers could facilitate deployment of machine learning to accelerate precision oncology research.

Competing Interests: There are no patents related to this research. Dr. Kehl reports funding from the American Association for Cancer Research to his institution related to this research and honoraria from UpToDate and travel sponsored by Meta in the context of a grant submission process unrelated to this research. Dr. Choueiri reports institutional and/or personal, paid and/or unpaid support for research, advisory boards, consultancy, and/or honoraria past 5 years, ongoing or not, from: Alkermes, Arcus Bio, AstraZeneca, Aravive, Aveo, Bayer, Bristol Myers-Squibb, Bicycle Therapeutics, Calithera, Circle Pharma, Deciphera Pharmaceuticals, Eisai, EMD Serono, Exelixis, GlaxoSmithKline, Gilead, HiberCell, IQVA, Infinity, Institut Servier, Ipsen, Jansen, Kanaph, Lilly, Merck, Nikang, Neomorph, Nuscan/PrecedeBio, Novartis, Oncohost, Pfizer, Roche, Sanofi/Aventis, Scholar Rock, Surface Oncology, Takeda, Tempest, Up-To-Date, CME events (Peerview, OncLive, MJH, CCO and others), outside the submitted work. He also reports institutional patents filed on molecular alterations and immunotherapy response/toxicity, and ctDNA. He reports equity in Tempest, Pionyr, Osel, Precede Bio, CureResponse, InnDura Therapeutics, Premium, and Bicycle; committee participation in NCCN, GU Steering Committee, ASCO (BOD 6-2024-), ESMO, ACCRU, KidneyCan. He reports that medical writing and editorial assistance support may have been funded by Communications companies in part. He reports that he has mentored several non-US citizens on research projects with potential funding (in part) from non-US sources/Foreign Components. His institution (Dana-Farber Cancer Institute) may have received additional independent funding of drug companies or/and royalties potentially involved in research around the subject matter. Dr. Bakouny reports Honoraria from UpToDate; serving as Associate Editor at Journal of Clinical Oncology Clinical Cancer Informatics (JCO CCI); serving as co-chair of the American Society of Clinical Oncology’s International Medical Graduate Community of Practice (ASCO IMG CoP); and serving as co-founder of the IMG Oncologists nonprofit non-governmental organization. Dr. Schrag reports funding from AACR to her institution related to this research. Ms. Nichols reports funding from AACR to her institution related to this research. The other authors have no competing interests to disclose.

(© 2024. The Author(s).)

Details

Language :
English
ISSN :
2041-1723
Volume :
15
Issue :
1
Database :
MEDLINE
Journal :
Nature communications
Publication Type :
Academic Journal
Accession number :
39532885
Full Text :
https://doi.org/10.1038/s41467-024-54071-x