51. Contrastive learning improves critical event prediction in COVID-19 patients
- Author
- Jessica K De Freitas, Jing Zhang, Sulaiman Somani, Hossein Honarvar, Ying Ding, Ariful Azad, Riccardo Miotto, Chengxi Zang, Marinka Zitnik, Fei Wang, Zhangyang Wang, Nidhi Naik, Tingyi Wanyan, Suraj K. Jaladanki, Akhil Vaid, Benjamin S. Glicksberg, Girish N. Nadkarni, and Ishan Paranjpe
- Subjects
- Computer science, Deep learning, Machine learning, Recurrent neural network, Artificial intelligence, Article
- Abstract
Deep learning (DL) models typically require large-scale, balanced training data to be robust, generalizable, and effective in healthcare applications. This has been a major obstacle to developing DL models for the coronavirus disease 2019 (COVID-19) pandemic, where data are highly class-imbalanced. Conventional DL approaches use cross-entropy loss (CEL), which often suffers from poor margin classification. We show that contrastive loss (CL) improves on CEL, especially for imbalanced electronic health record (EHR) data in COVID-19 analyses. We use a diverse EHR data set to predict three outcomes, mortality, intubation, and intensive care unit (ICU) transfer, in hospitalized COVID-19 patients over multiple time windows. To compare the performance of CEL and CL, models are tested on the full data set and a restricted data set. CL models consistently outperform CEL models, with differences ranging from 0.04 to 0.15 in AUPRC and 0.05 to 0.10 in AUROC.

Deep learning models applied to EHR data often use cross-entropy loss (CEL) as the primary optimization objective, but CEL may be unsuitable for real-world scenarios with imbalanced data. We develop a learning framework that incorporates both CEL and contrastive loss (CL) to tackle this issue. Our framework achieves better predictive performance and feature interpretability, particularly on imbalanced data.
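The combined objective described in the abstract, cross-entropy plus a contrastive term that pulls same-class representations together, can be sketched as a weighted sum of the two losses. The NumPy code below is an illustrative sketch only, not the authors' published implementation; the function names, the weighting parameter `lam`, and the temperature value are assumptions:

```python
import numpy as np

def cross_entropy(probs, labels):
    # Mean negative log-likelihood of the true class.
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

def supervised_contrastive(embeddings, labels, temperature=0.1):
    # L2-normalize so the dot product is cosine similarity.
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(labels)
    loss = 0.0
    for i in range(n):
        # Positives: other samples sharing sample i's label.
        pos = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not pos:
            continue
        # Denominator sums over all samples except the anchor itself.
        denom = np.sum([np.exp(sim[i, j]) for j in range(n) if j != i])
        loss += -np.mean([np.log(np.exp(sim[i, j]) / denom) for j in pos])
    return loss / n

def combined_loss(probs, embeddings, labels, lam=0.5):
    # Weighted sum of cross-entropy and supervised contrastive loss.
    return cross_entropy(probs, labels) + lam * supervised_contrastive(embeddings, labels)
```

Because the contrastive term scores each anchor against same-class positives rather than against a class prior, minority-class samples still contribute informative gradients under heavy imbalance, which is the motivation the abstract gives for adding CL to CEL.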
- Published
- 2021