6 results on '"Sophia Y. Wang, MD, MS"'
Search Results
2. The Impact of Race, Ethnicity, and Sex on Fairness in Artificial Intelligence for Glaucoma Prediction Models
- Author
Rohith Ravindranath, MS, Joshua D. Stein, MD, MS, Tina Hernandez-Boussard, A. Caroline Fisher, Sophia Y. Wang, MD, MS, Sejal Amin, Paul A. Edwards, Divya Srikumaran, Fasika Woreta, Jeffrey S. Schultz, Anurag Shrivastava, Baseer Ahmad, Paul Bryar, Dustin French, Brian L. Vanderbeek, Suzann Pershing, Anne M. Lynch, Jennifer L. Patnaik, Saleha Munir, Wuqaas Munir, Joshua Stein, Lindsey DeLott, Brian C. Stagg, Barbara Wirostko, Brian McMillian, Arsham Sheybani, Soshian Sarrapour, Kristen Nwanyanwu, Michael Deiner, Catherine Sun, Robert Feldman, and Rajeev Ramachandran
- Subjects
Bias, Fairness, Glaucoma, Health disparities, Machine learning, Ophthalmology, RE1-994
- Abstract
Objective: Despite advances in artificial intelligence (AI) in glaucoma prediction, most works lack multicenter focus and do not consider fairness concerning sex, race, or ethnicity. This study aims to examine the impact of these sensitive attributes on developing fair AI models that predict glaucoma progression to necessitating incisional glaucoma surgery. Design: Database study. Participants: Thirty-nine thousand ninety patients with glaucoma, as identified by International Classification of Disease codes from 7 academic eye centers participating in the Sight OUtcomes Research Collaborative. Methods: We developed XGBoost models using 3 approaches: (1) excluding sensitive attributes as input features, (2) including them explicitly as input features, and (3) training separate models for each group. Model input features included demographic details, diagnosis codes, medications, and clinical information (intraocular pressure, visual acuity, etc.), from electronic health records. The models were trained on patients from 5 sites (N = 27 999) and evaluated on a held-out internal test set (N = 3499) and 2 external test sets consisting of N = 1550 and N = 2542 patients. Main Outcomes and Measures: Area under the receiver operating characteristic curve (AUROC) and equalized odds on the test set and external sites. Results: Six thousand six hundred eighty-two (17.1%) of 39 090 patients underwent glaucoma surgery with a mean age of 70.1 (standard deviation 14.6) years, 54.5% female, 62.3% White, 22.1% Black, and 4.7% Latinx/Hispanic. We found that not including the sensitive attributes led to better classification performance (AUROC: 0.77–0.82) but worsened fairness when evaluated on the internal test set. 
However, on external test sites, the opposite was true: including sensitive attributes resulted in better classification performance (AUROC: external #1 - [0.73–0.81], external #2 - [0.67–0.70]), but varying degrees of fairness for sex and race as measured by equalized odds. Conclusions: Artificial intelligence models predicting whether patients with glaucoma progress to surgery demonstrated bias with respect to sex, race, and ethnicity. The effect of sensitive attribute inclusion and exclusion on fairness and performance varied based on internal versus external test sets. Prior to deployment, AI models should be evaluated for fairness on the target population. Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
- Published
- 2025
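Note on the fairness metric in record 2: equalized odds asks that the true-positive rate and the false-positive rate be similar across groups. The abstract does not say how the authors summarized it, so the following is only a minimal sketch of one common summary (the largest between-group gap in either rate). The function names and the toy data are illustrative assumptions, not the study's code or data.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return {group: (TPR, FPR)} for binary labels and binary predictions."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t:
            counts[g]["tp" if p else "fn"] += 1
        else:
            counts[g]["fp" if p else "tn"] += 1
    rates = {}
    for g, c in counts.items():
        pos = c["tp"] + c["fn"]
        neg = c["fp"] + c["tn"]
        tpr = c["tp"] / pos if pos else float("nan")
        fpr = c["fp"] / neg if neg else float("nan")
        rates[g] = (tpr, fpr)
    return rates


def equalized_odds_gap(y_true, y_pred, groups):
    """Largest between-group difference in TPR or FPR (0 means parity)."""
    rates = group_rates(y_true, y_pred, groups)
    tprs = [r[0] for r in rates.values()]
    fprs = [r[1] for r in rates.values()]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


# Hypothetical toy data: 1 = progressed to surgery, groups "A" and "B".
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(group_rates(y_true, y_pred, groups))
print(equalized_odds_gap(y_true, y_pred, groups))
```

A gap near 0 suggests the classifier errs at similar rates in every group. As the abstract stresses, the gap should be recomputed on the target population, because it can differ between internal and external test sets.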
3. Prediction Models for Glaucoma in a Multicenter Electronic Health Records Consortium: The Sight Outcomes Research Collaborative
- Author
Sophia Y. Wang, MD, MS, Rohith Ravindranath, MS, Joshua D. Stein, MD, MS, Sejal Amin, Paul A. Edwards, Divya Srikumaran, Fasika Woreta, Jeffrey S. Schultz, Anurag Shrivastava, Baseer Ahmad, Judy Kim, Paul Bryar, Dustin French, Brian L. Vanderbeek, Suzann Pershing, Sophia Y. Wang, Anne M. Lynch, Jenna Patnaik, Saleha Munir, Wuqaas Munir, Joshua Stein, Lindsey DeLott, Brian C. Stagg, Barbara Wirostko, Brian McMillian, and Arsham Sheybani
- Subjects
Machine learning, Glaucoma, Multicenter study, Deep learning, Ophthalmology, RE1-994
- Abstract
Purpose: Advances in artificial intelligence have enabled the development of predictive models for glaucoma. However, most work is single-center and uncertainty exists regarding the generalizability of such models. The purpose of this study was to build and evaluate machine learning (ML) approaches to predict glaucoma progression requiring surgery using data from a large multicenter consortium of electronic health records (EHR). Design: Cohort study. Participants: Thirty-six thousand five hundred forty-eight patients with glaucoma, as identified by International Classification of Diseases (ICD) codes from 6 academic eye centers participating in the Sight OUtcomes Research Collaborative (SOURCE). Methods: We developed ML models to predict whether patients with glaucoma would progress to glaucoma surgery in the coming year (identified by Current Procedural Terminology codes) using the following modeling approaches: (1) penalized logistic regression (lasso, ridge, and elastic net); (2) tree-based models (random forest, gradient boosted machines, and XGBoost), and (3) deep learning models. Model input features included demographics, diagnosis codes, medications, and clinical information (intraocular pressure, visual acuity, refractive status, and central corneal thickness) available from structured EHR data. One site was reserved as an “external site” test set (N = 1550); of the patients from the remaining sites, 10% each were randomly selected to be in development and test sets, with the remaining 27 999 reserved for model training. Main Outcome Measures: Evaluation metrics included area under the receiver operating characteristic curve (AUROC) on the test set and the external site. Results: Six thousand nineteen (16.5%) of 36 548 patients underwent glaucoma surgery. Overall, the AUROC ranged from 0.735 to 0.771 on the random test set and from 0.706 to 0.754 on the external test site, with the XGBoost and random forest model performing best, respectively. 
The greatest performance decrease from the random test set to the external test site occurred for the penalized regression models. Conclusions: Machine learning models developed using structured EHR data can reasonably predict whether glaucoma patients will need surgery, with reasonable generalizability to an external site. Additional research is needed to investigate the impact of protected class characteristics such as race or gender on model performance and fairness. Financial Disclosure(s): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
- Published
- 2024
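Note on the data split in record 3: the abstract describes holding out one whole site as an external test set, then randomly assigning 10% of the remaining patients to a development set and 10% to a test set, with the rest used for training. The sketch below shows that scheme under stated assumptions: the record layout (a dict with a "site" key), the function name, and the fixed seed are all hypothetical and are not taken from the study.

```python
import random


def split_by_site(patients, external_site, dev_frac=0.10, test_frac=0.10, seed=0):
    """Hold out one site entirely, then split the remaining patients at random."""
    external = [p for p in patients if p["site"] == external_site]
    internal = [p for p in patients if p["site"] != external_site]

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(internal)

    n_dev = round(len(internal) * dev_frac)
    n_test = round(len(internal) * test_frac)
    return {
        "dev": internal[:n_dev],
        "test": internal[n_dev:n_dev + n_test],
        "train": internal[n_dev + n_test:],
        "external": external,
    }


# Hypothetical cohort: 6 sites with 50 patients each.
patients = [{"id": i, "site": f"site{i % 6}"} for i in range(300)]
splits = split_by_site(patients, external_site="site5")
print({name: len(rows) for name, rows in splits.items()})
```

Because no patient from the external site appears in training, performance on that set reflects generalization to a new center rather than to new patients from familiar centers.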
4. Automated Recognition of Visual Acuity Measurements in Ophthalmology Clinical Notes Using Deep Learning
- Author
Isaac A. Bernstein, BS, Abigail Koornwinder, BS, Hannah H. Hwang, BS, and Sophia Y. Wang, MD, MS
- Subjects
Deep learning, Electronic health records, Natural language processing, Ophthalmology, Visual acuity, RE1-994
- Abstract
Purpose: Visual acuity (VA) is a critical component of the eye examination but is often only documented in electronic health records (EHRs) as unstructured free-text notes, making it challenging to use in research. This study aimed to improve on existing rule-based algorithms by developing and evaluating deep learning models to perform named entity recognition of different types of VA measurements and their lateralities from free-text ophthalmology notes: VA for each of the right and left eyes, with and without glasses correction, and with and without pinhole. Design: Cross-sectional study. Subjects: A total of 319 756 clinical notes with documented VA measurements from approximately 90 000 patients were included. Methods: The notes were split into train, validation, and test sets. Bidirectional Encoder Representations from Transformers (BERT) models were fine-tuned to identify VA measurements from the progress notes and included BERT models pretrained on biomedical literature (BioBERT), critical care EHR notes (ClinicalBERT), both (BlueBERT), and a lighter version of BERT with 40% fewer parameters (DistilBERT). A baseline rule-based algorithm was created to recognize the same VA entities to compare against BERT models. Main Outcome Measures: Model performance was evaluated on a held-out test set using microaveraged precision, recall, and F1 score for all entities. Results: On the human-annotated subset, BlueBERT achieved the best microaveraged F1 score (F1 = 0.92), followed by ClinicalBERT (F1 = 0.91), DistilBERT (F1 = 0.90), BioBERT (F1 = 0.84), and the baseline model (F1 = 0.83). Common errors included labeling VA in sections outside of the examination portion of the note, difficulties labeling current VA alongside a series of past VAs, and missing nonnumeric VAs. 
Conclusions: This study demonstrates that deep learning models are capable of identifying VA measurements from free-text ophthalmology notes with high precision and recall, achieving significant performance improvements over a rule-based algorithm. The ability to recognize VA from free-text notes would enable a more detailed characterization of ophthalmology patient cohorts and enhance the development of models to predict ophthalmology outcomes. Financial Disclosure(s): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
- Published
- 2024
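Note on the evaluation metric in record 4: microaveraging pools true positives, false positives, and false negatives across all entity types before computing precision, recall, and F1, so frequent entity types weigh more than rare ones. The sketch below illustrates the calculation only. The entity labels and counts are made up for the example and do not come from the study.

```python
def micro_prf(counts):
    """Microaveraged (precision, recall, F1) from per-entity tp/fp/fn counts."""
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical per-entity counts (labels are illustrative only).
counts = {
    "va_right_eye_corrected": {"tp": 6, "fp": 1, "fn": 1},
    "va_left_eye_pinhole": {"tp": 3, "fp": 0, "fn": 2},
}
precision, recall, f1 = micro_prf(counts)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```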
5. Predicting Glaucoma Progression to Surgery with Artificial Intelligence Survival Models
- Author
Shiqi Tao, MS, Rohith Ravindranath, MS, and Sophia Y. Wang, MD, MS
- Subjects
Artificial intelligence, Deep learning, Electronic health records, Glaucoma, Machine Learning, Ophthalmology, RE1-994
- Abstract
Purpose: Prior artificial intelligence (AI) models for predicting glaucoma progression have used traditional classifiers that do not consider the longitudinal nature of patients’ follow-up. In this study, we developed survival-based AI models for predicting glaucoma patients' progression to surgery, comparing performance of regression-, tree-, and deep learning–based approaches. Design: Retrospective observational study. Subjects: Patients with glaucoma seen at a single academic center from 2008 to 2020 identified from electronic health records (EHRs). Methods: From the EHRs, we identified 361 baseline features, including demographics, eye examinations, diagnoses, and medications. We trained AI survival models to predict patients’ progression to glaucoma surgery using the following: (1) a penalized Cox proportional hazards (CPH) model with principal component analysis (PCA); (2) random survival forests (RSFs); (3) gradient-boosting survival (GBS); and (4) a deep learning model (DeepSurv). The concordance index (C-index) and mean cumulative/dynamic area under the curve (mean AUC) were used to evaluate model performance on a held-out test set. Explainability was investigated using Shapley values for feature importance and visualization of model-predicted cumulative hazard curves for patients with different treatment trajectories. Main Outcome Measures: Progression to glaucoma surgery. Results: Of the 4512 patients with glaucoma, 748 underwent glaucoma surgery, with a median follow-up of 1038 days. The DeepSurv model performed best overall (C-index, 0.775; mean AUC, 0.802) among the models studied in this article (CPH with PCA: C-index, 0.745; mean AUC, 0.780; RSF: C-index, 0.766; mean AUC, 0.804; GBS: C-index, 0.764; mean AUC, 0.791). Predicted cumulative hazard curves demonstrate how models could distinguish between patients who underwent early surgery and patients who underwent surgery after > 3000 days of follow-up or no surgery.
Conclusions: Artificial intelligence survival models can predict progression to glaucoma surgery using structured data from EHRs. Tree-based and deep learning-based models performed better at predicting glaucoma progression to surgery than the CPH regression model, potentially because of their better suitability for high-dimensional data sets. Future work predicting ophthalmic outcomes should consider using tree-based and deep learning-based survival AI models. Additional research is needed to develop and evaluate more sophisticated deep learning survival models that can incorporate clinical notes or imaging. Financial Disclosure(s): Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
- Published
- 2023
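Note on the concordance index in record 5: the C-index is the fraction of comparable patient pairs in which the patient who had the event earlier was also assigned the higher predicted risk. A pair is comparable only when the patient with the shorter follow-up actually had the event, since a censored patient's true event time is unknown. The sketch below implements a basic Harrell-style C-index that ignores tied times. The abstract does not say which estimator or library the authors used, and the toy data are hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell-style C-index; tied risk scores count as half-concordant.

    times  : follow-up time per patient (e.g. days)
    events : 1 if surgery was observed, 0 if the patient was censored
    risks  : model risk score (higher means earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:  # i had the event before j's last follow-up
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: the third patient is censored at day 900.
times = [100, 400, 900, 1500]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
print(concordance_index(times, events, risks))  # 4 of 5 comparable pairs agree: 0.8
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the C-index values of 0.745 to 0.775 reported in the abstract.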
6. Deep Learning Approaches for Predicting Glaucoma Progression Using Electronic Health Records and Natural Language Processing
- Author
Sophia Y. Wang, MD, MS, Benjamin Tseng, and Tina Hernandez-Boussard, PhD
- Subjects
Glaucoma, Artificial Intelligence, Deep Learning, Informatics, Ophthalmology, RE1-994
- Abstract
Purpose: Advances in artificial intelligence have produced a few predictive models in glaucoma, including a logistic regression model predicting glaucoma progression to surgery. However, uncertainty exists regarding how to integrate the wealth of information in free-text clinical notes. The purpose of this study was to predict glaucoma progression requiring surgery using deep learning (DL) approaches on data from electronic health records (EHRs), including features from structured clinical data and from natural language processing of clinical free-text notes. Design: Development of DL predictive model in an observational cohort. Participants: Adult patients with glaucoma at a single center treated from 2008 through 2020. Methods: Ophthalmology clinical notes of patients with glaucoma were identified from EHRs. Available structured data included patient demographic information, diagnosis codes, prior surgeries, and clinical information including intraocular pressure, visual acuity, and central corneal thickness. In addition, words from patients’ first 120 days of notes were mapped to ophthalmology domain-specific neural word embeddings trained on PubMed ophthalmology abstracts. Word embeddings and structured clinical data were used as inputs to DL models to predict subsequent glaucoma surgery. Main Outcome Measures: Evaluation metrics included area under the receiver operating characteristic curve (AUC) and F1 score (the harmonic mean of positive predictive value and sensitivity) on a held-out test set. Results: Seven hundred forty-eight of 4512 patients with glaucoma underwent surgery. The model that incorporated both structured clinical features and input features from clinical notes achieved an AUC of 73% and F1 of 40%, compared with only structured clinical features (AUC, 66%; F1, 34%) and only clinical free-text features (AUC, 70%; F1, 42%). All models outperformed predictions from a glaucoma specialist’s review of clinical notes (F1, 29.5%).
Conclusions: We can successfully predict which patients with glaucoma will need surgery using DL models on unstructured EHR text. Models incorporating free-text data outperformed those using only structured inputs. Future predictive models using EHRs should make use of information from within clinical free-text notes to improve predictive performance. Additional research is needed to investigate optimal methods of incorporating imaging data into future predictive models as well.
- Published
- 2022
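Note on the metrics in record 6: AUC can be read as the probability that a randomly chosen patient who went on to surgery receives a higher score than a randomly chosen patient who did not, and F1 is the harmonic mean of positive predictive value and sensitivity. The sketch below computes both directly from those definitions. The function names and all numbers are illustrative assumptions and are unrelated to the study's results.

```python
def auroc(y_true, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t]
    neg = [s for t, s in zip(y_true, scores) if not t]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_from_ppv_sensitivity(ppv, sensitivity):
    """Harmonic mean of positive predictive value and sensitivity."""
    if ppv + sensitivity == 0:
        return 0.0
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Hypothetical scores: two patients who had surgery, three who did not.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.1]
print(auroc(y_true, scores))               # 5 of 6 pairs ranked correctly
print(f1_from_ppv_sensitivity(0.5, 0.25))  # below the arithmetic mean of 0.375
```

The harmonic mean is pulled toward the smaller of the two inputs, so a model cannot reach a high F1 by maximizing sensitivity alone at the cost of many false positives.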
Discovery Service for Jio Institute Digital Library