14 results on '"Jessica L. Gronsbell"'
Search Results
2. A tutorial on fairness in machine learning in healthcare.
- Author
-
Jianhui Gao, Benson Chou, Zachary R. McCaw, Hilary Thurston, Paul Varghese, Chuan Hong, and Jessica L. Gronsbell
- Published
- 2024
3. Machine learning approaches for electronic health records phenotyping: a methodical review.
- Author
-
Siyue Yang, Paul Varghese, Ellen Stephenson, Karen Tu, and Jessica L. Gronsbell
- Published
- 2023
4. Marginal Structural Models Using Calibrated Weights With SuperLearner: Application to Type II Diabetes Cohort.
- Author
-
Sumeet Kalia, Olli Saarela, Tao Chen, Braden O'Neill, Christopher Meaney, Jessica L. Gronsbell, Ervin Sejdic, Michael D. Escobar, Babak Aliarzadeh, Rahim Moineddin, Conrad Pow, Frank M. Sullivan, and Michelle Greiver
- Published
- 2022
5. High-throughput multimodal automated phenotyping (MAP) with application to PheWAS.
- Author
-
Katherine P. Liao, Jiehuan Sun, Tianrun A. Cai, Nicholas B. Link, Chuan Hong, Jie Huang 0030, Jennifer E. Huffman, Jessica L. Gronsbell, Yichi Zhang, Yuk-Lam Ho, Victor M. Castro, Vivian S. Gainer, Shawn N. Murphy, Christopher J. O'Donnell, J. Michael Gaziano, Kelly Cho, Peter Szolovits, Isaac S. Kohane, and Sheng Yu 0002
- Published
- 2019
6. Efficient Estimation and Evaluation of Prediction Rules in Semi-Supervised Settings under Stratified Sampling.
- Author
-
Jessica L. Gronsbell, Molei Liu, Lu Tian, and Tianxi Cai
- Published
- 2020
7. Enabling phenotypic big data with PheNorm.
- Author
-
Sheng Yu 0002, Yumeng Ma, Jessica L. Gronsbell, Tianrun A. Cai, Ashwin N. Ananthakrishnan, Vivian S. Gainer, Susanne E. Churchill, Peter Szolovits, Shawn N. Murphy, Isaac S. Kohane, Katherine P. Liao, and Tianxi Cai
- Published
- 2018
8. High-Throughput Multimodal Automated Phenotyping (MAP) Incorporating Natural Language Processing with Application to PheWAS.
- Author
-
Katherine P. Liao, Jiehuan Sun, Tianrun A. Cai, Nicholas B. Link, Chuan Hong, Jie Huang 0030, Jennifer E. Huffman, Jessica L. Gronsbell, Lauren Costa, Victor M. Castro, Vivian S. Gainer, Shawn N. Murphy, J. Michael Gaziano, Kelly Cho, Peter Szolovits, Isaac S. Kohane, Sheng Yu 0002, and Tianxi Cai
- Published
- 2018
9. High-throughput Phenotyping via Denoised Normal Mixture Transformation.
- Author
-
Sheng Yu 0002, Yumeng Ma, Jessica L. Gronsbell, Katherine P. Liao, Tianrun A. Cai, Ashwin N. Ananthakrishnan, Vivian S. Gainer, Susanne E. Churchill, Peter Szolovits, Shawn N. Murphy, Isaac S. Kohane, and Tianxi Cai
- Published
- 2017
10. Semi-Supervised Approaches to Efficient Evaluation of Model Prediction Performance
- Author
-
Jessica L. Gronsbell and Tianxi Cai
- Subjects
FOS: Computer and information sciences; 0301 basic medicine; Statistics and Probability; Semi-supervised learning; Overfitting; Machine learning; computer.software_genre; 01 natural sciences; Methodology (stat.ME); 010104 statistics & probability; 03 medical and health sciences; Resampling; 0101 mathematics; Statistics - Methodology; Mathematics; Receiver operating characteristic; business.industry; Estimator; Regression analysis; Delta method; 030104 developmental biology; Binary classification; Artificial intelligence; Statistics, Probability and Uncertainty; business; computer
- Abstract
Summary: In many modern machine learning applications, the outcome is expensive or time consuming to collect whereas the predictor information is easy to obtain. Semi-supervised (SS) learning aims at utilizing large amounts of ‘unlabelled’ data along with small amounts of ‘labelled’ data to improve the efficiency of a classical supervised approach. Though numerous SS learning classification and prediction procedures have been proposed in recent years, no methods currently exist to evaluate the prediction performance of a working regression model. In the context of developing phenotyping algorithms derived from electronic medical records, we present an efficient two-step estimation procedure for evaluating a binary classifier based on various prediction performance measures in the SS setting. In step I, the labelled data are used to obtain a non-parametrically calibrated estimate of the conditional risk function. In step II, SS estimates of the prediction accuracy parameters are constructed based on the estimated conditional risk function and the unlabelled data. We demonstrate that, under mild regularity conditions, the estimators proposed are consistent and asymptotically normal. Importantly, the asymptotic variance of the SS estimators is always smaller than that of the supervised counterparts under correct model specification. We also correct for potential overfitting bias in the SS estimators in finite samples with cross-validation and we develop a perturbation resampling procedure to approximate their distributions. Our proposals are evaluated through extensive simulation studies and illustrated with two real electronic medical record studies aiming to develop phenotyping algorithms for rheumatoid arthritis and multiple sclerosis.
- Published
- 2017
11. Enabling phenotypic big data with PheNorm
- Author
-
Yumeng Ma, Sheng Yu, Shawn N. Murphy, Ashwin N. Ananthakrishnan, Susanne Churchill, Katherine P. Liao, Jessica L. Gronsbell, Vivian S. Gainer, Isaac S. Kohane, Peter Szolovits, Tianxi Cai, and Tianrun Cai
- Subjects
Big Data; 0301 basic medicine; Computer science; Big data; Datasets as Topic; Health Informatics; Research and Applications; Machine learning; computer.software_genre; Bottleneck; Set (abstract data type); 03 medical and health sciences; Annotation; 0302 clinical medicine; International Classification of Diseases; Electronic Health Records; Humans; Mixture distribution; 030212 general & internal medicine; Precision Medicine; Receiver operating characteristic; business.industry; Phenotype; 030104 developmental biology; Feature (computer vision); Sample size determination; Area Under Curve; Intercellular Signaling Peptides and Proteins; Artificial intelligence; Peptides; business; computer; Algorithms
- Abstract
Objective: Electronic health record (EHR)-based phenotyping infers whether a patient has a disease based on the information in his or her EHR. A human-annotated training set with gold-standard disease status labels is usually required to build an algorithm for phenotyping based on a set of predictive features. The time intensiveness of annotation and feature curation severely limits the ability to achieve high-throughput phenotyping. While previous studies have successfully automated feature curation, annotation remains a major bottleneck. In this paper, we present PheNorm, a phenotyping algorithm that does not require expert-labeled samples for training. Methods: The most predictive features, such as the number of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes or mentions of the target phenotype, are normalized to resemble a normal mixture distribution with high area under the receiver operating curve (AUC) for prediction. The transformed features are then denoised and combined into a score for accurate disease classification. Results: We validated the accuracy of PheNorm with 4 phenotypes: coronary artery disease, rheumatoid arthritis, Crohn’s disease, and ulcerative colitis. The AUCs of the PheNorm score reached 0.90, 0.94, 0.95, and 0.94 for the 4 phenotypes, respectively, which were comparable to the accuracy of supervised algorithms trained with sample sizes of 100–300, with no statistically significant difference. Conclusion: The accuracy of the PheNorm algorithms is on par with algorithms trained with annotated samples. PheNorm fully automates the generation of accurate phenotyping algorithms and demonstrates the capacity for EHR-driven annotations to scale to the next level – phenotypic big data.
- Published
- 2017
12. High-throughput Multimodal Automated Phenotyping (MAP) with Application to PheWAS
- Author
-
Yuk Lam Ho, Vivian S. Gainer, Peter Szolovits, Kelly Cho, Tianrun Cai, Christopher J. O'Donnell, Jie Huang, Chuan Hong, Isaac S. Kohane, Sheng Yu, Victor M. Castro, Jennifer E. Huffman, Nicholas Link, Shawn N. Murphy, Yichi Zhang, J. Michael Gaziano, Jiehuan Sun, Katherine P. Liao, Tianxi Cai, and Jessica L. Gronsbell
- Subjects
0301 basic medicine; Computer science; Health Informatics; Health records; Research and Applications; computer.software_genre; Polymorphism, Single Nucleotide; Manual curation; 03 medical and health sciences; 0302 clinical medicine; International Classification of Diseases; Electronic Health Records; Humans; 030212 general & internal medicine; Difference-map algorithm; Throughput (business); Natural Language Processing; 030304 developmental biology; 0303 health sciences; Unified Medical Language System; Mixture model; Phenotype; 030104 developmental biology; Biorepository; Area Under Curve; Scalability; Cohort; Labeled data; Data mining; Scale (map); computer; Algorithms
- Abstract
Objective: Electronic health records linked with biorepositories are a powerful platform for translational studies. A major bottleneck exists in the ability to phenotype patients accurately and efficiently. The objective of this study was to develop an automated high-throughput phenotyping method integrating International Classification of Diseases (ICD) codes and narrative data extracted using natural language processing (NLP). Materials and Methods: We developed a mapping method for automatically identifying relevant ICD and NLP concepts for a specific phenotype leveraging the Unified Medical Language System. Along with health care utilization, aggregated ICD and NLP counts were jointly analyzed by fitting an ensemble of latent mixture models. The multimodal automated phenotyping (MAP) algorithm yields a predicted probability of phenotype for each patient and a threshold for classifying participants with phenotype yes/no. The algorithm was validated using labeled data for 16 phenotypes from a biorepository and further tested in an independent cohort phenome-wide association studies (PheWAS) for 2 single nucleotide polymorphisms with known associations. Results: The MAP algorithm achieved higher or similar AUC and F-scores compared to the ICD code across all 16 phenotypes. The features assembled via the automated approach had comparable accuracy to those assembled via manual curation (AUC_MAP 0.943, AUC_manual 0.941). The PheWAS results suggest that the MAP approach detected previously validated associations with higher power when compared to the standard PheWAS method based on ICD codes. Conclusion: The MAP approach increased the accuracy of phenotype definition while maintaining scalability, thereby facilitating use in studies requiring large-scale phenotyping, such as PheWAS.
- Published
- 2019
13. Common First-Pass CT Angiography Findings Associated With Rapid Growth Rate in Abdominal Aorta Aneurysms Between 3 and 5 cm in Largest Diameter
- Author
-
Michael L. Steigner, Jessica L. Gronsbell, Andreas A. Giannopoulos, Elizabeth George, Dimitrios Mitsouras, Ayaz Aghayev, Tianxi Cai, and Frank J. Rybicki
- Subjects
Adult; Male; medicine.medical_specialty; Computed Tomography Angiography; Lumen (anatomy); Comorbidity; 030204 cardiovascular system & hematology; 030218 nuclear medicine & medical imaging; 03 medical and health sciences; 0302 clinical medicine; Aneurysm; medicine.artery; Medicine; Intraluminal thrombus; Humans; Radiology, Nuclear Medicine and imaging; In patient; cardiovascular diseases; Aged; Retrospective Studies; First pass; Aged, 80 and over; medicine.diagnostic_test; business.industry; Abdominal aorta; Retrospective cohort study; General Medicine; Middle Aged; medicine.disease; Cross-Sectional Studies; Angiography; cardiovascular system; Disease Progression; Female; Radiology; business; Aortic Aneurysm, Abdominal
- Abstract
The purpose of this study was to describe CT angiography (CTA) findings of lumen contrast heterogeneity and intraluminal thrombus volume and to evaluate their relationship with rapid aneurysm growth in patients with abdominal aortic aneurysms (AAA) between 3 and 5 cm. This institutional review board-approved and HIPAA-compliant single-center retrospective study included CTA studies obtained between January 2004 and December 2014 in 140 patients with AAA (101 men, 39 women; mean age ± SD, 70 ± 9 years old; age range, 22-87 years old). Standardized measurements for aneurysm intraluminal thrombus volume and a relatively new metric termed "lumen contrast heterogeneity" were obtained from the CTA images. AAA growth rate data were acquired from all subsequent cross-sectional studies. The association between the imaging findings and rapid aneurysm growth (0.4 cm/y) was evaluated using multivariate logistic regression. Patient comorbidities and medications were added to the regression model to assess for further associations with rapid growth rate. Using a baseline logistic regression model, lumen contrast heterogeneity (odds ratio [OR], 1.16; 95% CI, 1.05-1.32), intraluminal thrombus volume (OR, 2.15; 95% CI, 1.26-3.86), and maximum AAA diameter (OR, 1.69; 95% CI, 1.03-2.84) were independently associated with increased likelihood of rapid aneurysm growth. None of the patient comorbidities or medications were significantly associated with the outcome when added to the baseline model. Both intraluminal thrombus and lumen contrast heterogeneity are seen on AAA CTA studies and can be quantified; both of these metrics are independently associated with rapid growth rate and should be recognized by radiologists evaluating patients with AAA during surveillance.
- Published
- 2017
14. Limited Hospital Variation in the Use and Yield of CT for Pulmonary Embolism in Patients Undergoing Total Hip or Total Knee Replacement Surgery
- Author
-
Hiraku Kumamaru, Jessica L. Gronsbell, Brian T. Bateman, Kuni Ohtomo, Elisabetta Patorno, Laurence D. Higgins, Frank J. Rybicki, Jun Liu, Tianxi Cai, Shigeki Aoki, and Kanako K. Kumamaru
- Subjects
Male; medicine.medical_specialty; Intraclass correlation; Arthroplasty, Replacement, Hip; Logistic regression; 030218 nuclear medicine & medical imaging; 03 medical and health sciences; 0302 clinical medicine; Postoperative Complications; Interquartile range; medicine; Pulmonary angiography; Humans; Radiology, Nuclear Medicine and imaging; 030212 general & internal medicine; Arthroplasty, Replacement, Knee; Aged; Retrospective Studies; Postoperative Care; business.industry; Retrospective cohort study; Middle Aged; medicine.disease; Random effects model; United States; Pulmonary embolism; Hospitalization; Cohort; Female; Radiology; business; Pulmonary Embolism; Tomography, X-Ray Computed
- Abstract
Purpose: To evaluate the variation among U.S. hospitals in overall use and yield of in-hospital computed tomographic (CT) pulmonary angiography (PA) in patients undergoing total hip replacement (THR) or total knee replacement (TKR) surgery. Materials and Methods: Patients in the Premier Research Database who underwent elective TKR or THR between 2007 and 2011 were enrolled in this HIPAA-compliant, institutional review board-approved retrospective observational study. The informed consent requirement was waived. Hospitals were categorized into low, medium, and high tertiles of CT PA use to compare baseline patient- and hospital-level characteristics and pulmonary embolism (PE) positivity rates. To further investigate between-hospital variation in CT PA use, a hierarchical logistic regression model that included hospital-specific random effects and fixed patient- and hospital-level effects was used. The intraclass correlation coefficient (ICC) was used to measure the amount of variability in CT PA use attributable to between-hospital variation. Results: The cohort included 205 198 patients discharged from 178 hospitals (median of 734.5 patients discharged per hospital; interquartile range, 316-1461 patients) with 3647 CT PA studies (1.8%). The crude frequency of CT PA scans among the hospitals ranged from 0% to 6.2% (median, 1.6%); more than 90% of the hospitals performed CT PA in less than 3% of their patients. The mean hospital-level PE positivity rate was 12.3% (median, 9.1%); there was no significant difference in PE positivity rate across low through high CT PA use tertiles (11.3%, 11.9%, 12.9%, P = .37). After adjustment for hospital- and patient-level factors, the remaining amount of interhospital variation was relatively low (ICC, 9.0%). Conclusion: Limited interhospital variation in use and yield of in-hospital CT PA was observed among patients undergoing TKR or THR in the United States. © RSNA, 2016. Online supplemental material is available for this article.
- Published
- 2016