Descriptor: "Electronic Health Records" / Journal: journal of the american medical informatics association - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Electronic Health Records"' showing total 965 results

1. A comparative study of large language model-based zero-shot inference and task-specific supervised classification of breast cancer pathology reports

Author: Sushil, Madhumita, Zack, Travis, Mandair, Divneet, Zheng, Zhiwei, Wali, Ahmed, Yu, Yan-Ning, Quan, Yuwei, Lituiev, Dmytro, and Butte, Atul J
Subjects: Information and Computing Sciences, Machine Learning, Bioengineering, Cancer, Women's Health, Networking and Information Technology R&D (NITRD), Breast Cancer, 2.5 Research design and methodologies (aetiology), Humans, Breast Neoplasms, Female, Supervised Machine Learning, Natural Language Processing, Datasets as Topic, Electronic Health Records, Data Mining, electronic health records, large language models, breast cancer, pathology, natural language processing, Engineering, Medical and Health Sciences, Medical Informatics, Biomedical and clinical sciences, Health sciences, Information and computing sciences
Abstract: Objective Although supervised machine learning is popular for information extraction from clinical notes, creating large annotated datasets requires extensive domain expertise and is time-consuming. Meanwhile, large language models (LLMs) have demonstrated promising transfer learning capability. In this study, we explored whether recent LLMs could reduce the need for large-scale data annotations. Materials and methods We curated a dataset of 769 breast cancer pathology reports, manually labeled with 12 categories, to compare zero-shot classification capability of the following LLMs: GPT-4, GPT-3.5, Starling, and ClinicalCamel, with task-specific supervised classification performance of 3 models: random forests, long short-term memory networks with attention (LSTM-Att), and the UCSF-BERT model. Results Across all 12 tasks, the GPT-4 model performed either significantly better than or as well as the best supervised model, LSTM-Att (average macro F1-score of 0.86 vs 0.75), with advantage on tasks with high label imbalance. Other LLMs demonstrated poor performance. Frequent GPT-4 error categories included incorrect inferences from multiple samples and from history, and complex task design, and several LSTM-Att errors were related to poor generalization to the test set. Discussion On tasks where large annotated datasets cannot be easily collected, LLMs can reduce the burden of data labeling. However, if the use of LLMs is prohibitive, the use of simpler models with large annotated datasets can provide comparable results. Conclusions GPT-4 demonstrated the potential to speed up the execution of clinical NLP studies by reducing the need for large annotated datasets. This may increase the utilization of NLP-based variables and outcomes in clinical studies.
Published: 2024

2. Biomedical blockchain with practical implementations and quantitative evaluations: a systematic review

Author: Lacson, Roger, Yu, Yufei, Kuo, Tsung-Ting, and Ohno-Machado, Lucila
Subjects: Health Services and Systems, Health Sciences, Information and Computing Sciences, Data Management and Data Science, Distributed Computing and Systems Software, Networking and Information Technology R&D (NITRD), Generic health relevance, Good Health and Well Being, Blockchain, Humans, Information Dissemination, COVID-19, blockchain, biomedical, electronic health records, implementation, evaluation, Engineering, Medical and Health Sciences, Medical Informatics, Biomedical and clinical sciences, Health sciences, Information and computing sciences
Abstract: Objective Blockchain has emerged as a potential data-sharing structure in healthcare because of its decentralization, immutability, and traceability. However, its use in the biomedical domain is yet to be investigated comprehensively, especially from the aspects of implementation and evaluation, by existing blockchain literature reviews. To address this, our review assesses blockchain applications implemented in practice and evaluated with quantitative metrics. Materials and methods This systematic review adapts the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework to review biomedical blockchain papers published by August 2023 from 3 databases. Blockchain application, implementation, and evaluation metrics were collected and summarized. Results Following screening, 11 articles were included in this review. Articles spanned a range of biomedical applications including COVID-19 medical data sharing, decentralized internet of things (IoT) data storage, clinical trial management, biomedical certificate storage, electronic health record (EHR) data sharing, and distributed predictive model generation. Only one article demonstrated blockchain deployment at a medical facility. Discussion Ethereum was the most common blockchain platform. All but one implementation was developed with private network permissions. Also, 8 articles contained storage speed metrics and 6 contained query speed metrics. However, inconsistencies in presented metrics and the small number of articles included limit technological comparisons with each other. Conclusion While blockchain demonstrates feasibility for adoption in healthcare, it is not as popular as currently existing technologies for biomedical data management. Addressing implementation and evaluation factors will better showcase blockchain's practical benefits, enabling blockchain to have a significant impact on the health sector.
Published: 2024

3. "Goldmine" or "big mess"? An interview study on the challenges of designing, operating, and ensuring the durability of Clinical Data Warehouses in France and Belgium.

Author: Priou, Sonia, Kempf, Emmanuelle, Jankovic, Marija, and Lamé, Guillaume
Abstract: Objectives Clinical Data Warehouses (CDW) are the designated infrastructures to enable access and analysis of large quantities of electronic health record data. Building and managing such systems implies extensive "data work" and coordination between multiple stakeholders. Our study focuses on the challenges these stakeholders face when designing, operating, and ensuring the durability of CDWs for research. Materials and Methods We conducted semistructured interviews with 21 professionals working with CDWs from France and Belgium. All interviews were recorded, transcribed verbatim, and coded inductively. Results Prompted by the AI boom, healthcare institutions launched initiatives to repurpose data they were generating for care without a clear vision of how to generate value. Difficulties in operating CDWs arose quickly, strengthened by the multiplicity and diversity of stakeholders involved and grand discourses on the possibilities of CDWs, disjointed from their actual capabilities. Without proper management of the information flows, stakeholders struggled to build a shared vision. This was evident in our interviewees' contrasting appreciations of what mattered most to ensure data quality. Participants explained they struggled to manage knowledge inside and across institutions, generating knowledge loss, repeated mistakes, and impeding progress locally and nationally. Discussion and conclusion Management issues strongly affect the deployment and operation of CDWs. This may stem from a simplistic linear vision of how this type of infrastructure operates. CDWs remain promising for research, and their design, implementation, and operation require careful management if they are to be successful. Building on innovation management, complex systems, and organizational learning knowledge will help. [ABSTRACT FROM AUTHOR]
Published: 2024

4. CACER: Clinical concept Annotations for Cancer Events and Relations.

Author: Fu, Yujuan Velvin, Ramachandran, Giridhar Kaushik, Halwani, Ahmad, McInnes, Bridget T, Xia, Fei, Lybarger, Kevin, Yetisgen, Meliha, and Uzuner, Özlem
Abstract: Objective Clinical notes contain unstructured representations of patient histories, including the relationships between medical problems and prescription drugs. To investigate the relationship between cancer drugs and their associated symptom burden, we extract structured, semantic representations of medical problem and drug information from the clinical narratives of oncology notes. Materials and Methods We present Clinical concept Annotations for Cancer Events and Relations (CACER), a novel corpus with fine-grained annotations for over 48 000 medical problems and drug events and 10 000 drug-problem and problem-problem relations. Leveraging CACER, we develop and evaluate transformer-based information extraction models such as Bidirectional Encoder Representations from Transformers (BERT), Fine-tuned Language Net Text-To-Text Transfer Transformer (Flan-T5), Large Language Model Meta AI (Llama3), and Generative Pre-trained Transformers-4 (GPT-4) using fine-tuning and in-context learning (ICL). Results In event extraction, the fine-tuned BERT and Llama3 models achieved the highest performance at 88.2-88.0 F1, which is comparable to the inter-annotator agreement (IAA) of 88.4 F1. In relation extraction, the fine-tuned BERT, Flan-T5, and Llama3 achieved the highest performance at 61.8-65.3 F1. GPT-4 with ICL achieved the worst performance across both tasks. Discussion The fine-tuned models significantly outperformed GPT-4 in ICL, highlighting the importance of annotated training data and model optimization. Furthermore, the BERT models performed similarly to Llama3. For our task, large language models offer no performance advantage over the smaller BERT models. Conclusions We introduce CACER, a novel corpus with fine-grained annotations for medical problems, drugs, and their relationships in clinical narratives of oncology notes. State-of-the-art transformer models achieved performance comparable to IAA for several extraction tasks. [ABSTRACT FROM AUTHOR]
Published: 2024

5. Implementation and delivery of electronic health records training programs for nurses working in inpatient settings: a scoping review.

Author: Nguyen, Oliver T, Vo, Steven D, Lee, Taeheon, Cato, Kenrick D, and Cho, Hwayoung
Abstract: Objectives Well-designed electronic health records (EHRs) training programs for clinical practice are known to be valuable. Training programs should be role-specific and there is a need to identify key implementation factors of EHR training programs for nurses. This scoping review (1) characterizes the EHR training programs used and (2) identifies their implementation facilitators and barriers. Materials and Methods We searched MEDLINE, CINAHL, PsycINFO, and Web of Science on September 3, 2023, for peer-reviewed articles that described EHR training program implementation or delivery to nurses in inpatient settings without any date restrictions. We mapped implementation factors to the Consolidated Framework for Implementation Research. Additional themes were inductively identified by reviewing these findings. Results This review included 30 articles. Healthcare systems' approaches to implementing and delivering EHR training programs were highly varied. For implementation factors, we observed themes in innovation (eg, ability to practice EHR skills after training is over, personalizing training, training pace), inner setting (eg, availability of computers, clear documentation requirements and expectations), individual (eg, computer literacy, learning preferences), and implementation process (eg, trainers and support staff hold nursing backgrounds, establishing process for dissemination of EHR updates). No themes in the outer setting were observed. Discussion We found that multilevel factors can influence the implementation and delivery of EHR training programs for inpatient nurses. Several areas for future research were identified, such as evaluating nurse preceptorship models and developing training programs for ongoing EHR training (eg, in response to new EHR workflows or features). Conclusions This scoping review highlighted numerous factors pertaining to training interventions, healthcare systems, and implementation approaches. 
Meanwhile, it is unclear how external factors outside of a healthcare system influence EHR training programs. Additional studies are needed that focus on EHR retraining programs, comparing outcomes of different training models, and how to effectively disseminate updates with the EHR to nurses. [ABSTRACT FROM AUTHOR]
Published: 2024

6. Reliable generation of privacy-preserving synthetic electronic health record time series via diffusion models.

Author: Tian, Muhang, Chen, Bernie, Guo, Allan, Jiang, Shiyi, and Zhang, Anru R
Abstract: Objective Electronic health records (EHRs) are rich sources of patient-level data, offering valuable resources for medical data analysis. However, privacy concerns often restrict access to EHRs, hindering downstream analysis. Current EHR deidentification methods are flawed and can lead to potential privacy leakage. Additionally, existing publicly available EHR databases are limited, preventing the advancement of medical research using EHR. This study aims to overcome these challenges by generating realistic and privacy-preserving synthetic EHR time series efficiently. Materials and Methods We introduce a new method for generating diverse and realistic synthetic EHR time series data using denoising diffusion probabilistic models. We conducted experiments on 6 databases: Medical Information Mart for Intensive Care III and IV, the eICU Collaborative Research Database (eICU), and non-EHR datasets on Stocks and Energy. We compared our proposed method with 8 existing methods. Results Our results demonstrate that our approach significantly outperforms all existing methods in terms of data fidelity while requiring less training effort. Additionally, data generated by our method yield a lower discriminative accuracy compared to other baseline methods, indicating the proposed method can generate data with less privacy risk. Discussion The proposed model utilizes a mixed diffusion process to generate realistic synthetic EHR samples that protect patient privacy. This method could be useful in tackling data availability issues in the field of healthcare by reducing barriers to EHR access and supporting research in machine learning for health. Conclusion The proposed diffusion model-based method can reliably and efficiently generate synthetic EHR time series, which facilitates the downstream medical data analysis. Our numerical results show the superiority of the proposed method over all other existing methods. [ABSTRACT FROM AUTHOR]
Published: 2024

7. Towards cross-application model-agnostic federated cohort discovery.

Author: Dobbins, Nicholas J, Morris, Michele, Sadhu, Eugene, MacFadden, Douglas, Nazaire, Marc-Danie, Simons, William, Weber, Griffin, Murphy, Shawn, and Visweswaran, Shyam
Abstract: Objectives To demonstrate that 2 popular cohort discovery tools, Leaf and the Shared Health Research Information Network (SHRINE), are readily interoperable. Specifically, we adapted Leaf to interoperate and function as a node in a federated data network that uses SHRINE and dynamically generate queries for heterogeneous data models. Materials and Methods SHRINE queries are designed to run on the Informatics for Integrating Biology & the Bedside (i2b2) data model. We created functionality in Leaf to interoperate with a SHRINE data network and dynamically translate SHRINE queries to other data models. We randomly selected 500 past queries from the SHRINE-based national Evolve to Next-Gen Accrual to Clinical Trials (ENACT) network for evaluation, and an additional 100 queries to refine and debug Leaf's translation functionality. We created a script for Leaf to convert the terms in the SHRINE queries into equivalent structured query language (SQL) concepts, which were then executed on 2 other data models. Results and Discussion 91.1% of the generated queries for non-i2b2 models returned counts within 5% (or ±5 patients for counts under 100) of i2b2, with 91.3% recall. Of the 8.9% of queries that exceeded the 5% margin, 77 of 89 (86.5%) were due to errors introduced by the Python script or the extract-transform-load process, which are easily fixed in a production deployment. The remaining errors were due to Leaf's translation function, which was later fixed. Conclusion Our results support that cohort discovery applications such as Leaf and SHRINE can interoperate in federated data networks with heterogeneous data models. [ABSTRACT FROM AUTHOR]
Published: 2024

8. Extraction of sleep information from clinical notes of Alzheimer's disease patients using natural language processing.

Author: Sivarajkumar, Sonish, Tam, Thomas Yu Chow, Mohammad, Haneef Ahamed, Viggiano, Samuel, Oniani, David, Visweswaran, Shyam, and Wang, Yanshan
Abstract: Objectives Alzheimer's disease (AD) is the most common form of dementia in the United States. Sleep is one of the lifestyle-related factors that has been shown critical for optimal cognitive function in old age. However, there is a lack of research studying the association between sleep and AD incidence. A major bottleneck for conducting such research is that the traditional way to acquire sleep information is time-consuming, inefficient, non-scalable, and limited to patients' subjective experience. We aim to automate the extraction of specific sleep-related patterns, such as snoring, napping, poor sleep quality, daytime sleepiness, night wakings, other sleep problems, and sleep duration, from clinical notes of AD patients. These sleep patterns are hypothesized to play a role in the incidence of AD, providing insight into the relationship between sleep and AD onset and progression. Materials and Methods A gold standard dataset is created from manual annotation of 570 randomly sampled clinical note documents from the adSLEEP, a corpus of 192 000 de-identified clinical notes of 7266 AD patients retrieved from the University of Pittsburgh Medical Center (UPMC). We developed a rule-based natural language processing (NLP) algorithm, machine learning models, and large language model (LLM)-based NLP algorithms to automate the extraction of sleep-related concepts, including snoring, napping, sleep problem, bad sleep quality, daytime sleepiness, night wakings, and sleep duration, from the gold standard dataset. Results The annotated dataset of 482 patients comprised a predominantly White (89.2%), older adult population with an average age of 84.7 years, where females represented 64.1%, and a vast majority were non-Hispanic or Latino (94.6%). Rule-based NLP algorithm achieved the best performance of F1 across all sleep-related concepts. 
In terms of positive predictive value (PPV), the rule-based NLP algorithm achieved the highest PPV scores for daytime sleepiness (1.00) and sleep duration (1.00), while the machine learning models had the highest PPV for napping (0.95) and bad sleep quality (0.86), and LLAMA2 with finetuning had the highest PPV for night wakings (0.93) and sleep problem (0.89). Discussion Although sleep information is infrequently documented in the clinical notes, the proposed rule-based NLP algorithm and LLM-based NLP algorithms still achieved promising results. In comparison, the machine learning-based approaches did not achieve good results, which is due to the small size of sleep information in the training data. Conclusion The results show that the rule-based NLP algorithm consistently achieved the best performance for all sleep concepts. This study focused on the clinical notes of patients with AD but could be extended to general sleep information extraction for other diseases. [ABSTRACT FROM AUTHOR]
Published: 2024

9. Disambiguation of acronyms in clinical narratives with large language models.

Author: Kugic, Amila, Schulz, Stefan, and Kreuzthaler, Markus
Abstract: Objective To assess the performance of large language models (LLMs) for zero-shot disambiguation of acronyms in clinical narratives. Materials and Methods Clinical narratives in English, German, and Portuguese were applied for testing the performance of four LLMs: GPT-3.5, GPT-4, Llama-2-7b-chat, and Llama-2-70b-chat. For English, the anonymized Clinical Abbreviation Sense Inventory (CASI, University of Minnesota) was used. For German and Portuguese, at least 500 text spans were processed. The output of LLM models, prompted with contextual information, was analyzed to compare their acronym disambiguation capability, grouped by document-level metadata, the source language, and the LLM. Results On CASI, GPT-3.5 achieved 0.91 in accuracy. GPT-4 outperformed GPT-3.5 across all datasets, reaching 0.98 in accuracy for CASI, 0.86 and 0.65 for two German datasets, and 0.88 for Portuguese. Llama models only reached 0.73 for CASI and failed severely for German and Portuguese. Across LLMs, performance decreased from English to German and Portuguese processing languages. There was no evidence that additional document-level metadata had a significant effect. Conclusion For English clinical narratives, acronym resolution by GPT-4 can be recommended to improve readability of clinical text by patients and professionals. For German and Portuguese, better models are needed. Llama models, which are particularly interesting for processing sensitive content on premise, cannot yet be recommended for acronym resolution. [ABSTRACT FROM AUTHOR]
Published: 2024

10. Large language models facilitate the generation of electronic health record phenotyping algorithms.

Author: Yan, Chao, Ong, Henry H, Grabowska, Monika E, Krantz, Matthew S, Su, Wu-Chen, Dickson, Alyson L, Peterson, Josh F, Feng, QiPing, Roden, Dan M, Stein, C Michael, Kerchberger, V Eric, Malin, Bradley A, and Wei, Wei-Qi
Abstract: Objectives Phenotyping is a core task in observational health research utilizing electronic health records (EHRs). Developing an accurate algorithm demands substantial input from domain experts, involving extensive literature review and evidence synthesis. This burdensome process limits scalability and delays knowledge discovery. We investigate the potential for leveraging large language models (LLMs) to enhance the efficiency of EHR phenotyping by generating high-quality algorithm drafts. Materials and Methods We prompted four LLMs—GPT-4 and GPT-3.5 of ChatGPT, Claude 2, and Bard—in October 2023, asking them to generate executable phenotyping algorithms in the form of SQL queries adhering to a common data model (CDM) for three phenotypes (ie, type 2 diabetes mellitus, dementia, and hypothyroidism). Three phenotyping experts evaluated the returned algorithms across several critical metrics. We further implemented the top-rated algorithms and compared them against clinician-validated phenotyping algorithms from the Electronic Medical Records and Genomics (eMERGE) network. Results GPT-4 and GPT-3.5 exhibited significantly higher overall expert evaluation scores in instruction following, algorithmic logic, and SQL executability, when compared to Claude 2 and Bard. Although GPT-4 and GPT-3.5 effectively identified relevant clinical concepts, they exhibited immature capability in organizing phenotyping criteria with the proper logic, leading to phenotyping algorithms that were either excessively restrictive (with low recall) or overly broad (with low positive predictive values). Conclusion GPT versions 3.5 and 4 are capable of drafting phenotyping algorithms by identifying relevant clinical criteria aligned with a CDM. However, expertise in informatics and clinical experience is still required to assess and further refine generated algorithms. [ABSTRACT FROM AUTHOR]
Published: 2024

11. Clinical risk prediction using language models: benefits and considerations.

Author: Acharya, Angeela, Shrestha, Sulabh, Chen, Anyi, Conte, Joseph, Avramovic, Sanja, Sikdar, Siddhartha, Anastasopoulos, Antonios, and Das, Sanmay
Abstract: Objective The use of electronic health records (EHRs) for clinical risk prediction is on the rise. However, in many practical settings, the limited availability of task-specific EHR data can restrict the application of standard machine learning pipelines. In this study, we investigate the potential of leveraging language models (LMs) as a means to incorporate supplementary domain knowledge for improving the performance of various EHR-based risk prediction tasks. Methods We propose two novel LM-based methods, namely "LLaMA2-EHR" and "Sent-e-Med." Our focus is on utilizing the textual descriptions within structured EHRs to make risk predictions about future diagnoses. We conduct a comprehensive comparison with previous approaches across various data types and sizes. Results Experiments across 6 different methods and 3 separate risk prediction tasks reveal that employing LMs to represent structured EHRs, such as diagnostic histories, results in significant performance improvements when evaluated using standard metrics such as area under the receiver operating characteristic (ROC) curve and precision-recall (PR) curve. Additionally, they offer benefits such as few-shot learning, the ability to handle previously unseen medical concepts, and adaptability to various medical vocabularies. However, it is noteworthy that outcomes may exhibit sensitivity to a specific prompt. Conclusion LMs encompass extensive embedded knowledge, making them valuable for the analysis of EHRs in the context of risk prediction. Nevertheless, it is important to exercise caution in their application, as ongoing safety concerns related to LMs persist and require continuous consideration. [ABSTRACT FROM AUTHOR]
Published: 2024

12. Association of physician burnout with perceived EHR work stress and potentially actionable factors.

Author: Tai-Seale, Ming, Baxter, Sally, Millen, Marlene, Çelebi, Julie, Polston, Gregory, Sun, Bryan, Gross, Erin, Helsten, Teresa, Rosen, Rebecca, Clay, Brian, Sinsky, Christine, Ziedonis, Douglas, Longhurst, Christopher, Savides, Thomas, Zisook, Sidney, and Cheung, Michael
Subjects: electronic health records, medical informatics, physicians, prescription drugs, professional burnout
Abstract: OBJECTIVE: Physicians of all specialties experienced unprecedented stressors during the COVID-19 pandemic, exacerbating preexisting burnout. We examine burnout's association with perceived and actionable electronic health record (EHR) workload factors and personal, professional, and organizational characteristics with the goal of identifying levers that can be targeted to address burnout. MATERIALS AND METHODS: Survey of physicians of all specialties in an academic health center, using a standard measure of burnout, self-reported EHR work stress, and EHR-based work assessed by the number of messages regarding prescription reauthorization and use of a staff pool to triage messages. Descriptive and multivariable regression analyses examined the relationship among burnout, perceived EHR work stress, and actionable EHR work factors. RESULTS: Of 1038 eligible physicians, 627 responded (60% response rate); 49.8% reported burnout symptoms. Logistic regression analysis suggests that higher odds of burnout are associated with physicians feeling higher level of EHR stress (odds ratio [OR], 1.15; 95% confidence interval [CI], 1.07-1.25), having more prescription reauthorization messages (OR, 1.23; 95% CI, 1.04-1.47), not feeling valued (OR, 3.38; 95% CI, 1.69-7.22) or aligned in values with clinic leaders (OR, 2.81; 95% CI, 1.87-4.27), in medical practice for ≤15 years (OR, 2.57; 95% CI, 1.63-4.12), and sleeping for
Published: 2023

13. Comparing penalization methods for linear models on large observational health data.

Author: Fridgeirsson, Egill A, Williams, Ross, Rijnbeek, Peter, Suchard, Marc A, and Reps, Jenna M
Abstract: Objective This study evaluates regularization variants in logistic regression (L1, L2, ElasticNet, Adaptive L1, Adaptive ElasticNet, Broken adaptive ridge [BAR], and Iterative hard thresholding [IHT]) for discrimination and calibration performance, focusing on both internal and external validation. Materials and Methods We use data from 5 US claims and electronic health record databases and develop models for various outcomes in a major depressive disorder patient population. We externally validate all models in the other databases. We use a train-test split of 75%/25% and evaluate performance with discrimination and calibration. Statistical analysis for difference in performance uses Friedman's test and critical difference diagrams. Results Of the 840 models we develop, L1 and ElasticNet emerge as superior in both internal and external discrimination, with a notable AUC difference. BAR and IHT show the best internal calibration, without a clear external calibration leader. ElasticNet typically has larger model sizes than L1. Methods like IHT and BAR, while slightly less discriminative, significantly reduce model complexity. Conclusion L1 and ElasticNet offer the best discriminative performance in logistic regression for healthcare predictions, maintaining robustness across validations. For simpler, more interpretable models, L0-based methods (IHT and BAR) are advantageous, providing greater parsimony and calibration with fewer features. This study aids in selecting suitable regularization techniques for healthcare prediction models, balancing performance, complexity, and interpretability. [ABSTRACT FROM AUTHOR]
Published: 2024

14. To weight or not to weight? The effect of selection bias in 3 large electronic health record-linked biobanks and recommendations for practice.

Author: Salvatore, Maxwell, Kundu, Ritoban, Shi, Xu, Friese, Christopher R, Lee, Seunggeun, Fritsche, Lars G, Mondul, Alison M, Hanauer, David, Pearce, Celeste Leigh, and Mukherjee, Bhramar
Abstract: Objectives To develop recommendations regarding the use of weights to reduce selection bias for commonly performed analyses using electronic health record (EHR)-linked biobank data. Materials and methods We mapped diagnosis (ICD code) data to standardized phecodes from 3 EHR-linked biobanks with varying recruitment strategies: All of Us (AOU; n = 244 071), Michigan Genomics Initiative (MGI; n = 81 243), and UK Biobank (UKB; n = 401 167). Using 2019 National Health Interview Survey data, we constructed selection weights for AOU and MGI to represent the US adult population more. We used weights previously developed for UKB to represent the UKB-eligible population. We conducted 4 common analyses comparing unweighted and weighted results. Results For AOU and MGI, estimated phecode prevalences decreased after weighting (weighted-unweighted median phecode prevalence ratio [MPR]: 0.82 and 0.61), while UKB estimates increased (MPR: 1.06). Weighting minimally impacted latent phenome dimensionality estimation. Comparing weighted versus unweighted phenome-wide association study for colorectal cancer, the strongest associations remained unaltered, with considerable overlap in significant hits. Weighting affected the estimated log-odds ratio for sex and colorectal cancer to align more closely with national registry-based estimates. Discussion Weighting had a limited impact on dimensionality estimation and large-scale hypothesis testing but impacted prevalence and association estimation. When interested in estimating effect size, specific signals from untargeted association analyses should be followed up by weighted analysis. Conclusion EHR-linked biobanks should report recruitment and selection mechanisms and provide selection weights with defined target populations. Researchers should consider their intended estimands, specify source and target populations, and weight EHR-linked biobank analyses accordingly. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
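
Illustrative note on the record above: the abstract reports a weighted-unweighted median phecode prevalence ratio (MPR). The sketch below shows one way such a quantity could be computed. It reads the MPR as the median, across phecodes, of weighted prevalence divided by unweighted prevalence; that reading, the function names, the phecode labels, and the toy data are all assumptions for illustration, not the authors' code.

```python
from statistics import median

def weighted_prevalence(flags, weights):
    """Share of total selection weight carried by participants with the phecode."""
    return sum(w for f, w in zip(flags, weights) if f) / sum(weights)

def median_prevalence_ratio(phecodes, weights):
    """Median over phecodes of weighted / unweighted prevalence.

    Assumes every phecode has at least one case, so the unweighted
    prevalence is never zero.
    """
    ratios = []
    for flags in phecodes.values():
        unweighted = sum(flags) / len(flags)
        ratios.append(weighted_prevalence(flags, weights) / unweighted)
    return median(ratios)

# Hypothetical toy cohort: 4 participants, 3 made-up phecodes (1 = has the code).
weights = [0.5, 0.5, 1.5, 1.5]          # hypothetical selection weights
phecodes = {
    "A": [1, 1, 0, 0],                  # over-represented cases -> ratio 0.5
    "B": [0, 0, 1, 1],                  # under-represented cases -> ratio 1.5
    "C": [1, 0, 1, 0],                  # balanced -> ratio 1.0
}
mpr = median_prevalence_ratio(phecodes, weights)
```

A ratio below 1 means weighting lowered the estimated prevalence, which is the direction the abstract reports for AOU and MGI.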

15. Development of a multimodal geomarker pipeline to assess the impact of social, economic, and environmental factors on pediatric health outcomes.

Author: Manning, Erika Rasnick, Duan, Qing, Taylor, Stuart, Ray, Sarah, Corley, Alexandra M S, Michael, Joseph, Gillette, Ryan, Unaka, Ndidi, Hartley, David, Beck, Andrew F, Brokamp, Cole, and Team, RISEUP Research
Abstract: Objectives We sought to create a computational pipeline for attaching geomarkers, contextual or geographic measures that influence or predict health, to electronic health records at scale, including developing a tool for matching addresses to parcels to assess the impact of housing characteristics on pediatric health. Materials and Methods We created a geomarker pipeline to link residential addresses from hospital admissions at Cincinnati Children's Hospital Medical Center (CCHMC) between July 2016 and June 2022 to place-based data. Linkage methods included linkage by date of admission, geocoding to census tract, street range geocoding, and probabilistic address matching. We assessed 4 methods for probabilistic address matching. Results We characterized 124 244 hospitalizations experienced by 69 842 children admitted to CCHMC. Of the 55 684 hospitalizations with residential addresses in Hamilton County, Ohio, all were matched to 7 temporal geomarkers, 97% were matched to 79 census tract-level geomarkers and 13 point-level geomarkers, and 75% were matched to 16 parcel-level geomarkers. Parcel-level geomarkers were linked using our exact address matching tool developed using the best-performing linkage method. Discussion Our multimodal geomarker pipeline provides a reproducible framework for attaching place-based data to health data while maintaining data privacy. This framework can be applied to other populations and in other regions. We also created a tool for address matching that democratizes parcel-level data to advance precision population health efforts. Conclusion We created an open framework for multimodal geomarker assessment by harmonizing and linking a set of over 100 geomarkers to hospitalization data, enabling assessment of links between geomarkers and hospital admissions. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

16. Bottom-up and top-down paradigms of artificial intelligence research approaches to healthcare data science using growing real-world big data

Author: Wang, Michelle, Sushil, Madhumita, Miao, Brenda Y, and Butte, Atul J
Subjects: Information and Computing Sciences, Artificial Intelligence, Networking and Information Technology R&D (NITRD), Good Health and Well Being, Humans, Data Science, Big Data, Delivery of Health Care, Physicians, artificial intelligence computational methods, real-world data, electronic health records, Engineering, Medical and Health Sciences, Medical Informatics, Biomedical and clinical sciences, Health sciences, Information and computing sciences
Abstract: Objectives As the real-world electronic health record (EHR) data continue to grow exponentially, novel methodologies involving artificial intelligence (AI) are increasingly applied to enable efficient data-driven learning and, ultimately, to advance healthcare. Our objective is to provide readers with an understanding of evolving computational methods and help in deciding on methods to pursue. Target audience The sheer diversity of existing methods presents a challenge for health scientists who are beginning to apply computational methods to their research. Therefore, this tutorial is aimed at scientists working with EHR data who are early entrants into the field of applying AI methodologies. Scope This manuscript describes the diverse and growing AI research approaches in healthcare data science and categorizes them into 2 distinct paradigms, the bottom-up and top-down paradigms, to provide health scientists venturing into artificial intelligence research with an understanding of the evolving computational methods and help in deciding on methods to pursue through the lens of real-world healthcare data.
Published: 2023

17. Partner-developed electronic health record tools to facilitate social risk-informed care planning.

Author: Gunn, Rose, Pisciotta, Maura, Gold, Rachel, Bunce, Arwen, Dambrun, Katie, Cottrell, Erika, Hessler, Danielle, Middendorf, Mary, Alvarez, Miguel, Giles, Lydia, and Gottlieb, Laura
Subjects: diabetes mellitus, health information technology, hypertension, participatory design, social determinants of health, social risk, Humans, Electronic Health Records, Social Support, Community Health Centers, Patient Care Planning, Documentation
Abstract: OBJECTIVE: Increased social risk data collection in health care settings presents new opportunities to apply this information to improve patient outcomes. Clinical decision support (CDS) tools can support these applications. We conducted a participatory engagement process to develop electronic health record (EHR)-based CDS tools to facilitate social risk-informed care plan adjustments in community health centers (CHCs). MATERIALS AND METHODS: We identified potential care plan adaptations through systematic reviews of hypertension and diabetes clinical guidelines. The results were used to inform an engagement process in which CHC staff and patients provided feedback on potential adjustments identified in the guideline reviews and on tool form and functions that could help CHC teams implement these suggested adjustments for patients with social risks. RESULTS: Partners universally prioritized tools for social risk screening and documentation. Additional high-priority content included adjusting medication costs and changing follow-up plans based on reported social risks. Most content recommendations reflected partners' interests in encouraging provider-patient dialogue about care plan adaptations specific to patients' social needs. Partners recommended CDS tool functions such as alerts and shortcuts to facilitate and efficiently document social risk-informed care plan adjustments. DISCUSSION AND CONCLUSION: CDS tools were designed to support CHC providers and staff to more consistently tailor care based on information about patients' social context and thereby enhance patients' ability to adhere to care plans. While such adjustments occur on an ad hoc basis in many care settings, these are among the first tools designed both to systematize and document these activities.
Published: 2023

18. Distinct components of alert fatigue in physicians’ responses to a noninterruptive clinical decision support alert

Author: Murad, Douglas A, Tsugawa, Yusuke, Elashoff, David A, Baldwin, Kevin M, and Bell, Douglas S
Subjects: Information and Computing Sciences, Health Services and Systems, Information Systems, Health Sciences, Clinical Research, Health Services, Depression, Mental Health, Good Health and Well Being, Humans, Decision Support Systems, Clinical, Medical Order Entry Systems, Physicians, Electronic Health Records, alert, physicians, depression, clinical decision support, alert fatigue, regression modeling, Engineering, Medical and Health Sciences, Medical Informatics, Biomedical and clinical sciences, Health sciences, Information and computing sciences
Abstract: Objective Clinical decision support (CDS) alerts may improve health care quality but "alert fatigue" can reduce provider responsiveness. We analyzed how the introduction of competing alerts affected provider adherence to a single depression screening alert. Materials and methods We analyzed the audit data from all occurrences of a CDS alert at a large academic health system. For patients who screen positive for depression during ambulatory visits, a noninterruptive alert was presented, offering a number of relevant documentation actions. Alert adherence was defined as the selection of any option offered within the alert. We assessed the effect of competing clinical guidance alerts presented during the same encounter and the total of all CDS alerts that the same provider had seen in the prior 90 days, on the probability of depression screen alert adherence, adjusting for physician and patient characteristics. Results The depression alert fired during 55 649 office visits involving 418 physicians and 40 474 patients over 41 months. After adjustment, physicians who had seen the most alerts in the prior 90 days were much less likely to respond (adjusted OR highest-lowest quartile, 0.38; 95% CI 0.35-0.42; P
Published: 2022

19. Centralized Interactive Phenomics Resource: an integrated online phenomics knowledgebase for health data users.

Author: Honerlaw, Jacqueline, Ho, Yuk-Lam, Fontin, Francesca, Murray, Michael, Galloway, Ashley, Heise, David, Connatser, Keith, Davies, Laura, Gosian, Jeffrey, Maripuri, Monika, Russo, John, Sangar, Rahul, Tanukonda, Vidisha, Zielinski, Edward, Dubreuil, Maureen, Zimolzak, Andrew J, Panickan, Vidul A, Cheng, Su-Chun, Whitbourne, Stacey B, and Gagnon, David R
Abstract: Objective Development of clinical phenotypes from electronic health records (EHRs) can be resource intensive. Several phenotype libraries have been created to facilitate reuse of definitions. However, these platforms vary in target audience and utility. We describe the development of the Centralized Interactive Phenomics Resource (CIPHER) knowledgebase, a comprehensive public-facing phenotype library, which aims to facilitate clinical and health services research. Materials and Methods The platform was designed to collect and catalog EHR-based computable phenotype algorithms from any healthcare system, scale metadata management, facilitate phenotype discovery, and allow for integration of tools and user workflows. Phenomics experts were engaged in the development and testing of the site. Results The knowledgebase stores phenotype metadata using the CIPHER standard, and definitions are accessible through complex searching. Phenotypes are contributed to the knowledgebase via webform, allowing metadata validation. Data visualization tools linking to the knowledgebase enhance user interaction with content and accelerate phenotype development. Discussion The CIPHER knowledgebase was developed in the largest healthcare system in the United States and piloted with external partners. The design of the CIPHER website supports a variety of front-end tools and features to facilitate phenotype development and reuse. Health data users are encouraged to contribute their algorithms to the knowledgebase for wider dissemination to the research community, and to use the platform as a springboard for phenotyping. Conclusion CIPHER is a public resource for all health data users available at https://phenomics.va.ornl.gov/ which facilitates phenotype reuse, development, and dissemination of phenotyping knowledge. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

20. Stressful life events in electronic health records: a scoping review.

Author: Scherbakov, Dmitry, Mollalo, Abolfazl, and Lenert, Leslie
Abstract: Objectives Stressful life events, such as going through divorce, can have an important impact on human health. However, there are challenges in capturing these events in electronic health records (EHR). We conducted a scoping review to answer 2 major questions: how stressful life events are documented in EHR and how they are utilized in research and clinical care. Materials and Methods Three online databases (EBSCOhost platform, PubMed, and Scopus) were searched to identify papers that included information on stressful life events in EHR; paper titles and abstracts were reviewed for relevance by 2 independent reviewers. Results Five hundred fifty-seven unique papers were retrieved, and of these 70 were eligible for data extraction. Most articles (n = 36, 51.4%) were focused on the statistical association between one or several stressful life events and health outcomes, followed by clinical utility (n = 15, 21.4%), extraction of events from free-text notes (n = 12, 17.1%), discussing privacy and other issues of storing life events (n = 5, 7.1%), and new EHR features related to life events (n = 4, 5.7%). The most frequently mentioned stressful life events in the publications were child abuse/neglect, arrest/legal issues, and divorce/relationship breakup. Almost half of the papers (n = 7, 46.7%) that analyzed clinical utility of stressful events were focused on decision support systems for child abuse, while others (n = 7, 46.7%) discussed interventions related to social determinants of health in general. Discussion and Conclusions Few citations are available on the prevalence and use of stressful life events in EHR, reflecting challenges in screening for and storing stressful life events. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

21. Comparison of phenomic profiles in the All of Us Research Program against the US general population and the UK Biobank.

Author: Zeng, Chenjie, Schlueter, David J, Tran, Tam C, Babbar, Anav, Cassini, Thomas, Bastarache, Lisa A, and Denny, Josh C
Abstract: Importance Knowledge gained from cohort studies has dramatically advanced both public and precision health. The All of Us Research Program seeks to enroll 1 million diverse participants who share multiple sources of data, providing unique opportunities for research. It is important to understand the phenomic profiles of its participants to conduct research in this cohort. Objectives More than 280 000 participants have shared their electronic health records (EHRs) in the All of Us Research Program. We aim to understand the phenomic profiles of this cohort through comparisons with those in the US general population and a well-established nation-wide cohort, UK Biobank, and to test whether association results of selected commonly studied diseases in the All of Us cohort were comparable to those in UK Biobank. Materials and Methods We included participants with EHRs in All of Us and participants with health records from UK Biobank. The estimates of prevalence of diseases in the US general population were obtained from the Global Burden of Diseases (GBD) study. We conducted phenome-wide association studies (PheWAS) of 9 commonly studied diseases in both cohorts. Results This study included 287 012 participants from the All of Us EHR cohort and 502 477 participants from the UK Biobank. A total of 314 diseases curated by the GBD were evaluated in All of Us, 80.9% (N = 254) of which were more common in All of Us than in the US general population [prevalence ratio (PR) >1.1, P < 2 × 10⁻⁵]. Among 2515 diseases and phenotypes evaluated in both All of Us and UK Biobank, 85.6% (N = 2152) were more common in All of Us (PR >1.1, P < 2 × 10⁻⁵). The Pearson correlation coefficients of effect sizes from PheWAS between All of Us and UK Biobank were 0.61, 0.50, 0.60, 0.57, 0.40, 0.53, 0.46, 0.47, and 0.24 for ischemic heart diseases, lung cancer, chronic obstructive pulmonary disease, dementia, colorectal cancer, lower back pain, multiple sclerosis, lupus, and cystic fibrosis, respectively. Discussion Despite the differences in prevalence of diseases in All of Us compared to the US general population or the UK Biobank, our study supports that All of Us can facilitate rapid investigation of a broad range of diseases. Conclusion Most diseases were more common in All of Us than in the general US population or the UK Biobank. Results of disease-disease association tests from All of Us are comparable to those estimated in another well-studied national cohort. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
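
Illustrative note on the record above: the abstract relies on two simple quantities, a prevalence ratio (PR) between cohorts and a Pearson correlation between PheWAS effect sizes. The sketch below shows how each could be computed. The function names and all numbers are hypothetical, and the significance test behind the abstract's P-value cutoff is deliberately left out.

```python
from math import sqrt

def prevalence_ratio(cases_a, n_a, cases_b, n_b):
    """Prevalence in cohort A divided by prevalence in cohort B."""
    return (cases_a / n_a) / (cases_b / n_b)

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical counts: 30 cases per 1000 in cohort A, 20 per 1000 in cohort B.
pr = prevalence_ratio(30, 1000, 20, 1000)
more_common_in_a = pr > 1.1   # the PR cutoff quoted in the abstract

# Hypothetical effect sizes for the same 4 phenotypes in two cohorts.
effects_a = [0.10, 0.40, -0.20, 0.30]
effects_b = [0.05, 0.35, -0.10, 0.45]
r = pearson(effects_a, effects_b)
```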

22. Implementation of an electronic health record-integrated instant messaging system in an academic health system.

Author: Kwan, Brian, Bell, John F, Longhurst, Christopher A, Goldhaber, Nicole H, and Clay, Brian
Abstract: Objectives Effective communication among healthcare workers promotes optimal patient outcomes, and its absence is deleterious to them. The advent of electronic health record (EHR)-embedded secure instantaneous messaging systems has provided a new conduit for provider communication. This manuscript describes the experience of one academic medical center with deployment of one such system (Secure Chat). Methods Data were collected on Secure Chat message volume from June 2017 to April 2023. Significant perideployment events were reviewed chronologically. Results After the first coronavirus disease 2019 lockdown in March 2020, messaging use increased by over 25 000 messages per month, with 1.2 million messages sent monthly by April 2023. Comparative features of current communication modalities in healthcare were summarized, highlighting the many advantages of Secure Chat. Conclusions While EHR-embedded secure instantaneous messaging systems represent a novel and potentially valuable communication medium in healthcare, generally agreed-upon best practices for their implementation are, as of yet, undetermined. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

23. Structured and unstructured social risk factor documentation in the electronic health record underestimates patients' self-reported risks.

Author: Iott, Bradley E, Rivas, Samantha, Gottlieb, Laura M, Adler-Milstein, Julia, and Pantell, Matthew S
Abstract: Objectives National attention has focused on increasing clinicians' responsiveness to the social determinants of health, for example, food security. A key step toward designing responsive interventions includes ensuring that information about patients' social circumstances is captured in the electronic health record (EHR). While prior work has assessed levels of EHR "social risk" documentation, the extent to which documentation represents the true prevalence of social risk is unknown. While no gold standard exists to definitively characterize social risks in clinical populations, here we used the best available proxy: social risks reported by patient survey. Materials and Methods We compared survey results to respondents' EHR social risk documentation (clinical free-text notes and International Statistical Classification of Diseases and Related Health Problems [ICD-10] codes). Results Surveys indicated much higher rates of social risk (8.2%-40.9%) than found in structured (0%-2.0%) or unstructured (0%-0.2%) documentation. Discussion Ideally, new care standards that include incentives to screen for social risk will increase the use of documentation tools and clinical teams' awareness of and interventions related to social adversity, while balancing potential screening and documentation burden on clinicians and patients. Conclusion EHR documentation of social risk factors currently underestimates their prevalence. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

24. Data-driven automated classification algorithms for acute health conditions: applying PheNorm to COVID-19 disease.

Author: Smith, Joshua C, Williamson, Brian D, Cronkite, David J, Park, Daniel, Whitaker, Jill M, McLemore, Michael F, Osmanski, Joshua T, Winter, Robert, Ramaprasan, Arvind, Kelley, Ann, Shea, Mary, Wittayanukorn, Saranrat, Stojanovic, Danijela, Zhao, Yueqin, Toh, Sengwee, Johnson, Kevin B, Aronoff, David M, and Carrell, David S
Abstract: Objectives Automated phenotyping algorithms can reduce development time and operator dependence compared to manually developed algorithms. One such approach, PheNorm, has performed well for identifying chronic health conditions, but its performance for acute conditions is largely unknown. Herein, we implement and evaluate PheNorm applied to symptomatic COVID-19 disease to investigate its potential feasibility for rapid phenotyping of acute health conditions. Materials and methods PheNorm is a general-purpose automated approach to creating computable phenotype algorithms based on natural language processing, machine learning, and (low cost) silver-standard training labels. We applied PheNorm to cohorts of potential COVID-19 patients from 2 institutions and used gold-standard manual chart review data to investigate the impact on performance of alternative feature engineering options and implementing externally trained models without local retraining. Results Models achieved AUC, sensitivity, and positive predictive value of 0.853, 0.879, and 0.851 at one institution and 0.804, 0.976, and 0.885 at the other, at quantiles of model-predicted risk that maximize F1. We report performance metrics for all combinations of silver labels, feature engineering options, and models trained internally versus externally. Discussion Phenotyping algorithms developed using PheNorm performed well at both institutions. Performance varied with different silver-standard labels and feature engineering options. Models developed locally at one site also worked well when implemented externally at the other site. Conclusion PheNorm models successfully identified an acute health condition, symptomatic COVID-19. The simplicity of the PheNorm approach allows it to be applied at multiple study sites with substantially reduced overhead compared to traditional approaches. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
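
Illustrative note on the record above: the abstract reports sensitivity and positive predictive value at the cutoff of model-predicted risk that maximizes F1. The sketch below shows one way to pick such a cutoff. It is not PheNorm itself; it uses each observed score as a candidate cutoff rather than quantiles, and the function names and toy data are hypothetical.

```python
def metrics_at(scores, labels, threshold):
    """Sensitivity, PPV, and F1 when scores >= threshold are called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    total = sensitivity + ppv
    f1 = 2 * sensitivity * ppv / total if total else 0.0
    return sensitivity, ppv, f1

def best_f1_threshold(scores, labels):
    """Return the observed score that gives the highest F1 when used as a cutoff."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: metrics_at(scores, labels, t)[2])

# Hypothetical predicted risks and gold-standard chart-review labels.
scores = [0.10, 0.40, 0.35, 0.80]
labels = [0, 0, 1, 1]
threshold = best_f1_threshold(scores, labels)
sensitivity, ppv, f1 = metrics_at(scores, labels, threshold)
```

On this toy data the chosen cutoff captures both true cases at the cost of one false positive, which is the kind of trade-off the F1 criterion balances.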

25. Electronic health record-supported implementation of an evidence-based pathway for perioperative surgical care.

Author: Wu, JunBo, Yuan, Christina T, Moyal-Smith, Rachel, Wick, Elizabeth C, and Rosen, Michael A
Abstract: Objectives Enhanced recovery pathways (ERPs) are evidence-based approaches to improving perioperative surgical care. However, the role of electronic health records (EHRs) in their implementation is unclear. We examine how EHRs facilitate or hinder ERP implementation. Materials and Methods We conducted interviews with informaticians and clinicians from US hospitals participating in an ERP implementation collaborative. We used inductive thematic analysis to analyze transcripts and categorized hospitals into 3 groups based on process measure adherence. High performers exhibited a minimum 80% adherence to 6 of 9 metrics, high improvers demonstrated significantly better adherence over 12 months, and strivers included all others. We mapped interrelationships between themes using causal loop diagrams. Results We interviewed 168 participants from 8 hospitals and found 3 thematic clusters: (1) "EHR difficulties" with the technology itself and contextual factors related to (2) "EHR enablers," and (3) "EHR barriers" in ERP implementation. Although all hospitals experienced issues, high performers and improvers successfully integrated ERPs into EHRs through a dedicated multidisciplinary team with informatics expertise. Strivers, while enacting some fixes, were unable to overcome individual resistance to EHR-supported ERPs. Discussion and Conclusion We add to the literature describing the limitations of EHRs' technological capabilities to facilitate clinical workflows. We illustrate how organizational strategies around engaging motivated clinical teams with informatics training and resources, especially with dedicated technical support, moderate the extent of EHRs' support to ERP implementation, causing downstream effects for hospitals to transform technological challenges into care-improving opportunities. Early and consistent involvement of informatics expertise with frontline EHR clinician users benefited the efficiency and effectiveness of ERP implementation and sustainability. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

26. Digital literacy in undergraduate pharmacy education: a scoping review.

Author: Alowais, Mashael, Rudd, Georgina, Besa, Victoria, Nazar, Hamde, Shah, Tejal, and Tolley, Clare
Abstract: Objectives Conduct a scoping review to identify the approaches used to integrate digital literacy into undergraduate pharmacy programs across different countries, focusing on methods for education, training, and assessment. Materials and methods Following the Joanna Briggs Institute methodology, we searched 5 electronic databases in June 2022: MEDLINE (Ovid), PubMed, Embase, Scopus, and CINAHL. Three independent reviewers screened all articles; data extraction was conducted by 2 reviewers. Any discrepancies were arbitrated by 2 additional reviewers. Results Out of 624 articles, 57 were included in this review. Educational and training approaches for digital literacy in undergraduate pharmacy programs encompassed a theoretical understanding of health informatics, familiarization with diverse digital technologies, and applied informatics in 2 domains: patient-centric care through digital technologies, and the utilization of digital technologies in interprofessional collaboration. Blended pedagogical strategies were commonly employed. Assessment approaches included patient plan development requiring digital information retrieval, critical appraisal of digital tools, live evaluations of telehealth skills, and quizzes and exams on health informatics concepts. External engagement with system developers, suppliers, and other institutes supported successful digital literacy education. Discussion and conclusion This scoping review identifies various learning objectives, teaching, and assessment strategies to incorporate digital literacy in undergraduate pharmacy curricula. Recommendations include acknowledging the evolving digital health landscape, ensuring constructive alignment between learning objectives, teaching approach and assessments, co-development of digital literacy courses with stakeholders, and using standardized guidelines for reporting educational interventions. This study provides practical suggestions for enhancing digital literacy education in undergraduate pharmacy programs. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

27. Semi-supervised ROC analysis for reliable and streamlined evaluation of phenotyping algorithms.

Author: Gao, Jianhui, Bonzel, Clara-Lea, Hong, Chuan, Varghese, Paul, Zakir, Karim, and Gronsbell, Jessica
Abstract: Objective High-throughput phenotyping will accelerate the use of electronic health records (EHRs) for translational research. A critical roadblock is the extensive medical supervision required for phenotyping algorithm (PA) estimation and evaluation. To address this challenge, numerous weakly-supervised learning methods have been proposed. However, there is a paucity of methods for reliably evaluating the predictive performance of PAs when a very small proportion of the data is labeled. To fill this gap, we introduce a semi-supervised approach (ssROC) for estimation of the receiver operating characteristic (ROC) parameters of PAs (eg, sensitivity, specificity). Materials and Methods ssROC uses a small labeled dataset to nonparametrically impute missing labels. The imputations are then used for ROC parameter estimation to yield more precise estimates of PA performance relative to classical supervised ROC analysis (supROC) using only labeled data. We evaluated ssROC with synthetic, semi-synthetic, and EHR data from Mass General Brigham (MGB). Results ssROC produced ROC parameter estimates with minimal bias and significantly lower variance than supROC in the simulated and semi-synthetic data. For the 5 PAs from MGB, the estimates from ssROC are 30% to 60% less variable than supROC on average. Discussion ssROC enables precise evaluation of PA performance without demanding large volumes of labeled data. ssROC is also easily implementable in open-source R software. Conclusion When used in conjunction with weakly-supervised PAs, ssROC facilitates the reliable and streamlined phenotyping necessary for EHR-based research. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

28. Generalizable pipeline for constructing HIV risk prediction models across electronic health record systems.

Author: May, Sarah B, Giordano, Thomas P, and Gottlieb, Assaf
Abstract: Objective The HIV epidemic remains a significant public health issue in the United States. HIV risk prediction models could be beneficial for reducing HIV transmission by helping clinicians identify patients at high risk for infection and refer them for testing. This would facilitate initiation of treatment for those unaware of their status and pre-exposure prophylaxis for those uninfected but at high risk. Existing HIV risk prediction algorithms rely on manual construction of features and are limited in their application across diverse electronic health record systems. Furthermore, the accuracy of these models in predicting HIV in females has thus far been limited. Materials and methods We devised a pipeline for automatic construction of prediction models based on automatic feature engineering to predict HIV risk and tested our pipeline on a local electronic health records system and a national claims dataset. We also compared the performance of general models to female-specific models. Results Our models obtain similarly good performance on both health record datasets despite differences in represented populations and data availability (AUC = 0.87). Furthermore, our general models obtain good performance on females but are also improved by constructing female-specific models (AUC between 0.81 and 0.86 across datasets). Discussion and conclusions We demonstrated that flexible construction of prediction models performs well on HIV risk prediction across diverse health records systems and performs as well in predicting HIV risk in females, making deployment of such models into existing health care systems tangible. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

29. Physician awareness of social determinants of health documentation capability in the electronic health record.

Author: Iott, Bradley, Pantell, Matthew, Adler-Milstein, Julia, and Gottlieb, Laura
Subjects: awareness, documentation, electronic health records, social determinants of health, Humans, Electronic Health Records, Social Determinants of Health, Documentation, Physicians, Community Health Centers
Abstract: Healthcare organizations are increasing social determinants of health (SDH) screening and documentation in the electronic health record (EHR). Physicians may use SDH data for medical decision-making and to provide referrals to social care resources. Physicians must be aware of these data to use them, however, and little is known about physicians' awareness of EHR-based SDH documentation or documentation capabilities. We therefore leveraged national physician survey data to measure level of awareness and variation by physician, practice, and EHR characteristics to inform practice- and policy-based efforts to drive medical-social care integration. We identify higher levels of social needs documentation awareness among physicians practicing in community health centers, those participating in payment models with social care initiatives, and those aware of other advanced EHR functionalities. Findings indicate that there are opportunities to improve physician education and training around new EHR-based SDH functionalities.
Published: 2022

30. Inpatient nurses' preferences and decisions with risk information visualization.

Author: Jeffery, Alvin D, Reale, Carrie, Faiman, Janelle, Borkowski, Vera, Beebe, Russ, Matheny, Michael E, and Anders, Shilo
Abstract: Objective We examined the influence of 4 different risk information formats on inpatient nurses' preferences and decisions with an acute clinical deterioration decision-support system. Materials and methods We conducted a comparative usability evaluation in which participants provided responses to multiple user interface options in a simulated setting. We collected qualitative data using think aloud methods. We collected quantitative data by asking participants which action they would perform after each time point in 3 different patient scenarios. Results More participants (n = 6) preferred the probability format over relative risk ratios (n = 2), absolute differences (n = 2), and number of persons out of 100 (n = 0). Participants liked average lines, having a trend graph to supplement the risk estimate, and consistent colors between trend graphs and possible actions. Participants did not like too much text information or the presence of confidence intervals. From a decision-making perspective, use of the probability format was associated with greater concordance in actions taken by participants compared to the other 3 risk information formats. Discussion By focusing on nurses' preferences and decisions with several risk information display formats and collecting both qualitative and quantitative data, we have provided meaningful insights for the design of clinical decision-support systems containing complex quantitative information. Conclusion This study adds to our knowledge of presenting risk information to nurses within clinical decision-support systems. We encourage those developing risk-based systems for inpatient nurses to consider expressing risk in a probability format and include a graph (with average line) to display the patient's recent trends. [ABSTRACT FROM AUTHOR]
Published: 2024

31. The role of health system penetration rate in estimating the prevalence of type 1 diabetes in children and adolescents using electronic health records.

Author: Li, Piaopiao, Lyu, Tianchen, Alkhuzam, Khalid, Spector, Eliot, Donahoo, William T, Bost, Sarah, Wu, Yonghui, Hogan, William R, Prosperi, Mattia, Schatz, Desmond A, Atkinson, Mark A, Haller, Michael J, Shenkman, Elizabeth A, Guo, Yi, Bian, Jiang, and Shao, Hui
Abstract: Objective Having sufficient population coverage from the electronic health records (EHRs)-connected health system is essential for building a comprehensive EHR-based diabetes surveillance system. This study aimed to establish an EHR-based type 1 diabetes (T1D) surveillance system for children and adolescents across racial and ethnic groups by identifying the minimum population coverage from EHR-connected health systems to accurately estimate T1D prevalence. Materials and methods We conducted a retrospective, cross-sectional analysis involving children and adolescents <20 years old identified from the OneFlorida+ Clinical Research Network (2018-2020). T1D cases were identified using a previously validated computable phenotyping algorithm. The T1D prevalence for each ZIP Code Tabulation Area (ZCTA, 5 digits), defined as the number of T1D cases divided by the total number of residents in the corresponding ZCTA, was calculated. Population coverage for each ZCTA was measured using the observed health system penetration rate (HSPR), calculated as the ratio of residents in the corresponding ZCTA captured by OneFlorida+ to the overall population in the same ZCTA reported by the Census. We used a recursive partitioning algorithm to identify the minimum required observed HSPR to estimate T1D prevalence and compare our estimate with the reported T1D prevalence from the SEARCH study. Results Observed HSPRs of 55%, 55%, and 60% were identified as the minimum thresholds for the non-Hispanic White, non-Hispanic Black, and Hispanic populations. The estimated T1D prevalence for non-Hispanic White and non-Hispanic Black were 2.87 and 2.29 per 1000 youth, which are comparable to the reference study's estimation. The estimated prevalence of T1D for Hispanics (2.76 per 1000 youth) was higher than the reference study's estimation (1.48-1.64 per 1000 youth). The standardized T1D prevalence in the overall Florida population was 2.81 per 1000 youth in 2019. Conclusion Our study provides a method to estimate T1D prevalence in children and adolescents using EHRs and reports the estimated HSPRs and prevalence of T1D for different race and ethnicity groups to facilitate EHR-based diabetes surveillance. [ABSTRACT FROM AUTHOR]
Published: 2024

32. The relationship between electronic health records user interface features and data quality of patient clinical information: an integrative review.

Author: Madandola, Olatunde O, Bjarnadottir, Ragnhildur I, Yao, Yingwei, Ansell, Margaret, Santos, Fabiana Dos, Cho, Hwayoung, Lopez, Karen Dunn, Macieira, Tamara G R, and Keenan, Gail M
Abstract: Objectives Electronic health records (EHRs) user interfaces (UI) designed for data entry can potentially impact the quality of patient information captured in the EHRs. This review identified and synthesized the literature evidence about the relationship between UI features in EHRs and data quality (DQ). Materials and methods We performed an integrative review of research studies by conducting a structured search in 5 databases completed on October 10, 2022. We applied Whittemore & Knafl's methodology to identify literature, extract, and synthesize information, iteratively. We adapted the Kmet et al appraisal tool for the quality assessment of the evidence. The research protocol was registered with PROSPERO (CRD42020203998). Results Eleven studies met the inclusion criteria. The relationship between 1 or more UI features and 1 or more DQ indicators was examined. UI features were classified into 4 categories: 3 types of data capture aids, and other methods of DQ assessment at the UI. The Weiskopf et al measures were used to assess DQ: completeness (n = 10), correctness (n = 10), and currency (n = 3). UI features such as mandatory fields, templates, and contextual autocomplete improved completeness or correctness or both. Measures of currency were scarce. Discussion The paucity of studies on UI features and DQ underscored the limited knowledge in this important area. The UI features examined had both positive and negative effects on DQ. Standardization of data entry and further development of automated algorithmic aids, including adaptive UIs, have great promise for improving DQ. Further research is essential to ensure data captured in our electronic systems are high quality and valid for use in clinical decision-making and other secondary analyses. [ABSTRACT FROM AUTHOR]
Published: 2024

33. Transportability of bacterial infection prediction models for critically ill patients.

Author: Eickelberg, Garrett, Sanchez-Pinto, Lazaro Nelson, Kline, Adrienne Sarah, and Luo, Yuan
Abstract: Objective Bacterial infections (BIs) are common, costly, and potentially life-threatening in critically ill patients. Patients with suspected BIs may require empiric multidrug antibiotic regimens and therefore potentially be exposed to prolonged and unnecessary antibiotics. We previously developed a BI risk model to augment practices and help shorten the duration of unnecessary antibiotics to improve patient outcomes. Here, we have performed a transportability assessment of this BI risk model in 2 tertiary intensive care unit (ICU) settings and a community ICU setting. We additionally explored how simple multisite learning techniques impacted model transportability. Methods Patients suspected of having a community-acquired BI were identified in 3 datasets: Medical Information Mart for Intensive Care III (MIMIC), Northwestern Medicine Tertiary (NM-T) ICUs, and NM "community-based" ICUs. ICU encounters from MIMIC and NM-T datasets were split into 70/30 train and test sets. Models developed on training data were evaluated against the NM-T and MIMIC test sets, as well as NM community validation data. Results During internal validations, models achieved AUROCs of 0.78 (MIMIC) and 0.81 (NM-T) and were well calibrated. In the external community ICU validation, the NM-T model had robust transportability (AUROC 0.81) while the MIMIC model transported less favorably (AUROC 0.74), likely due to case-mix differences. Multisite learning provided no significant discrimination benefit in internal validation studies but offered more stability during transport across all evaluation datasets. Discussion These results suggest that our BI risk models maintain predictive utility when transported to external cohorts. Conclusion Our findings highlight the importance of performing external model validation on myriad clinically relevant populations prior to implementation. [ABSTRACT FROM AUTHOR]
Published: 2024

34. Deep sequential neural network models improve stratification of suicide attempt risk among US veterans.

Author: Martinez, Carianne, Levin, Drew, Jones, Jessica, Finley, Patrick D, McMahon, Benjamin, Dhaubhadel, Sayera, Cohn, Judith, Program, Million Veteran, Workgroup, MVP Suicide Exemplar, Oslin, David W, Kimbrel, Nathan A, and Beckham, Jean C
Abstract: Objective To apply deep neural networks (DNNs) to longitudinal EHR data in order to predict suicide attempt risk among veterans. Local explainability techniques were used to provide explanations for each prediction with the goal of ultimately improving outreach and intervention efforts. Materials and methods The DNNs fused demographic information with diagnostic, prescription, and procedure codes. Models were trained and tested on EHR data of approximately 500 000 US veterans: all veterans with recorded suicide attempts from April 1, 2005, through January 1, 2016, each paired with 5 veterans of the same age who did not attempt suicide. Shapley Additive Explanation (SHAP) values were calculated to provide explanations of DNN predictions. Results The DNNs outperformed logistic and linear regression models in predicting suicide attempts. After adjusting for the sampling technique, the convolutional neural network (CNN) model achieved a positive predictive value (PPV) of 0.54 for suicide attempts within 12 months by veterans in the top 0.1% risk tier. Explainability methods identified meaningful subgroups of high-risk veterans as well as key determinants of suicide attempt risk at both the group and individual level. Discussion and conclusion The deep learning methods employed in the present study have the potential to significantly enhance existing suicide risk models for veterans. These methods can also provide important clues to explore the relative value of long-term and short-term intervention strategies. Furthermore, the explainability methods utilized here could also be used to communicate to clinicians the key features which increase specific veterans' risk for attempting suicide. [ABSTRACT FROM AUTHOR]
Published: 2024

35. Self-supervised machine learning using adult inpatient data produces effective models for pediatric clinical prediction tasks.

Author: Lemmon, Joshua, Guo, Lin Lawrence, Steinberg, Ethan, Morse, Keith E, Fleming, Scott Lanyon, Aftandilian, Catherine, Pfohl, Stephen R, Posada, Jose D, Shah, Nigam, Fries, Jason, and Sung, Lillian
Abstract: Objective Development of electronic health records (EHR)-based machine learning models for pediatric inpatients is challenged by limited training data. Self-supervised learning using adult data may be a promising approach to creating robust pediatric prediction models. The primary objective was to determine whether a self-supervised model trained in adult inpatients was noninferior to logistic regression models trained in pediatric inpatients, for pediatric inpatient clinical prediction tasks. Materials and Methods This retrospective cohort study used EHR data and included patients with at least one admission to an inpatient unit. One admission per patient was randomly selected. Adult inpatients were 18 years or older while pediatric inpatients were more than 28 days and less than 18 years. Admissions were temporally split into training (January 1, 2008 to December 31, 2019), validation (January 1, 2020 to December 31, 2020), and test (January 1, 2021 to August 1, 2022) sets. Primary comparison was a self-supervised model trained in adult inpatients versus count-based logistic regression models trained in pediatric inpatients. Primary outcome was mean area-under-the-receiver-operating-characteristic-curve (AUROC) for 11 distinct clinical outcomes. Models were evaluated in pediatric inpatients. Results When evaluated in pediatric inpatients, mean AUROC of self-supervised model trained in adult inpatients (0.902) was noninferior to count-based logistic regression models trained in pediatric inpatients (0.868) (mean difference = 0.034, 95% CI=0.014-0.057; P < .001 for noninferiority and P = .006 for superiority). Conclusions Self-supervised learning in adult inpatients was noninferior to logistic regression models trained in pediatric inpatients. This finding suggests transferability of self-supervised models trained in adult patients to pediatric patients, without requiring costly model retraining. [ABSTRACT FROM AUTHOR]
Published: 2023

36. Federated and distributed learning applications for electronic health records and structured medical data: a scoping review.

Author: Li, Siqi, Liu, Pinyan, Nascimento, Gustavo G, Wang, Xinru, Leite, Fabio Renato Manzolli, Chakraborty, Bibhas, Hong, Chuan, Ning, Yilin, Xie, Feng, Teo, Zhen Ling, Ting, Daniel Shu Wei, Haddadi, Hamed, Ong, Marcus Eng Hock, Peres, Marco Aurélio, and Liu, Nan
Abstract: Objectives Federated learning (FL) has gained popularity in clinical research in recent years to facilitate privacy-preserving collaboration. Structured data, one of the most prevalent forms of clinical data, has experienced significant growth in volume concurrently, notably with the widespread adoption of electronic health records in clinical practice. This review examines FL applications on structured medical data, identifies contemporary limitations, and discusses potential innovations. Materials and methods We searched 5 databases, SCOPUS, MEDLINE, Web of Science, Embase, and CINAHL, to identify articles that applied FL to structured medical data and reported results following the PRISMA guidelines. Each selected publication was evaluated from 3 primary perspectives, including data quality, modeling strategies, and FL frameworks. Results Out of the 1193 papers screened, 34 met the inclusion criteria, with each article consisting of one or more studies that used FL to handle structured clinical/medical data. Of these, 24 utilized data acquired from electronic health records, with clinical predictions and association studies being the most common clinical research tasks that FL was applied to. Only one article exclusively explored the vertical FL setting, while the remaining 33 explored the horizontal FL setting, with only 14 discussing comparisons between single-site (local) and FL (global) analysis. Conclusions The existing FL applications on structured medical data lack sufficient evaluations of clinically meaningful benefits, particularly when compared to single-site analyses. Therefore, it is crucial for future FL applications to prioritize clinical motivations and develop designs and methodologies that can effectively support and aid clinical practice and research. [ABSTRACT FROM AUTHOR]
Published: 2023

37. A broadly applicable approach to enrich electronic-health-record cohorts by identifying patients with complete data: a multisite evaluation.

Author: Klann, Jeffrey G, Henderson, Darren W, Morris, Michele, Estiri, Hossein, Weber, Griffin M, Visweswaran, Shyam, and Murphy, Shawn N
Abstract: Objective Patients who receive most care within a single healthcare system (colloquially called a "loyalty cohort" since they typically return to the same providers) have mostly complete data within that organization's electronic health record (EHR). Loyalty cohorts have low data missingness, which can unintentionally bias research results. Using proxies of routine care and healthcare utilization metrics, we compute a per-patient score that identifies a loyalty cohort. Materials and Methods We implemented a computable program for the widely adopted i2b2 platform that identifies loyalty cohorts in EHRs based on a machine-learning model, which was previously validated using linked claims data. We developed a novel validation approach, which tests, using only EHR data, whether patients returned to the same healthcare system after the training period. We evaluated these tools at 3 institutions using data from 2017 to 2019. Results Loyalty cohort calculations to identify patients who returned during a 1-year follow-up yielded a mean area under the receiver operating characteristic curve of 0.77 using the original model and 0.80 after calibrating the model at individual sites. Factors such as multiple medications or visits contributed significantly at all sites. Screening tests' contributions (eg, colonoscopy) varied across sites, likely due to coding and population differences. Discussion This open-source implementation of a "loyalty score" algorithm had good predictive power. Enriching research cohorts by utilizing these low-missingness patients is a way to obtain the data completeness necessary for accurate causal analysis. Conclusion i2b2 sites can use this approach to select cohorts with mostly complete EHR data. [ABSTRACT FROM AUTHOR]
Published: 2023

38. Prediction models using artificial intelligence and longitudinal data from electronic health records: a systematic methodological review.

Author: Carrasco-Ribelles, Lucía A, Llanes-Jurado, José, Gallego-Moll, Carlos, Cabrera-Bean, Margarita, Monteagudo-Zaragoza, Mònica, Violán, Concepción, and Zabaleta-del-Olmo, Edurne
Abstract: Objective To describe and appraise the use of artificial intelligence (AI) techniques that can cope with longitudinal data from electronic health records (EHRs) to predict health-related outcomes. Methods This review included studies in any language that used EHRs as at least one of the data sources, collected longitudinal data, used an AI technique capable of handling longitudinal data, and predicted any health-related outcomes. We searched MEDLINE, Scopus, Web of Science, and IEEE Xplore from inception to January 3, 2022. Information on the dataset, prediction task, data preprocessing, feature selection, method, validation, performance, and implementation was extracted and summarized using descriptive statistics. Risk of bias and completeness of reporting were assessed using a short form of PROBAST and TRIPOD, respectively. Results Eighty-one studies were included. Follow-up time and number of registers per patient varied greatly, and most predicted disease development or next event based on diagnoses and drug treatments. Architectures generally were based on Recurrent Neural Networks-like layers, though in recent years combining different layers or transformers has become more popular. About half of the included studies performed hyperparameter tuning and used attention mechanisms. Most performed a single train-test partition and could not correctly assess the variability of the model's performance. Reporting quality was poor, and a third of the studies were at high risk of bias. Conclusions AI models are increasingly using longitudinal data. However, the heterogeneity in reporting methodology and results, and the lack of public EHR datasets and code sharing, complicate the possibility of replication. Registration PROSPERO database (CRD42022331388). [ABSTRACT FROM AUTHOR]
Published: 2023

39. The added value of text from Dutch general practitioner notes in predictive modeling.

Author: Seinen, Tom M, Kors, Jan A, Mulligen, Erik M van, Fridgeirsson, Egill, and Rijnbeek, Peter R
Abstract: Objective This work aims to explore the value of Dutch unstructured data, in combination with structured data, for the development of prognostic prediction models in a general practitioner (GP) setting. Materials and methods We trained and validated prediction models for 4 common clinical prediction problems using various sparse text representations, common prediction algorithms, and observational GP electronic health record (EHR) data. We trained and validated 84 models internally and externally on data from different EHR systems. Results On average, over all the different text representations and prediction algorithms, models only using text data performed better or similar to models using structured data alone in 2 prediction tasks. Additionally, in these 2 tasks, the combination of structured and text data outperformed models using structured or text data alone. No large performance differences were found between the different text representations and prediction algorithms. Discussion Our findings indicate that the use of unstructured data alone can result in well-performing prediction models for some clinical prediction problems. Furthermore, the performance improvement achieved by combining structured and text data highlights the added value. Additionally, we demonstrate the significance of clinical natural language processing research in languages other than English and the possibility of validating text-based prediction models across various EHR systems. Conclusion Our study highlights the potential benefits of incorporating unstructured data in clinical prediction models in a GP setting. Although the added value of unstructured data may vary depending on the specific prediction task, our findings suggest that it has the potential to enhance patient care. [ABSTRACT FROM AUTHOR]
Published: 2023

40. Electronic health records and clinical documentation in medical residency programs: preparing residents to become master clinicians.

Author: Anderson, Chad, Kaul, Mala, Gullapalli, Nageshwara, and Pitani, Sujatha
Abstract: Objective The ubiquity of electronic health records (EHRs) has made incorporating EHRs into medical practice an essential component of residents' training. Patient encounters, an important element of practice, are impacted by EHRs through factors that include increasing documentation requirements. This research sheds light on the role of EHRs in resident clinical skills development with emphasis on their role in patient encounters. Materials and Methods We conducted qualitative semistructured interviews with 32 residents and 13 clinic personnel at an internal medicine residency program in a western US medical school focusing on the residents' clinic rotation. Results Residents were learning to use the EHR to support and enhance their patient encounters, but one factor making that more challenging for many was the need to address quality measures. Quality measures could shift attention away from the primary reason for the encounter and addressing them consumed time that could have been spent diagnosing and treating the patient's chief complaint. A willingness to learn on-the-job by asking questions was important for resident development in using the EHR to support their work and improve their clinical skills. Discussion Creating a culture where residents seek guidance on how to use the EHR and incorporate it into their work will support residents on their journey to become master clinicians. Shifting some documentation to the patient and other clinicians may also be necessary to keep from overburdening residents. Conclusion Residency programs must support residents as they develop their clinical skills to practice in a world where EHRs are ubiquitous. [ABSTRACT FROM AUTHOR]
Published: 2023

41. LeafAI: query generator for clinical cohort discovery rivaling a human programmer.

Author: Dobbins, Nicholas J, Han, Bin, Zhou, Weipeng, Lan, Kristine F, Kim, H Nina, Harrington, Robert, Uzuner, Özlem, and Yetisgen, Meliha
Abstract: Objective Identifying study-eligible patients within clinical databases is a critical step in clinical research. However, accurate query design typically requires extensive technical and biomedical expertise. We sought to create a system capable of generating data model-agnostic queries while also providing novel logical reasoning capabilities for complex clinical trial eligibility criteria. Materials and Methods The task of query creation from eligibility criteria requires solving several text-processing problems, including named entity recognition and relation extraction, sequence-to-sequence transformation, normalization, and reasoning. We incorporated hybrid deep learning and rule-based modules for these, as well as a knowledge base of the Unified Medical Language System (UMLS) and linked ontologies. To enable data-model agnostic query creation, we introduce a novel method for tagging database schema elements using UMLS concepts. To evaluate our system, called LeafAI, we compared the capability of LeafAI to a human database programmer to identify patients who had been enrolled in 8 clinical trials conducted at our institution. We measured performance by the number of actual enrolled patients matched by generated queries. Results LeafAI matched a mean 43% of enrolled patients with 27 225 eligible across 8 clinical trials, compared to 27% matched and 14 587 eligible in queries by a human database programmer. The human programmer spent 26 total hours crafting queries compared to several minutes by LeafAI. Conclusions Our work contributes a state-of-the-art data model-agnostic query generation system capable of conditional reasoning using a knowledge base. We demonstrate that LeafAI can rival an experienced human programmer in finding patients eligible for clinical trials. [ABSTRACT FROM AUTHOR]
Published: 2023

42. Embedding electronic health records onto a knowledge network recognizes prodromal features of multiple sclerosis and predicts diagnosis

Author: Nelson, Charlotte A, Bove, Riley, Butte, Atul J, and Baranzini, Sergio E
Subjects: Brain Disorders, Clinical Research, Neurodegenerative, Patient Safety, Neurosciences, Multiple Sclerosis, Networking and Information Technology R&D (NITRD), 8.4 Research design and methodologies (health services), Health and social care services research, Generic health relevance, Neurological, Good Health and Well Being, Algorithms, Electronic Health Records, Humans, Machine Learning, Precision Medicine, knowledge graph, electronic health records, multiple sclerosis, preventative medicine, Information and Computing Sciences, Engineering, Medical and Health Sciences, Medical Informatics
Abstract: Objective Early identification of chronic diseases is a pillar of precision medicine as it can lead to improved outcomes, reduction of disease burden, and lower healthcare costs. Predictions of a patient's health trajectory have been improved through the application of machine learning approaches to electronic health records (EHRs). However, these methods have traditionally relied on "black box" algorithms that can process large amounts of data but are unable to incorporate domain knowledge, thus limiting their predictive and explanatory power. Here, we present a method for incorporating domain knowledge into clinical classifications by embedding individual patient data into a biomedical knowledge graph. Materials and methods A modified version of the Page rank algorithm was implemented to embed millions of deidentified EHRs into a biomedical knowledge graph (SPOKE). This resulted in high-dimensional, knowledge-guided patient health signatures (ie, SPOKEsigs) that were subsequently used as features in a random forest environment to classify patients at risk of developing a chronic disease. Results Our model predicted disease status of 5752 subjects 3 years before being diagnosed with multiple sclerosis (MS) (AUC = 0.83). SPOKEsigs outperformed predictions using EHRs alone, and the biological drivers of the classifiers provided insight into the underpinnings of prodromal MS. Conclusion Using data from EHR as input, SPOKEsigs describe patients at both the clinical and biological levels. We provide a clinical use case for detecting MS up to 5 years prior to their documented diagnosis in the clinic and illustrate the biological features that distinguish the prodromal MS state.
Published: 2022

43. Documentation and review of social determinants of health data in the EHR: measures and associated insights

Author: Wang, Michael, Pantell, Matthew S, Gottlieb, Laura M, and Adler-Milstein, Julia
Subjects: Patient Safety, Good Health and Well Being, Documentation, Electronic Health Records, Humans, Social Determinants of Health, Surveys and Questionnaires, Workflow, social informatics, SDOH, EHR, Information and Computing Sciences, Engineering, Medical and Health Sciences, Medical Informatics
Abstract: Objective Electronic Health Records (EHRs) increasingly include designated fields to capture social determinants of health (SDOH). We developed measures to characterize their use, and use of other SDOH data types, to optimize SDOH data integration. Materials and methods We developed 3 measures that accommodate different EHR data types on an encounter or patient-year basis. We implemented these measures (documented during encounter [DDE] captures documentation occurring during the encounter; documented by discharge [DBD] includes DDE plus documentation occurring any time prior to admission; and reviewed during encounter [RDE] captures whether anyone reviewed documented data) for the newly available structured SDOH fields and 4 other comparator SDOH data types (problem list, inpatient nursing question, social history free text, and social work notes) on a hospital encounter basis (with patient-year metrics in the Supplementary Appendix). Our sample included all patients (n = 27 127) with at least one hospitalization at UCSF Health (a large, urban, tertiary medical center) over a 1-year period. Results We observed substantial variation in the use of different SDOH EHR data types. Notably, social history question fields (newly added at study period start) were rarely used (DDE: 0.03% of encounters, DBD: 0.26%, RDE: 0.03%). Free-text patient social history fields had higher use (DDE: 12.1%, DBD: 49.0%, RDE: 14.4%). Discussion Our measures of real-world SDOH data use can guide current efforts to capture and leverage these data. For our institution, measures revealed substantial variation across data types, suggesting the need to engage in efforts such as EHR-user education and targeted workflow integration. Conclusion Measures revealed opportunities to optimize SDOH data documentation and review.
Published: 2021

44. Nurses' preferences for the format of care planning clinical decision support coded with standardized nursing languages.

Author: Santos, Fabiana Cristina Dos, Yao, Yingwei, Macieira, Tamara G R, Lopez, Karen Dunn, and Keenan, Gail M
Abstract: Current electronic health records (EHRs) are often ineffective in identifying patient priorities and care needs, requiring nurses to search a large volume of text to find clinically meaningful information. Our study, part of a larger randomized controlled trial testing nursing care planning clinical decision support coded in standardized nursing languages, focuses on identifying format preferences after random assignment to, and interaction with, 1 of 3 formats (text only, text+table, text+graph). Being assigned to the text+graph format significantly increased the preference for graph (P = .02) relative to other groups. Being assigned to the text only (P = .06) and text+table (P = .35) formats was not significantly associated with preference for the assigned format. Additionally, the preference for graphs was not significantly associated with understanding graph content (P = .19). Further studies are needed to enhance our understanding of how format preferences influence the use and processing of displayed information. [ABSTRACT FROM AUTHOR]
Published: 2023

45. A call to action to improve the completeness of older adult sexual and gender minority data in electronic health records.

Author: May, Jennifer T, Myers, John, Noonan, Devon, McConnell, Eleanor, and Cary, Michael P
Abstract: Sexual and gender minority (SGM) older adults experience greater health disparities compared to non-SGM older adults. The SGM older adult population is growing rapidly. Addressing this disparity and gaining a better understanding of their unique challenges in healthcare relies on accurate data collection. We conducted a secondary data analysis of 2018–2022 electronic health record data for older adults aged ≥50 years in 1 large academic health system to determine the source, magnitude, and correlates of missing sexual orientation and gender identity (SOGI) data among hospitalized older adults. Among 153 827 older adults discharged from the hospital, SOGI data missingness was 67.6% for sexual orientation and 63.0% for gender identity. SOGI data are underreported, leading to biased findings when studying health disparities. Without complete SOGI data, healthcare systems will not fully understand the unique needs of SGM individuals or be able to develop tailored interventions and programs to reduce health disparities among these populations. [ABSTRACT FROM AUTHOR]
Published: 2023

46. Predicting emergency department visits and hospitalizations for patients with heart failure in home healthcare using a time series risk model.

Author: Chae, Sena, Davoudi, Anahita, Song, Jiyoun, Evans, Lauren, Hobensack, Mollie, Bowles, Kathryn H, McDonald, Margaret V, Barrón, Yolanda, Rossetti, Sarah Collins, Cato, Kenrick, Sridharan, Sridevi, and Topaz, Maxim
Abstract: Objectives: Little is known about proactive risk assessment concerning emergency department (ED) visits and hospitalizations in patients with heart failure (HF) who receive home healthcare (HHC) services. This study developed a time series risk model for predicting ED visits and hospitalizations in patients with HF using longitudinal electronic health record data. We also explored which data sources yield the best-performing models over various time windows. Materials and Methods: We used data collected from 9362 patients from a large HHC agency. We iteratively developed risk models using both structured (eg, standard assessment tools, vital signs, visit characteristics) and unstructured data (eg, clinical notes). Seven specific sets of variables included: (1) the Outcome and Assessment Information Set, (2) vital signs, (3) visit characteristics, (4) rule-based natural language processing-derived variables, (5) term frequency-inverse document frequency variables, (6) Bio-Clinical Bidirectional Encoder Representations from Transformers variables, and (7) topic modeling. Risk models were developed for 18 time windows (1–15, 30, 45, and 60 days) before an ED visit or hospitalization. Risk prediction performances were compared using recall, precision, accuracy, F1, and area under the receiver operating characteristic curve (AUC). Results: The best-performing model was built using a combination of all 7 sets of variables and the time window of 4 days before an ED visit or hospitalization (AUC = 0.89 and F1 = 0.69). Discussion and Conclusion: This prediction model suggests that HHC clinicians can identify patients with HF at risk for an ED visit or hospitalization within 4 days before the event, allowing for earlier targeted interventions. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

47. Electronic health record data quality assessment and tools: a systematic review.

Author: Lewis, Abigail E, Weiskopf, Nicole, Abrams, Zachary B, Foraker, Randi, Lai, Albert M, Payne, Philip R O, and Gupta, Aditi
Abstract: Objective: We extended a 2013 literature review on electronic health record (EHR) data quality assessment approaches and tools to determine recent improvements or changes in EHR data quality assessment methodologies. Materials and Methods: We completed a systematic review of PubMed articles from 2013 to April 2023 that discussed the quality assessment of EHR data. We screened and reviewed papers for the dimensions and methods defined in the original 2013 manuscript. We categorized papers as data quality outcomes of interest, tools, or opinion pieces. We abstracted and defined additional themes and methods through an iterative review process. Results: We included 103 papers in the review, of which 73 were data quality outcomes of interest papers, 22 were tools, and 8 were opinion pieces. The most common dimension of data quality assessed was completeness, followed by correctness, concordance, plausibility, and currency. We abstracted conformance and bias as 2 additional dimensions of data quality and structural agreement as an additional methodology. Discussion: There has been an increase in EHR data quality assessment publications since the original 2013 review. Consistent dimensions of EHR data quality continue to be assessed across applications. Despite consistent patterns of assessment, there is still no standard approach for assessing EHR data quality. Conclusion: Guidelines are needed for EHR data quality assessment to improve the efficiency, transparency, comparability, and interoperability of data quality assessment. These guidelines must be both scalable and flexible. Automation could be helpful in generalizing this process. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

48. Privacy-protecting, reliable response data discovery using COVID-19 patient observations

Author: Kim, Jihoon, Neumann, Larissa, Paul, Paulina, Day, Michele E, Aratow, Michael, Bell, Douglas S, Doctor, Jason N, Hinske, Ludwig C, Jiang, Xiaoqian, Kim, Katherine K, Matheny, Michael E, Meeker, Daniella, Pletcher, Mark J, Schilling, Lisa M, SooHoo, Spencer, Xu, Hua, Zheng, Kai, Ohno-Machado, Lucila, Anderson, David M, Anderson, Nicholas R, Balacha, Chandrasekar, Bath, Tyler, Baxter, Sally L, Becker-Pennrich, Andrea, Bernstam, Elmer V, Carter, William A, Chau, Ngan, Choi, Yong, Covington, Steven, DuVall, Scott, El-Kareh, Robert, Florian, Renato, Follett, Robert W, Geisler, Benjamin P, Ghigi, Alessandro, Gottlieb, Assaf, Hu, Zhaoxian, Ir, Diana, Knight, Tara K, Koola, Jejo D, Kuo, Tsung-Ting, Lee, Nelson, Mansmann, Ulrich, Mou, Zongyang, Murphy, Robert E, Nguyen, Nghia H, Niedermayer, Sebastian, Park, Eunice, Perkins, Amy M, Post, Kai W, Rieder, Clemens, Scherer, Clemens, Soares, Andrey, Soysal, Ekin, Tep, Brian, Toy, Brian, Wang, Baocheng, Wu, Zhen R, Zhou, Yujia, and Zucker, Rachel A
Subjects: Information and Computing Sciences, Health Services and Systems, Health Sciences, Infectious Diseases, Emerging Infectious Diseases, Coronaviruses, Generic health relevance, Good Health and Well Being, Algorithms, COVID-19, Common Data Elements, Computer Communication Networks, Confidentiality, Electronic Health Records, Female, Humans, Information Storage and Retrieval, Logistic Models, Male, Natural Language Processing, Registries, observational study, common data elements, electronic health record, regression analysis, R2D2 Consortium, Engineering, Medical and Health Sciences, Medical Informatics, Biomedical and clinical sciences, Health sciences, Information and computing sciences
Abstract: Objective: To utilize, in an individual and institutional privacy-preserving manner, electronic health record (EHR) data from 202 hospitals by analyzing answers to COVID-19-related questions and posting these answers online. Materials and methods: We developed a distributed, federated network of 12 health systems that harmonized their EHRs and submitted aggregate answers to consortia questions posted at https://www.covid19questions.org. Our consortium developed processes and implemented distributed algorithms to produce answers to a variety of questions. We were able to generate counts, descriptive statistics, and build a multivariate, iterative regression model without centralizing individual-level data. Results: Our public website contains answers to various clinical questions, a web form for users to ask questions in natural language, and a list of items that are currently pending responses. The results show, for example, that patients who were taking angiotensin-converting enzyme inhibitors and angiotensin II receptor blockers, within the year before admission, had lower unadjusted in-hospital mortality rates. We also showed that, when adjusted for, age, sex, and ethnicity were not significantly associated with mortality. We demonstrated that it is possible to answer questions about COVID-19 using EHR data from systems that have different policies and must follow various regulations, without moving data out of their health systems. Discussion and conclusions: We present an alternative or a complement to centralized COVID-19 registries of EHR data. We can use multivariate distributed logistic regression on observations recorded in the process of care to generate results without transferring individual-level data outside the health systems.
Published: 2021

49. Validation of an internationally derived patient severity phenotype to support COVID-19 analytics from electronic health record data

Author: Klann, Jeffrey G, Estiri, Hossein, Weber, Griffin M, Moal, Bertrand, Avillach, Paul, Hong, Chuan, Tan, Amelia LM, Beaulieu-Jones, Brett K, Castro, Victor, Maulhardt, Thomas, Geva, Alon, Malovini, Alberto, South, Andrew M, Visweswaran, Shyam, Morris, Michele, Samayamuthu, Malarkodi J, Omenn, Gilbert S, Ngiam, Kee Yuan, Mandl, Kenneth D, Boeker, Martin, Olson, Karen L, Mowery, Danielle L, Follett, Robert W, Hanauer, David A, Bellazzi, Riccardo, Moore, Jason H, Loh, Ne-Hooi Will, Bell, Douglas S, Wagholikar, Kavishwar B, Chiovato, Luca, Tibollo, Valentina, Rieg, Siegbert, Li, Anthony LLJ, Jouhet, Vianney, Schriver, Emily, Xia, Zongqi, Hutch, Meghan, Luo, Yuan, Kohane, Isaac S, The Consortium for Clinical Characterization of COVID-19 by EHR (4CE), Brat, Gabriel A, and Murphy, Shawn N
Subjects: Health Services and Systems, Health Sciences, Patient Safety, HIV/AIDS, Good Health and Well Being, COVID-19, Electronic Health Records, Hospitalization, Humans, Machine Learning, Prognosis, ROC Curve, Sensitivity and Specificity, Severity of Illness Index, novel coronavirus, disease severity, computable phenotype, medical informatics, data networking, data interoperability, Consortium for Clinical Characterization of COVID-19 by EHR (4CE), Information and Computing Sciences, Engineering, Medical and Health Sciences, Medical Informatics, Biomedical and clinical sciences, Health sciences, Information and computing sciences
Abstract: Objective: The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing coronavirus disease 2019 (COVID-19) with federated analyses of electronic health record (EHR) data. We sought to develop and validate a computable phenotype for COVID-19 severity. Materials and methods: Twelve 4CE sites participated. First, we developed an EHR-based severity phenotype consisting of 6 code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of intensive care unit (ICU) admission and/or death. We also piloted an alternative machine learning approach and compared selected predictors of severity with the 4CE phenotype at 1 site. Results: The full 4CE severity phenotype had pooled sensitivity of 0.73 and specificity of 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity had high variability, up to 0.65 across sites. At one pilot site, the expert-derived phenotype had mean area under the curve of 0.903 (95% confidence interval, 0.886-0.921), compared with an area under the curve of 0.956 (95% confidence interval, 0.952-0.959) for the machine learning approach. Billing codes were poor proxies of ICU admission, with as low as 49% precision and recall compared with chart review. Discussion: We developed a severity phenotype using 6 code classes that proved resilient to coding variability across international institutions. In contrast, machine learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly owing to heterogeneous pandemic conditions. Conclusions: We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.
Published: 2021

50. An interview study with medical scribes on how their work may alleviate clinician burnout through delegated health IT tasks.

Author: Tran, Brian D, Rosenbaum, Kathryn, and Zheng, Kai
Subjects: Information and Computing Sciences, Health Services and Systems, Health Sciences, Clinical Research, Generic health relevance, Burnout, Professional, Documentation, Electronic Health Records, Humans, Interviews as Topic, Medical Record Administrators, Qualitative Research, medical scribe, health information technology, professional burnout [C24.580.500], workflow [L01.906.893], documentation [L01.453.245], electronic health records [E05.318.308.940.968.625.500], Engineering, Medical and Health Sciences, Medical Informatics, Biomedical and clinical sciences, Health sciences, Information and computing sciences
Abstract: Objectives: To understand how medical scribes' work may contribute to alleviating clinician burnout attributable directly or indirectly to the use of health IT. Materials and methods: Qualitative analysis of semistructured interviews with 32 participants who had scribing experience in a variety of clinical settings. Results: We identified 7 categories of clinical tasks that clinicians commonly choose to offload to medical scribes, many of which involve delegated use of health IT. These range from note-taking and computerized data entry to foraging, assembling, and tracking information scattered across multiple clinical information systems. Some common characteristics shared among these tasks include: (1) time-consuming to perform; (2) difficult to remember or keep track of; (3) disruptive to clinical workflow, clinicians' cognitive processes, or patient-provider interactions; (4) perceived to be low-skill "clerical" work; and (5) deemed as adding no value to direct patient care. Discussion: The fact that clinicians opt to "outsource" certain clinical tasks to medical scribes is a strong indication that performing these tasks is not perceived to be the best use of their time. Given that a vast majority of healthcare practices in the US do not have the luxury of affording medical scribes, the burden would inevitably fall onto clinicians' shoulders, which could be a major source for clinician burnout. Conclusions: Medical scribes help to offload a substantial amount of burden from clinicians, particularly with tasks that involve onerous interactions with health IT. Developing a better understanding of medical scribes' work provides useful insights into the sources of clinician burnout and potential solutions to it.
Published: 2021
