557 results on '"Visweswaran, Shyam"'
Search Results
202. The application of network label propagation to rank biomarkers in genome-wide Alzheimer’s data
- Author
-
Stokes, Matthew E, primary, Barmada, M, additional, Kamboh, M, additional, and Visweswaran, Shyam, additional
- Published
- 2014
- Full Text
- View/download PDF
203. Semi-automated literature mining to identify putative biomarkers of disease from multiple biofluids
- Author
-
Jordan, Rick, primary, Visweswaran, Shyam, additional, and Gopalakrishnan, Vanathi, additional
- Published
- 2014
- Full Text
- View/download PDF
204. Informative Bayesian Model Selection: a method for identifying interactions in genome-wide data
- Author
-
Aflakparast, Mehran, primary, Masoudi-Nejad, Ali, additional, Bozorgmehr, Joseph H, additional, and Visweswaran, Shyam, additional
- Published
- 2014
- Full Text
- View/download PDF
205. Identifying Genetic Interactions Associated with Late-Onset Alzheimer’s Disease
- Author
-
Floudas, Charalampos S, primary, Um, Nara, additional, Kamboh, M. Ilyas, additional, Barmada, Michael M, additional, and Visweswaran, Shyam, additional
- Published
- 2013
- Full Text
- View/download PDF
206. Deep Multiple Kernel Learning
- Author
-
Strobl, Eric V., primary and Visweswaran, Shyam, additional
- Published
- 2013
- Full Text
- View/download PDF
207. Assessing the Quality of Prescribing and Monitoring Erythropoiesis-Stimulating Agents in the Nursing Home Setting
- Author
-
Wong, An-Kwok I., Stephens, Scott B., Aspinall, Monica B., Visweswaran, Shyam, Hanlon, Joseph T., and Handler, Steven M.
- Published
- 2009
- Full Text
- View/download PDF
208. Learning Patient-Specific Models From Clinical Data
- Author
-
Visweswaran, Shyam
- Abstract
A key purpose of building a model from clinical data is to predict the outcomes of future individual patients. This work introduces a Bayesian patient-specific predictive framework for constructing predictive models from data that are optimized to predict well for a particular patient case. The construction of such patient-specific models is influenced by the particular history, symptoms, laboratory results, and other features of the patient case at hand. This approach is in contrast to the commonly used population-wide models that are constructed to perform well on average on all future cases. The new patient-specific method described in this research uses Bayesian network models, carries out Bayesian model averaging over a set of models to predict the outcome of interest for the patient case at hand, and employs a patient-specific heuristic to locate a set of suitable models to average over. Two versions of the method are developed that differ in the representation used for the conditional probability distributions in the Bayesian networks. One version uses a representation that captures only the so-called global structure among the variables of a Bayesian network and the second representation captures additional local structure among the variables. The patient-specific methods were experimentally evaluated on one synthetic dataset, 21 UCI datasets and three medical datasets. Their performance was measured using five different performance measures and compared to that of several commonly used methods for constructing predictive models including naïve Bayes, C4.5 decision tree, logistic regression, neural networks, k-Nearest Neighbor and Lazy Bayesian Rules. Over all the datasets, both patient-specific methods performed better on average on all performance measures and against all the comparison algorithms. The global structure method that performs Bayesian model averaging in conjunction with the patient-specific search heuristic h
- Published
- 2008
209. An Algorithm for Network-Based Gene Prioritization That Encodes Knowledge Both in Nodes and in Links
- Author
-
Kimmel, Chad, primary and Visweswaran, Shyam, additional
- Published
- 2013
- Full Text
- View/download PDF
210. Mo1342 A Concept Recognition Tool to Identify the Surgical Complications of Crohn's Disease in Electronic Health Records
- Author
-
Visweswaran, Shyam, primary, Saul, Melissa I., additional, Espino, Jeremy U., additional, Levander, John, additional, Swoger, Jason M., additional, Regueiro, Miguel, additional, and Dunn, Michael A., additional
- Published
- 2013
- Full Text
- View/download PDF
211. Sa1379 Electronic Health Record (EHR) Information Is Useful to Predict Clinically Relevant Outcomes in Acute Pancreatitis (AP)
- Author
-
Yadav, Dhiraj, primary, Saul, Melissa I., additional, Papachristou, Georgios I., additional, Whitcomb, David C., additional, Visweswaran, Shyam, additional, and Dunn, Michael A., additional
- Published
- 2013
- Full Text
- View/download PDF
212. Detection of Patients with Influenza Syndrome Using Machine-Learning Models Learned from Emergency Department Reports
- Author
-
Lopez Pineda, Arturo, primary, Tsui, Fu-Chiang, additional, Visweswaran, Shyam, additional, and Cooper, Gregory F., additional
- Published
- 2013
- Full Text
- View/download PDF
213. Noninvasive Predictors of Subdural Grid Seizure Localization in Children With Nonlesional Focal Epilepsy
- Author
-
Kalamangalam, Giridhar P., primary, Pestana Knight, Elia M., additional, Visweswaran, Shyam, additional, and Gupta, Ajay, additional
- Published
- 2013
- Full Text
- View/download PDF
214. A multivariate probabilistic method for comparing two clinical datasets
- Author
-
Sverchkov, Yuriy, primary, Visweswaran, Shyam, additional, Clermont, Gilles, additional, Hauskrecht, Milos, additional, and Cooper, Gregory F., additional
- Published
- 2012
- Full Text
- View/download PDF
215. Application of an efficient Bayesian discretization method to biomedical data
- Author
-
Lustgarten, Jonathan L, primary, Visweswaran, Shyam, additional, Gopalakrishnan, Vanathi, additional, and Cooper, Gregory F, additional
- Published
- 2011
- Full Text
- View/download PDF
216. Learning genetic epistasis using Bayesian network scoring criteria
- Author
-
Jiang, Xia, primary, Neapolitan, Richard E, additional, Barmada, M Michael, additional, and Visweswaran, Shyam, additional
- Published
- 2011
- Full Text
- View/download PDF
217. Identifying genetic interactions in genome‐wide data using Bayesian networks
- Author
-
Jiang, Xia, primary, Barmada, M. Michael, additional, and Visweswaran, Shyam, additional
- Published
- 2010
- Full Text
- View/download PDF
218. Gene prioritization using a probabilistic knowledge model
- Author
-
Wang, Shuguang, primary, Hauskrecht, Milos, additional, and Visweswaran, Shyam, additional
- Published
- 2009
- Full Text
- View/download PDF
219. Noninvasive Correlates of Subdural Grid Electrographic Outcome
- Author
-
Kalamangalam, Giridhar P., primary, Morris, Harold H., additional, Mani, Jayanthi, additional, Lachhwani, Deepak K., additional, Visweswaran, Shyam, additional, and Bingaman, William M., additional
- Published
- 2009
- Full Text
- View/download PDF
220. Knowledge-based variable selection for learning rules from proteomic data
- Author
-
Lustgarten, Jonathan L, primary, Visweswaran, Shyam, additional, Bowser, Robert P, additional, Hogan, William R, additional, and Gopalakrishnan, Vanathi, additional
- Published
- 2009
- Full Text
- View/download PDF
221. Bayesian Combinatorial Partitioning For Detecting Interactions Among Genetic Variants
- Author
-
Visweswaran, Shyam and Wong, An-Kwok Ian
- Subjects
Articles
- Abstract
Detecting epistatic (nonlinear) interactions among single nucleotide polymorphisms (SNPs) at multiple loci is important in the analysis of genomic data in association studies. We developed a Bayesian combinatorial partitioning (BCP) method for detecting such interactions among SNPs that are predictive of disease. When compared with multifactor dimensionality reduction (MDR), a widely used combinatorial partitioning method for detecting interactions, BCP has significantly greater power and is computationally more efficient.
- Published
- 2009
222. Detection of Very High–Level Penicillin-Resistant Variants of the Tennessee 23F-4 Clone Via Single and Serial Transformations with Four Serotype 19A International Pneumococcal Clones
- Author
-
McEllistrem, M. Catherine, primary, Adams, Jennifer M., additional, Visweswaran, Shyam, additional, and Khan, Saleem A., additional
- Published
- 2005
- Full Text
- View/download PDF
223. A sequential text search algorithm is superior to administratively coded data in estimating wound dehiscence as a surgical patient safety indicator
- Author
-
Marderstein, Eric L., primary, Saul, Melissa, additional, Hanbury, Paul, additional, Visweswaran, Shyam, additional, Cooper, Gregory, additional, and Simmons, Richard, additional
- Published
- 2004
- Full Text
- View/download PDF
224. Serotype 14 Variants of the France 9V-3 Clone from Baltimore, Maryland, Can Be Differentiated by the cpsB Gene
- Author
-
McEllistrem, M. Catherine, primary, Noller, Anna C., additional, Visweswaran, Shyam, additional, Adams, Jennifer M., additional, and Harrison, Lee H., additional
- Published
- 2004
- Full Text
- View/download PDF
225. Mining Epistatic Interactions from High-Dimensional Data Sets.
- Author
-
Jiang, Xia, Visweswaran, Shyam, and Neapolitan, Richard E.
- Published
- 2012
- Full Text
- View/download PDF
226. Distinguishing Admissions Specifically for COVID-19 from Incidental SARS-CoV-2 Admissions: A National Retrospective EHR Study.
- Author
-
Klann, Jeffrey G, Strasser, Zachary H, Hutch, Meghan R, Kennedy, Chris J, Marwaha, Jayson S, Morris, Michele, Samayamuthu, Malarkodi Jebathilagam, Pfaff, Ashley C, Estiri, Hossein, South, Andrew M, Weber, Griffin M, Yuan, William, Avillach, Paul, Wagholikar, Kavishwar B, Luo, Yuan, The Consortium for Clinical Characterization of COVID-19 by EHR (4CE), Omenn, Gilbert S, Visweswaran, Shyam, Holmes, John H, and Xia, Zongqi
- Abstract
Background: Admissions are generally classified as COVID-19 hospitalizations if the patient has a positive SARS-CoV-2 polymerase chain reaction (PCR) test. However, because 35% of SARS-CoV-2 infections are asymptomatic, patients admitted for unrelated indications with an incidentally positive test could be misclassified as a COVID-19 hospitalization. EHR-based studies have been unable to distinguish between a hospitalization specifically for COVID-19 versus an incidental SARS-CoV-2 hospitalization. Although the need to improve classification of COVID-19 disease vs. incidental SARS-CoV-2 is well understood, the magnitude of the problem has only been characterized in small, single-center studies. Furthermore, there have been no peer-reviewed studies evaluating methods for improving classification. Objective: The aims of this study were to: first, quantify the frequency of incidental hospitalizations over the first fifteen months of the pandemic in multiple hospital systems in the United States; and second, to apply electronic phenotyping techniques to automatically improve COVID-19 hospitalization classification. Methods: From a retrospective EHR-based cohort in four US healthcare systems in Massachusetts, Pennsylvania, and Illinois, a random sample of 1,123 SARS-CoV-2 PCR-positive patients hospitalized between 3/2020-8/2021 was manually chart-reviewed and classified as admitted-with-COVID-19 (incidental) vs. specifically admitted for COVID-19 (for-COVID-19). EHR-based phenotyping was used to find feature sets to filter out incidental admissions. Results: EHR-based phenotyped feature sets filtered out incidental admissions, which occurred in an average of 26% of hospitalizations (although this varied widely over time, from 0%-75%). The top site-specific feature sets had 79-99% specificity with 62-75% sensitivity, while the best performing across-site feature sets had 71-94% specificity with 69-81% sensitivity. Conclusions: A large proportion of SARS-CoV-2 PCR-positive admissions were incidental. Straightforward EHR-based phenotypes differentiated admissions, which is important to assure accurate public health reporting and research. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
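The specificity and sensitivity figures quoted in the abstract of record 226 follow the standard confusion-matrix definitions. As a minimal sketch (not the study's code), the function below computes both from chart-review labels and phenotype predictions; the function name and the toy labels are hypothetical, with `True` standing for a for-COVID-19 admission and `False` for an incidental one.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    truth: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for paired boolean labels.

    truth:     chart-review label (True = admitted for COVID-19).
    predicted: phenotype output   (True = flagged as for COVID-19).
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    pairs = list(zip(truth, predicted))
    tp = sum(t and p for t, p in pairs)              # true positives
    fn = sum(t and not p for t, p in pairs)          # false negatives
    tn = sum((not t) and (not p) for t, p in pairs)  # true negatives
    fp = sum((not t) and p for t, p in pairs)        # false positives

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")

    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 4 for-COVID-19 and 4 incidental admissions.
toy_truth = [True, True, True, True, False, False, False, False]
toy_pred = [True, True, True, False, False, False, False, True]
sens, spec = sensitivity_specificity(toy_truth, toy_pred)
```

On this toy data one for-COVID-19 admission is missed and one incidental admission is wrongly flagged, so both measures come out at 3 of 4.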
227. Outlier detection for patient monitoring and alerting.
- Author
-
Hauskrecht, Milos, Batal, Iyad, Valko, Michal, Visweswaran, Shyam, Cooper, Gregory F., and Clermont, Gilles
- Abstract
We develop and evaluate a data-driven approach for detecting unusual (anomalous) patient-management decisions using past patient cases stored in electronic health records (EHRs). Our hypothesis is that a patient-management decision that is unusual with respect to past patient care may be due to an error and that it is worthwhile to generate an alert if such a decision is encountered. We evaluate this hypothesis using data obtained from EHRs of 4486 post-cardiac surgical patients and a subset of 222 alerts generated from the data. We base the evaluation on the opinions of a panel of experts. The results of the study support our hypothesis that the outlier-based alerting can lead to promising true alert rates. We observed true alert rates that ranged from 25% to 66% for a variety of patient-management actions, with 66% corresponding to the strongest outliers. [Copyright Elsevier]
- Published
- 2013
- Full Text
- View/download PDF
228. Building an automated SOAP classifier for emergency department reports.
- Author
-
Mowery, Danielle, Wiebe, Janyce, Visweswaran, Shyam, Harkema, Henk, and Chapman, Wendy W.
- Abstract
Information extraction applications that extract structured event and entity information from unstructured text can leverage knowledge of clinical report structure to improve performance. The Subjective, Objective, Assessment, Plan (SOAP) framework, used to structure progress notes to facilitate problem-specific, clinical decision making by physicians, is one example of a well-known, canonical structure in the medical domain. Although its applicability to structuring data is understood, its contribution to information extraction tasks has not yet been determined. The first step to evaluating the SOAP framework’s usefulness for clinical information extraction is to apply the model to clinical narratives and develop an automated SOAP classifier that classifies sentences from clinical reports. In this quantitative study, we applied the SOAP framework to sentences from emergency department reports, and trained and evaluated SOAP classifiers built with various linguistic features. We found the SOAP framework can be applied manually to emergency department reports with high agreement (Cohen’s kappa coefficients over 0.70). Using a variety of features, we found classifiers for each SOAP class can be created with moderate to outstanding performance with F1 scores of 93.9 (subjective), 94.5 (objective), 75.7 (assessment), and 77.0 (plan). We look forward to expanding the framework and applying the SOAP classification to clinical information extraction tasks. [Copyright Elsevier]
- Published
- 2012
- Full Text
- View/download PDF
229. Computerized detection of adverse drug reactions in the medical intensive care unit
- Author
-
Kane-Gill, Sandra L., Visweswaran, Shyam, Saul, Melissa I., Wong, An-Kwok Ian, Penrod, Louis E., and Handler, Steven M.
- Subjects
*PHARMACODYNAMICS, *DRUG side effects, *INTENSIVE care units, *BLOOD testing, *PREDICTION models, *VANCOMYCIN, *PUBLIC health surveillance
- Abstract
Objective: Clinical event monitors are a type of active medication monitoring system that can use signals to alert clinicians to possible adverse drug reactions. The primary goal was to evaluate the positive predictive values of select signals used to automate the detection of ADRs in the medical intensive care unit. Method: This is a prospective case series of adult patients in the medical intensive care unit during a six-week period who had one of five signals present: an elevated blood urea nitrogen, vancomycin, or quinidine concentration, or a low sodium or glucose concentration. Alerts were assessed using 3 objective published adverse drug reaction determination instruments. An event was considered an adverse drug reaction when 2 out of 3 instruments had agreement of possible, probable or definite. Positive predictive values were calculated as the proportion of alerts that occurred, divided by the number of times that alerts occurred and adverse drug reactions were confirmed. Results: 145 patients were eligible for evaluation. For the 48 patients (50% male) having an alert, the mean±SD age was 62±19 years. A total of 253 alerts were generated. Positive predictive values were 1.0, 0.55, 0.38 and 0.33 for vancomycin, glucose, sodium, and blood urea nitrogen, respectively. A quinidine alert was not generated during the evaluation. Conclusions: Computerized clinical event monitoring systems should be considered when developing methods to detect adverse drug reactions as part of intensive care unit patient safety surveillance systems, since they can automate the detection of these events using signals that have good performance characteristics by processing commonly available laboratory and medication information. [Copyright Elsevier]
- Published
- 2011
- Full Text
- View/download PDF
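Record 229 reports a positive predictive value for each alert signal. As a hedged illustration only, the sketch below uses the conventional definition (alerts confirmed as adverse drug reactions divided by all alerts for that signal). The function name and the counts are hypothetical and are not taken from the study, which does not list per-signal counts in its abstract.

```python
def positive_predictive_value(confirmed_alerts: int, total_alerts: int) -> float:
    """Conventional PPV: confirmed true alerts divided by all alerts fired."""
    if total_alerts <= 0:
        raise ValueError("total_alerts must be positive")
    if not 0 <= confirmed_alerts <= total_alerts:
        raise ValueError("confirmed_alerts must lie between 0 and total_alerts")
    return confirmed_alerts / total_alerts


# Hypothetical counts for one signal: 11 of 20 alerts confirmed as ADRs.
example_ppv = positive_predictive_value(11, 20)
```

With these made-up counts the signal would have a PPV of 0.55; a signal that never fires, like quinidine in the abstract, has no defined PPV, which is why the function rejects a zero denominator.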
230. Learning genetic epistasis using Bayesian network scoring criteria.
- Author
-
Xia Jiang, Neapolitan, Richard E., Barmada, M. Michael, and Visweswaran, Shyam
- Subjects
EPISTASIS (Genetics), GENE expression, DATA mining, MACHINE learning, LEARNING, ALZHEIMER'S disease
- Abstract
Background: Gene-gene epistatic interactions likely play an important role in the genetic basis of many common diseases. Recently, machine-learning and data mining methods have been developed for learning epistatic relationships from data. A well-known combinatorial method that has been successfully applied for detecting epistasis is Multifactor Dimensionality Reduction (MDR). Jiang et al. created a combinatorial epistasis learning method called BNMBL to learn Bayesian network (BN) epistatic models. They compared BNMBL to MDR using simulated data sets. Each of these data sets was generated from a model that associates two SNPs with a disease and includes 18 unrelated SNPs. For each data set, BNMBL and MDR were used to score all 2-SNP models, and BNMBL learned significantly more correct models. In real data sets, we ordinarily do not know the number of SNPs that influence phenotype. BNMBL may not perform as well if we also scored models containing more than two SNPs. Furthermore, a number of other BN scoring criteria have been developed. They may detect epistatic interactions even better than BNMBL. Although BNs are a promising tool for learning epistatic relationships from data, we cannot confidently use them in this domain until we determine which scoring criteria work best or even well when we try learning the correct model without knowledge of the number of SNPs in that model. Results: We evaluated the performance of 22 BN scoring criteria using 28,000 simulated data sets and a real Alzheimer's GWAS data set. Our results were surprising in that the Bayesian scoring criterion with large values of a hyperparameter called α performed best. This score performed better than other BN scoring criteria and MDR at recall using simulated data sets, at detecting the hardest-to-detect models using simulated data sets, and at substantiating previous results using the real Alzheimer's data set. 
Conclusions: We conclude that representing epistatic interactions using BN models and scoring them using a BN scoring criterion holds promise for identifying epistatic genetic variants in data. In particular, the Bayesian scoring criterion with large values of a hyperparameter α appears more promising than a number of alternatives. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
231. Learning Instance-Specific Predictive Models.
- Author
-
Visweswaran, Shyam, Cooper, Gregory F., and Chickering, Max
- Subjects
*MACHINE learning, *PREDICTION models, *BAYESIAN analysis, *ALGORITHMS, *DATA analysis, *COMPUTER networks, *REGRESSION analysis
- Abstract
This paper introduces a Bayesian algorithm for constructing predictive models from data that are optimized to predict a target variable well for a particular instance. This algorithm learns Markov blanket models, carries out Bayesian model averaging over a set of models to predict a target variable of the instance at hand, and employs an instance-specific heuristic to locate a set of suitable models to average over. We call this method the instance-specific Markov blanket (ISMB) algorithm. The ISMB algorithm was evaluated on 21 UCI data sets using five different performance measures and its performance was compared to that of several commonly used predictive algorithms, including naïve Bayes, C4.5 decision tree, logistic regression, neural networks, k-Nearest Neighbor, Lazy Bayesian Rules, and AdaBoost. Over all the data sets, the ISMB algorithm performed better on average on all performance measures against all the comparison algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2010
232. Learning patient-specific predictive models from clinical data.
- Author
-
Visweswaran, Shyam, Angus, Derek C., Hsieh, Margaret, Weissfeld, Lisa, Yealy, Donald, and Cooper, Gregory F.
- Abstract
We introduce an algorithm for learning patient-specific models from clinical data to predict outcomes. Patient-specific models are influenced by the particular history, symptoms, laboratory results, and other features of the patient case at hand, in contrast to the commonly used population-wide models that are constructed to perform well on average on all future cases. The patient-specific algorithm uses Markov blanket (MB) models, carries out Bayesian model averaging over a set of models to predict the outcome for the patient case at hand, and employs a patient-specific heuristic to locate a set of suitable models to average over. We evaluate the utility of using a local structure representation for the conditional probability distributions in the MB models that captures additional independence relations among the variables compared to the typically used representation that captures only the global structure among the variables. In addition, we compare the performance of Bayesian model averaging to that of model selection. The patient-specific algorithm and its variants were evaluated on two clinical datasets for two outcomes. Our results provide support that the performance of an algorithm for learning patient-specific models can be improved by using a local structure representation for MB models and by performing Bayesian model averaging. [Copyright Elsevier]
- Published
- 2010
- Full Text
- View/download PDF
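Records 231 and 232 both rely on Bayesian model averaging, in which each candidate model's prediction is weighted by that model's posterior probability. The sketch below shows only this generic averaging step, assuming each model comes with an unnormalized log posterior score. It does not reproduce the papers' Markov blanket models or their patient-specific search heuristic, and the function name and numbers are hypothetical.

```python
import math
from typing import Sequence


def bayesian_model_average(
    log_scores: Sequence[float], predictions: Sequence[float]
) -> float:
    """Average per-model predicted probabilities, weighted by model posterior.

    log_scores:  unnormalized log posterior score of each model.
    predictions: each model's predicted probability of the outcome
                 for the case at hand.
    """
    if not log_scores or len(log_scores) != len(predictions):
        raise ValueError("need one prediction per model score")

    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    weights = [math.exp(score - top) for score in log_scores]
    total = sum(weights)

    return sum(w / total * p for w, p in zip(weights, predictions))


# Hypothetical example: the second model is three times as probable as the first.
averaged = bayesian_model_average([0.0, math.log(3.0)], [0.2, 0.6])
```

Here the normalized weights are 0.25 and 0.75, so the averaged prediction is 0.25 × 0.2 + 0.75 × 0.6 = 0.5. Model selection, the alternative compared in record 232, would instead return the single highest-scoring model's prediction of 0.6.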
233. The All of Us Research Program: Data quality, utility, and diversity
- Author
-
Ramirez, Andrea H., Sulieman, Lina, Schlueter, David J., Halvorson, Alese, Qian, Jun, Ratsimbazafy, Francis, Loperena, Roxana, Mayo, Kelsey, Basford, Melissa, Deflaux, Nicole, Muthuraman, Karthik N., Natarajan, Karthik, Kho, Abel, Xu, Hua, Wilkins, Consuelo, Anton-Culver, Hoda, Boerwinkle, Eric, Cicek, Mine, Clark, Cheryl R., Cohn, Elizabeth, Ohno-Machado, Lucila, Schully, Sheri D., Ahmedani, Brian K., Argos, Maria, Cronin, Robert M., O’Donnell, Christopher, Fouad, Mona, Goldstein, David B., Greenland, Philip, Hebbring, Scott J., Karlson, Elizabeth W., Khatri, Parinda, Korf, Bruce, Smoller, Jordan W., Sodeke, Stephen, Wilbanks, John, Hentges, Justin, Mockrin, Stephen, Lunt, Christopher, Devaney, Stephanie A., Gebo, Kelly, Denny, Joshua C., Carroll, Robert J., Glazer, David, Harris, Paul A., Hripcsak, George, Philippakis, Anthony, Roden, Dan M., Ahmedani, Brian, Cole Johnson, Christine D., Ahsan, Habib, Antoine-LaVigne, Donna, Singleton, Glendora, Anton-Culver, Hoda, Topol, Eric, Baca-Motes, Katie, Steinhubl, Steven, Wade, James, Begale, Mark, Jain, Praduman, Sutherland, Scott, Lewis, Beth, Korf, Bruce, Behringer, Melissa, Gharavi, Ali G., Goldstein, David B., Hripcsak, George, Bier, Louise, Boerwinkle, Eric, Brilliant, Murray H., Murali, Narayana, Hebbring, Scott Joseph, Farrar-Edwards, Dorothy, Burnside, Elizabeth, Drezner, Marc K., Taylor, Amy, Channamsetty, Veena, Montalvo, Wanda, Sharma, Yashoda, Chinea, Carmen, Jenks, Nancy, Cicek, Mine, Thibodeau, Steve, Holmes, Beverly Wilson, Schlueter, Eric, Collier, Ever, Winkler, Joyce, Corcoran, John, D’Addezio, Nick, Daviglus, Martha, Winn, Robert, Wilkins, Consuelo, Roden, Dan, Denny, Joshua, Doheny, Kim, Nickerson, Debbie, Eichler, Evan, Jarvik, Gail, Funk, Gretchen, Philippakis, Anthony, Rehm, Heidi, Lennon, Niall, Kathiresan, Sekar, Gabriel, Stacey, Gibbs, Richard, Gil Rico, Edgar M., Glazer, David, Grand, Joannie, Greenland, Philip, Harris, Paul, Shenkman, Elizabeth, Hogan, William R., Igho-Pemu, Priscilla, 
Pollan, Cliff, Jorge, Milena, Okun, Sally, Karlson, Elizabeth W., Smoller, Jordan, Murphy, Shawn N., Ross, Margaret Elizabeth, Kaushal, Rainu, Winford, Eboni, Wallace, Febe, Khatri, Parinda, Kheterpal, Vik, Ojo, Akinlolu, Moreno, Francisco A., Kron, Irving, Peterson, Rachele, Menon, Usha, Lattimore, Patricia Watkins, Leviner, Noga, Obedin-Maliver, Juno, Lunn, Mitchell, Malik-Gagnon, Lynda, Mangravite, Lara, Marallo, Adria, Marroquin, Oscar, Visweswaran, Shyam, Reis, Steven, Marshall, Gailen, McGovern, Patrick, Mignucci, Deb, Moore, John, Munoz, Fatima, Talavera, Gregory, O'Connor, George T., O'Donnell, Christopher, Ohno-Machado, Lucila, Orr, Greg, Randal, Fornessa, Theodorou, Andreas A., Reiman, Eric, Roxas-Murray, Mercedita, Stark, Louisa, Tepp, Ronnie, Zhou, Alicia, Topper, Scott, Trousdale, Rhonda, Tsao, Phil, Weidman, Lisa, Weiss, Scott T., Wellis, David, Whittle, Jeffrey, Wilson, Amanda, Zuchner, Stephan, and Zwick, Michael E.
- Abstract
The All of Us Research Program seeks to engage at least one million diverse participants to advance precision medicine and improve human health. We describe here the cloud-based Researcher Workbench that uses a data passport model to democratize access to analytical tools and participant information including survey, physical measurement, and electronic health record (EHR) data. We also present validation study findings for several common complex diseases to demonstrate use of this novel platform in 315,000 participants, 78% of whom are from groups historically underrepresented in biomedical research, including 49% self-reporting non-White races. Replication findings include medication usage pattern differences by race in depression and type 2 diabetes, validation of known cancer associations with smoking, and calculation of cardiovascular risk scores by reported race effects. The cloud-based Researcher Workbench represents an important advance in enabling secure access for a broad range of researchers to this large resource and analytical tools.
- Published
- 2022
- Full Text
- View/download PDF
234. Detection of Very High–Level Penicillin-Resistant Variants of the Tennessee 23F-4 Clone Via Single and Serial Transformations with Four Serotype 19A International Pneumococcal Clones
- Author
-
McEllistrem, M. Catherine, Adams, Jennifer M., Visweswaran, Shyam, and Khan, Saleem A.
- Abstract
In the United States, penicillin-resistant variants of the Tennessee (Tenn) 23F-4 clone account for a substantial proportion of the very-high-level penicillin-resistant (MIC 8 µg/ml) infections in the 7-valent pneumococcal protein conjugate vaccine (PCV7) era. Serotype 19A strains account for an increasing proportion of penicillin-nonsusceptible Streptococcus pneumoniae infections. Sequential transformations of the Tenn 23F-4 clone (penicillin MIC 0.1 µg/ml) were performed with four penicillin-nonsusceptible serotype 19A international clones (penicillin MIC): S. Africa 19A-7 (0.5 µg/ml), Hungary 19A-6 (2 µg/ml), Slovakia 19A-11 (8 µg/ml), and South Africa 19A-13 (8 µg/ml). Fifty-two transformants were characterized by MICs, serogroup-specific PCR, pbp PCR restriction profile and sequence, pspA PCR restriction profile, and erm/mef PCR. A subset was analyzed with multilocus sequence typing (MLST) and pulsed-field gel electrophoresis. Serotype 23F transformants with penicillin MIC ≥ 8 µg/ml were detected through a single transformation with the Hungary 19A-6 clone or serial transformations using two to three different clones. Forty-four percent (14/32) of the transformants incorporated ≥1 new MLST allele. Using encapsulated donors, very-high-level penicillin-resistant variants of the Tenn 23F-4 clone were detected. In addition to detecting stepwise increases in penicillin MIC, a 12-fold increase in penicillin MIC was achieved through a single transformation. This large increase in MIC may explain why this clone is commonly associated with very-high-level resistance in natural populations. Recombination within the MLST housekeeping genes was commonly detected in the transformants that had acquired penicillin resistance.
- Published
- 2005
- Full Text
- View/download PDF
235. Serotype 14 Variants of the France 9V-3 Clone from Baltimore, Maryland, Can Be Differentiated by the cpsB Gene
- Author
-
McEllistrem, M. Catherine, Noller, Anna C., Visweswaran, Shyam, Adams, Jennifer M., and Harrison, Lee H.
- Abstract
European serotype 14 variants of the France 9V-3 clone, which have arisen through recombination events involving the penicillin binding protein 1a (pbp1a) gene, have cpsB sequences distinct from those of the 9V-3 clone. Serotype 14 variants of the 9V-3 clone have not been compared to genetically diverse serotype 14 strains isolated from an entire metropolitan area in the United States. All serotype 14 non-penicillin-susceptible Streptococcus pneumoniae strains causing invasive disease in Baltimore, Md., from 1995 to 1996 were compared by using pulsed-field gel electrophoresis (PFGE), multilocus sequence typing (MLST), pbp1a PCR restriction profiles, and cpsB and pbp1a sequences. The cpsB genes from strains of 13 serotypes also were analyzed to assess the correlation with serotype. Twenty-seven percent (3 of 11) of the serotype 14 strains were related by PFGE and MLST to the 9V-3 clone. The serotype 14 variants from Baltimore, unlike the European variants, were related neither to the 9V-3 clone nor to the R6 strain from positions 1498 to 1710 of the pbp1a gene. All serotype 14 strains had cpsB sequences that differed by ≤1% (0 to 5 of 476 bp) from each other and that were ≥16% (78 to 83 of 476 bp) divergent from that of the 9V-3 clone. Allowing for a 2-bp difference in the cpsB sequence resulted in the highest correlation between the cpsB gene and serotype. Overall, 95% (84 of 88) of the strains were classified correctly by serotype with the cpsB sequence. The distal recombination site of the Baltimore serotype 14 variants of the 9V-3 clone was not identical to that of the European serotype 14 variants. The cpsB gene was serotype specific regardless of whether capsular switching occurred. Although the correlation between serotype and the cpsB sequence was high, the overall diversity of the cpsB gene within a serotype likely will limit the role of this gene in a sequence-based serotyping method.
- Published
- 2004
- Full Text
- View/download PDF
236. International Analysis of Electronic Health Records of Children and Youth Hospitalized With COVID-19 Infection in 6 Countries
- Author
-
Bourgeois, Florence T., Gutiérrez-Sacristán, Alba, Keller, Mark S., Liu, Molei, Hong, Chuan, Bonzel, Clara-Lea, Tan, Amelia L. M., Aronow, Bruce J., Boeker, Martin, Booth, John, Cruz Rojo, Jaime, Devkota, Batsal, García Barrio, Noelia, Gehlenborg, Nils, Geva, Alon, Hanauer, David A., Hutch, Meghan R., Issitt, Richard W., Klann, Jeffrey G., Luo, Yuan, Mandl, Kenneth D., Mao, Chengsheng, Moal, Bertrand, Moshal, Karyn L., Murphy, Shawn N., Neuraz, Antoine, Ngiam, Kee Yuan, Omenn, Gilbert S, Patel, Lav P., Jiménez, Miguel Pedrera, Sebire, Neil J., Balazote, Pablo Serrano, Serret-Larmande, Arnaud, South, Andrew M., Spiridou, Anastasia, Taylor, Deanne M., Tippmann, Patric, Visweswaran, Shyam, Weber, Griffin M., Kohane, Isaac S., Cai, Tianxi, and Avillach, Paul
- Abstract
IMPORTANCE: Additional sources of pediatric epidemiological and clinical data are needed to efficiently study COVID-19 in children and youth and inform infection prevention and clinical treatment of pediatric patients. OBJECTIVE: To describe international hospitalization trends and key epidemiological and clinical features of children and youth with COVID-19. DESIGN, SETTING, AND PARTICIPANTS: This retrospective cohort study included pediatric patients hospitalized between February 2 and October 10, 2020. Patient-level electronic health record (EHR) data were collected across 27 hospitals in France, Germany, Spain, Singapore, the UK, and the US. Patients younger than 21 years who tested positive for COVID-19 and were hospitalized at an institution participating in the Consortium for Clinical Characterization of COVID-19 by EHR were included in the study. MAIN OUTCOMES AND MEASURES: Patient characteristics, clinical features, and medication use. RESULTS: There were 347 males (52%; 95% CI, 48.5-55.3) and 324 females (48%; 95% CI, 44.4-51.3) in this study’s cohort. There was a bimodal age distribution, with the greatest proportion of patients in the 0- to 2-year (199 patients [30%]) and 12- to 17-year (170 patients [25%]) age range. Trends in hospitalizations for 671 children and youth found discrete surges with variable timing across 6 countries. Data from this cohort mirrored national-level pediatric hospitalization trends for most countries with available data, with peaks in hospitalizations during the initial spring surge occurring within 23 days in the national-level and 4CE data. A total of 27 364 laboratory values for 16 laboratory tests were analyzed, with mean values indicating elevations in markers of inflammation (C-reactive protein, 83 mg/L; 95% CI, 53-112 mg/L; ferritin, 417 ng/mL; 95% CI, 228-607 ng/mL; and procalcitonin, 1.45 ng/mL; 95% CI, 0.13-2.77 ng/mL).
Abnormalities in coagulation were also evident (D-dimer, 0.78 ug/mL; 95% CI, 0.35-1.21 ug/mL; and fibrinogen, 477 mg/dL; 95% CI, 385-569 mg/dL). Cardiac troponin, when checked (n = 59), was elevated (0.032 ng/mL; 95% CI, 0.000-0.080 ng/mL). Common complications included cardiac arrhythmias (15.0%; 95% CI, 8.1%-21.7%), viral pneumonia (13.3%; 95% CI, 6.5%-20.1%), and respiratory failure (10.5%; 95% CI, 5.8%-15.3%). Few children were treated with COVID-19–directed medications. CONCLUSIONS AND RELEVANCE: This study of EHRs of children and youth hospitalized for COVID-19 in 6 countries demonstrated variability in hospitalization trends across countries and identified common complications and laboratory abnormalities in children and youth with COVID-19 infection. Large-scale informatics-based approaches to integrate and analyze data across health care systems complement methods of disease surveillance and advance understanding of epidemiological and clinical features associated with COVID-19 in children and youth.
- Published
- 2021
237. Machine Learning Classifiers for Twitter Surveillance of Vaping: Comparative Machine Learning Study.
- Author
-
Visweswaran, Shyam, Colditz, Jason B, O'Halloran, Patrick, Han, Na-Rae, Taneja, Sanya B, Welling, Joel, Chu, Kar-Hai, Sidani, Jaime E, and Primack, Brian A
- Subjects
DEEP learning, MACHINE learning, ELECTRONIC cigarettes, CONVOLUTIONAL neural networks, BIG data, RECEIVER operating characteristic curves, PUBLIC health surveillance, RESEARCH, SOCIAL media, RESEARCH methodology, EVALUATION research, MEDICAL cooperation, COMPARATIVE studies, RESEARCH funding, LONGITUDINAL method
- Abstract
Background: Twitter presents a valuable and relevant social media platform to study the prevalence of information and sentiment on vaping that may be useful for public health surveillance. Machine learning classifiers that identify vaping-relevant tweets and characterize sentiments in them can underpin a Twitter-based vaping surveillance system. Compared with traditional machine learning classifiers that are reliant on annotations that are expensive to obtain, deep learning classifiers offer the advantage of requiring fewer annotated tweets by leveraging the large numbers of readily available unannotated tweets. Objective: This study aims to derive and evaluate traditional and deep learning classifiers that can identify tweets relevant to vaping, tweets of a commercial nature, and tweets with provape sentiments. Methods: We continuously collected tweets that matched vaping-related keywords over 2 months from August 2018 to October 2018. From this data set of tweets, a set of 4000 tweets was selected, and each tweet was manually annotated for relevance (vape relevant or not), commercial nature (commercial or not), and sentiment (provape or not). Using the annotated data, we derived traditional classifiers that included logistic regression, random forest, linear support vector machine, and multinomial naive Bayes. In addition, using the annotated data set and a larger unannotated data set of tweets, we derived deep learning classifiers that included a convolutional neural network (CNN), long short-term memory (LSTM) network, LSTM-CNN network, and bidirectional LSTM (BiLSTM) network.
The unannotated tweet data were used to derive word vectors that deep learning classifiers can leverage to improve performance. Results: LSTM-CNN performed the best with the highest area under the receiver operating characteristic curve (AUC) of 0.96 (95% CI 0.93-0.98) for relevance, all deep learning classifiers including LSTM-CNN performed better than the traditional classifiers with an AUC of 0.99 (95% CI 0.98-0.99) for distinguishing commercial from noncommercial tweets, and BiLSTM performed the best with an AUC of 0.83 (95% CI 0.78-0.89) for provape sentiment. Overall, LSTM-CNN performed the best across all 3 classification tasks. Conclusions: We derived and evaluated traditional machine learning and deep learning classifiers to identify vaping-related relevant, commercial, and provape tweets. Overall, deep learning classifiers such as LSTM-CNN had superior performance and had the added advantage of requiring no preprocessing. The performance of these classifiers supports the development of a vaping surveillance system. [ABSTRACT FROM AUTHOR]
- Published
- 2020
238. Leveraging Eye Tracking to Prioritize Relevant Medical Record Data: Comparative Machine Learning Study.
- Author
-
King, Andrew J, Cooper, Gregory F, Clermont, Gilles, Hochheiser, Harry, Hauskrecht, Milos, Sittig, Dean F, and Visweswaran, Shyam
- Subjects
EYE tracking, MACHINE learning, INFORMATION-seeking behavior, RECEIVER operating characteristic curves, DECISION support systems, MEDICAL records, EYE movements, PSYCHOLOGICAL tests, PATIENT-family relations, RESEARCH funding
- Abstract
Background: Electronic medical record (EMR) systems capture large amounts of data per patient and present that data to physicians with little prioritization. Without prioritization, physicians must mentally identify and collate relevant data, an activity that can lead to cognitive overload. To mitigate cognitive overload, a Learning EMR (LEMR) system prioritizes the display of relevant medical record data. Relevant data are those that are pertinent to a context, defined as the combination of the user, clinical task, and patient case. To determine which data are relevant in a specific context, a LEMR system uses supervised machine learning models of physician information-seeking behavior. Since obtaining information-seeking behavior data via manual annotation is slow and expensive, automatic methods for capturing such data are needed. Objective: The goal of the research was to propose and evaluate eye tracking as a high-throughput method to automatically acquire physician information-seeking behavior useful for training models for a LEMR system. Methods: Critical care medicine physicians reviewed intensive care unit patient cases in an EMR interface developed for the study. Participants manually identified patient data that were relevant in the context of a clinical task: preparing a patient summary to present at morning rounds. We used eye tracking to capture each physician's gaze dwell time on each data item (eg, blood glucose measurements). Manual annotations and gaze dwell times were used to define target variables for developing supervised machine learning models of physician information-seeking behavior. We compared the performance of manual selection and gaze-derived models on an independent set of patient cases. Results: A total of 68 pairs of manual selection and gaze-derived machine learning models were developed from training data and evaluated on an independent evaluation data set.
A paired Wilcoxon signed-rank test showed similar performance of manual selection and gaze-derived models on area under the receiver operating characteristic curve (P=.40). Conclusions: We used eye tracking to automatically capture physician information-seeking behavior and used it to train models for a LEMR system. The models that were trained using eye tracking performed like models that were trained using manual annotations. These results support further development of eye tracking as a high-throughput method for training clinical decision support systems that prioritize the display of relevant medical record data. [ABSTRACT FROM AUTHOR]
- Published
- 2020
239. Using machine learning to selectively highlight patient information.
- Author
-
King, Andrew J., Cooper, Gregory F., Clermont, Gilles, Hochheiser, Harry, Hauskrecht, Milos, Sittig, Dean F., and Visweswaran, Shyam
- Abstract
Background: Electronic medical record (EMR) systems need functionality that decreases cognitive overload by drawing the clinician's attention to the right data, at the right time. We developed a Learning EMR (LEMR) system that learns statistical models of clinician information-seeking behavior and applies those models to direct the display of data in future patients. We evaluated the performance of the system in identifying relevant patient data in intensive care unit (ICU) patient cases. Methods: To capture information-seeking behavior, we enlisted critical care medicine physicians who reviewed a set of patient cases and selected data items relevant to the task of presenting at morning rounds. Using patient EMR data as predictors, we built machine learning models to predict their relevancy. We prospectively evaluated the predictions of a set of high performing models. Results: On an independent evaluation data set, 25 models achieved precision of 0.52, 95% CI [0.49, 0.54] and recall of 0.77, 95% CI [0.75, 0.80] in identifying relevant patient data items. For data items missed by the system, the reviewers rated the effect of not seeing those data from no impact to minor impact on patient care in about 82% of the cases. Conclusion: Data-driven approaches for adaptively displaying data in EMR systems, like the LEMR system, show promise in using information-seeking behavior of clinicians to identify and highlight relevant patient data. [ABSTRACT FROM AUTHOR]
- Published
- 2019
240. Additional file 4: of On Predicting lung cancer subtypes using ‘omic’ data from tumor and tumor-adjacent histologically-normal tissue
- Author
-
Pineda, Arturo López, Ogoe, Henry, Jeya Balasubramanian, Escareño, Claudia Rangel, Visweswaran, Shyam, Herman, James, and Vanathi Gopalakrishnan
- Subjects
surgical procedures, operative; bacterial infections and mycoses; neoplasms; digestive system; digestive system diseases; 3. Good health
- Abstract
Appendix A shows the Cancer Genome Atlas annotations to identify the types of samples used in this study. Appendix B shows additional performance measures for the models described. (DOCX 106 kb)
241. Approximate Kernel-Based Conditional Independence Tests for Fast Non-Parametric Causal Discovery
- Author
-
Strobl, Eric V., Zhang, Kun, and Visweswaran, Shyam
- Abstract
Constraint-based causal discovery (CCD) algorithms require fast and accurate conditional independence (CI) testing. The Kernel Conditional Independence Test (KCIT) is currently one of the most popular CI tests in the non-parametric setting, but many investigators cannot use KCIT with large datasets because the test scales at least quadratically with sample size. We therefore devise two relaxations called the Randomized Conditional Independence Test (RCIT) and the Randomized conditional Correlation Test (RCoT) which both approximate KCIT by utilizing random Fourier features. In practice, both of the proposed tests scale linearly with sample size and return accurate p-values much faster than KCIT in the large sample size context. CCD algorithms run with RCIT or RCoT also return graphs at least as accurate as the same algorithms run with KCIT but with large reductions in run time.
- Published
- 2019
242. A new method for estimating the probability of causal relationships from observational data: Application to the study of the short-term effects of air pollution on cardiovascular and respiratory disease.
- Author
-
Andrews, Bryan, Wongchokprasitti, Chirayu, Visweswaran, Shyam, Lakhani, Chirag M., Patel, Chirag J., and Cooper, Gregory F.
- Subjects
*RESPIRATORY diseases, *AIR pollution, *AIR pollutants, *CARDIOVASCULAR diseases, *VECTOR error-correction models, *PROBABILITY theory, *LOAD forecasting (Electric power systems)
- Abstract
In this paper we investigate which airborne pollutants have a short-term causal effect on cardiovascular and respiratory disease using the Ancestral Probabilities (AP) procedure, a novel Bayesian approach for deriving the probabilities of causal relationships from observational data. The results are largely consistent with EPA assessments of causality, however, in a few cases AP suggests that some pollutants thought to cause cardiovascular or respiratory disease are associated due purely to confounding. The AP procedure utilizes maximal ancestral graph (MAG) models to represent and assign probabilities to causal relationships while accounting for latent confounding. The algorithm does so locally by marginalizing over models with and without causal features of interest. Before applying AP to real data, we evaluate it in a simulation study and investigate the benefits of providing background knowledge. Overall, the results suggest that AP is an effective tool for causal discovery. • We introduce the AP procedure for estimating the probability of causal relationships. • AP takes observational data as input and accounts for possible latent confounding. • We evaluate AP on simulated data with and without background knowledge. • We investigate what airborne pollutants cause cardiovascular and respiratory disease. • The results are largely consistent with the EPA's assessments of causality. [ABSTRACT FROM AUTHOR]
- Published
- 2023
243. Cardiovascular-related mortality after intraoperative neurophysiologic monitoring changes during carotid endarterectomy.
- Author
-
Paras, Stephanie, Mina, Amir, Crammond, Donald J., Visweswaran, Shyam, Anetakis, Katherine M., Balzer, Jeffrey R., Shandal, Varun, and Thirumala, Parthasarathy D.
- Subjects
*NEUROPHYSIOLOGIC monitoring, *CAROTID endarterectomy, *DISEASE risk factors, *ACADEMIC medical centers, *MORTALITY
- Abstract
• Carotid endarterectomy patients with intraoperative neurophysiologic monitoring changes had a doubled rate of long-term cardiovascular-related mortality. • Patients with perioperative stroke showed a four times higher rate of long-term cardiovascular-related mortality. • Intraoperative neurophysiologic monitoring changes are valuable in predicting long-term cardiovascular-related adverse outcomes. We examined significant intraoperative neurophysiologic monitoring (IONM) changes and perioperative stroke as independent risk factors of long-term cardiovascular-related mortality in patients who have undergone carotid endarterectomy (CEA). Records of patients who underwent CEA with IONM at the University of Pittsburgh Medical Center between January 1, 2009 and December 31, 2019 were analyzed retrospectively. Cardiovascular-related mortality was compared between the significant IONM change group and no IONM change group and between the perioperative stroke group and no perioperative stroke group. Our final cohort consisted of 2,090 patients. Patients with significant IONM changes showed nearly twice the rate of cardiovascular-related mortality up to 10 years post-CEA (hazard ratio (HR) = 1.98; 95% confidence interval (CI) [1.20 – 3.26]). Patients with perioperative stroke were four times more likely than patients without perioperative stroke to experience cardiovascular-related mortality (HR = 4.09; 95% CI [2.13 – 7.86]). Among CEA patients who underwent CEA and who experienced significant IONM changes or perioperative stroke, we observed long-term increased and sustained risk of cardiovascular-related mortality. Significant IONM changes are valuable in predicting the risk of long-term outcomes following CEA. [ABSTRACT FROM AUTHOR]
- Published
- 2022
244. Bayesian network models with decision tree analysis for management of childhood malaria in Malawi.
- Author
-
Taneja, Sanya B., Douglas, Gerald P., Cooper, Gregory F., Michaels, Marian G., Druzdzel, Marek J., and Visweswaran, Shyam
- Subjects
*DECISION trees, *DECISION making, *CHILD mortality, *MEDICAL personnel, *MALARIA, *DECISION support systems, *RURAL health
- Abstract
Background: Malaria is a major cause of death in children under five years old in low- and middle-income countries such as Malawi. Accurate diagnosis and management of malaria can help reduce the global burden of childhood morbidity and mortality. Trained healthcare workers in rural health centers manage malaria with limited supplies of malarial diagnostic tests and drugs for treatment. A clinical decision support system that integrates predictive models to provide an accurate prediction of malaria based on clinical features could aid healthcare workers in the judicious use of testing and treatment. We developed Bayesian network (BN) models to predict the probability of malaria from clinical features and an illustrative decision tree to model the decision to use or not use a malaria rapid diagnostic test (mRDT). Methods: We developed two BN models to predict malaria from a dataset of outpatient encounters of children in Malawi. The first BN model was created manually with expert knowledge, and the second model was derived using an automated method. The performance of the BN models was compared to other statistical models on a range of performance metrics at multiple thresholds. We developed a decision tree that integrates predictions with the costs of mRDT and a course of recommended treatment. Results: The manually created BN model achieved an area under the ROC curve (AUC) equal to 0.60 which was statistically significantly higher than the other models. At the optimal threshold for classification, the manual BN model had sensitivity and specificity of 0.74 and 0.42 respectively, and the automated BN model had sensitivity and specificity of 0.45 and 0.68 respectively. The balanced accuracy values were similar across all the models.
Sensitivity analysis of the decision tree showed that for values of probability of malaria below 0.04 and above 0.40, the preferred decision that minimizes expected costs is not to perform mRDT. Conclusion: In resource-constrained settings, judicious use of mRDT is important. Predictive models in combination with decision analysis can provide personalized guidance on when to use mRDT in the management of childhood malaria. BN models can be efficiently derived from data to support clinical decision making. [ABSTRACT FROM AUTHOR]
- Published
- 2021
245. Identifying incidental findings from radiology reports of trauma patients: An evaluation of automated feature representation methods.
- Author
-
Trivedi, Gaurav, Hong, Charmgil, Dadashzadeh, Esmaeel R., Handzel, Robert M., Hochheiser, Harry, and Visweswaran, Shyam
- Abstract
Background: Radiologic imaging of trauma patients often uncovers findings that are unrelated to the trauma. These are termed as incidental findings and identifying them in radiology examination reports is necessary for appropriate follow-up. We developed and evaluated an automated pipeline to identify incidental findings at sentence and section levels in radiology reports of trauma patients. Methods: We created an annotated dataset of 4,181 reports and investigated automated feature representations including traditional word and clinical concept (such as SNOMED CT) representations, as well as word and concept embeddings. We evaluated these representations by using them with traditional classifiers such as logistic regression and with deep learning methods such as convolutional neural networks (CNNs). Results: The best performance was observed using word embeddings with CNNs with F1 scores of 0.66 and 0.52 at section and sentence levels respectively. The F1 score was statistically significantly higher for sections compared to sentences (Wilcoxon; Z < 0.001, p < 0.05). Compared to using words alone, the addition of SNOMED CT concepts did not improve performance. At the sentence level, the F1 score improved significantly from 0.46 to 0.52 when using pre-trained embeddings (Wilcoxon; Z < 0.001, p < 0.05). Conclusion: The results show that the best performance was achieved by using embeddings with CNNs at both sentence and section levels. This provides evidence that such a pipeline is capable of accurately identifying incidental findings in radiology reports in an automated manner. [ABSTRACT FROM AUTHOR]
- Published
- 2019
246. ReDWINE: A clinical datamart with text analytical capabilities to facilitate rehabilitation research.
- Author
-
Oniani, David, Parmanto, Bambang, Saptono, Andi, Bove, Allyn, Freburger, Janet, Visweswaran, Shyam, Cappella, Nickie, McLay, Brian, Silverstein, Jonathan C., Becich, Michael J., Delitto, Anthony, Skidmore, Elizabeth, and Wang, Yanshan
- Published
- 2023
247. Precision phenotyping for curating research cohorts of patients with unexplained post-acute sequelae of COVID-19.
- Author
-
Azhir A, Hügel J, Tian J, Cheng J, Bassett IV, Bell DS, Bernstam EV, Farhat MR, Henderson DW, Lau ES, Morris M, Semenov YR, Triant VA, Visweswaran S, Strasser ZH, Klann JG, Murphy SN, and Estiri H
- Abstract
Background: Scalable identification of patients with post-acute sequelae of COVID-19 (PASC) is challenging due to a lack of reproducible precision phenotyping algorithms, which has led to suboptimal accuracy, demographic biases, and underestimation of the PASC. Methods: In a retrospective case-control study, we developed a precision phenotyping algorithm for identifying cohorts of patients with PASC. We used longitudinal electronic health records data from over 295,000 patients from 14 hospitals and 20 community health centers in Massachusetts. The algorithm employs an attention mechanism to simultaneously exclude sequelae that prior conditions can explain and include infection-associated chronic conditions. We performed independent chart reviews to tune and validate the algorithm. Findings: The PASC phenotyping algorithm improves precision and prevalence estimation and reduces bias in identifying PASC cohorts compared to the ICD-10-CM code U09.9. The algorithm identified a cohort of over 24,000 patients with 79.9% precision. Our estimated prevalence of PASC was 22.8%, which is close to the national estimates for the region. We also provide in-depth analyses, encompassing identified lingering effects by organ, comorbidity profiles, and temporal differences in the risk of PASC. Conclusions: PASC precision phenotyping boasts superior precision and prevalence estimation while exhibiting less bias in identifying patients with PASC. The cohort derived from this algorithm will serve as a springboard for delving into the genetic, metabolomic, and clinical intricacies of PASC, surmounting the constraints of prior PASC cohort studies. Funding: This research was funded by the US National Institute of Allergy and Infectious Diseases (NIAID). Competing Interests: The authors declare no competing interests. (Copyright © 2024 The Author(s). Published by Elsevier Inc. All rights reserved.)
- Published
- 2024
248. Realizing the potential of social determinants data in EHR systems: A scoping review of approaches for screening, linkage, extraction, analysis, and interventions.
- Author
-
Li C, Mowery DL, Ma X, Yang R, Vurgun U, Hwang S, Donnelly HK, Bandhey H, Senathirajah Y, Visweswaran S, Sadhu EM, Akhtar Z, Getzen E, Freda PJ, Long Q, and Becich MJ
- Abstract
Background: Social determinants of health (SDoH), such as socioeconomics and neighborhoods, strongly influence health outcomes. However, the current state of standardized SDoH data in electronic health records (EHRs) is lacking, a significant barrier to research and care quality. Methods: We conducted a PubMed search using "SDOH" and "EHR" Medical Subject Headings terms, analyzing included articles across five domains: 1) SDoH screening and assessment approaches, 2) SDoH data collection and documentation, 3) Use of natural language processing (NLP) for extracting SDoH, 4) SDoH data and health outcomes, and 5) SDoH-driven interventions. Results: Of 685 articles identified, 324 underwent full review. Key findings include implementation of tailored screening instruments, census and claims data linkage for contextual SDoH profiles, NLP systems extracting SDoH from notes, associations between SDoH and healthcare utilization and chronic disease control, and integrated care management programs. However, variability across data sources, tools, and outcomes underscores the need for standardization. Discussion: Despite progress in identifying patient social needs, further development of standards, predictive models, and coordinated interventions is critical for SDoH-EHR integration. Additional database searches could strengthen this scoping review. Ultimately, widespread capture, analysis, and translation of multidimensional SDoH data into clinical care is essential for promoting health equity. Competing Interests: There are no conflicts of interest to declare. (© The Author(s) 2024.)
- Published
- 2024
249. Fairness and inclusion methods for biomedical informatics research.
- Author
-
Visweswaran S, Luo Y, and Peleg M
- Abstract
Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
- Published
- 2024
250. Extraction of sleep information from clinical notes of Alzheimer's disease patients using natural language processing.
- Author
-
Sivarajkumar S, Tam TYC, Mohammad HA, Viggiano S, Oniani D, Visweswaran S, and Wang Y
- Subjects
- Humans, Sleep Wake Disorders, Sleep, Electronic Health Records, Aged, Female, Machine Learning, Male, Datasets as Topic, Natural Language Processing, Alzheimer Disease
- Abstract
Objectives: Alzheimer's disease (AD) is the most common form of dementia in the United States. Sleep is one of the lifestyle-related factors that has been shown critical for optimal cognitive function in old age. However, there is a lack of research studying the association between sleep and AD incidence. A major bottleneck for conducting such research is that the traditional way to acquire sleep information is time-consuming, inefficient, non-scalable, and limited to patients' subjective experience. We aim to automate the extraction of specific sleep-related patterns, such as snoring, napping, poor sleep quality, daytime sleepiness, night wakings, other sleep problems, and sleep duration, from clinical notes of AD patients. These sleep patterns are hypothesized to play a role in the incidence of AD, providing insight into the relationship between sleep and AD onset and progression. Materials and Methods: A gold standard dataset is created from manual annotation of 570 randomly sampled clinical note documents from the adSLEEP, a corpus of 192 000 de-identified clinical notes of 7266 AD patients retrieved from the University of Pittsburgh Medical Center (UPMC). We developed a rule-based natural language processing (NLP) algorithm, machine learning models, and large language model (LLM)-based NLP algorithms to automate the extraction of sleep-related concepts, including snoring, napping, sleep problem, bad sleep quality, daytime sleepiness, night wakings, and sleep duration, from the gold standard dataset. Results: The annotated dataset of 482 patients comprised a predominantly White (89.2%), older adult population with an average age of 84.7 years, where females represented 64.1%, and a vast majority were non-Hispanic or Latino (94.6%). Rule-based NLP algorithm achieved the best performance of F1 across all sleep-related concepts.
In terms of positive predictive value (PPV), the rule-based NLP algorithm achieved the highest PPV scores for daytime sleepiness (1.00) and sleep duration (1.00), while the machine learning models had the highest PPV for napping (0.95) and bad sleep quality (0.86), and LLAMA2 with finetuning had the highest PPV for night wakings (0.93) and sleep problem (0.89). Discussion: Although sleep information is infrequently documented in the clinical notes, the proposed rule-based NLP algorithm and LLM-based NLP algorithms still achieved promising results. In comparison, the machine learning-based approaches did not achieve good results, which is due to the small size of sleep information in the training data. Conclusion: The results show that the rule-based NLP algorithm consistently achieved the best performance for all sleep concepts. This study focused on the clinical notes of patients with AD but could be extended to general sleep information extraction for other diseases. (© The Author(s) 2024. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.)
- Published
- 2024