Descriptor: "Factual" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Factual"' showing total 4,226 results

Start Over Descriptor "Factual"

4,226 results on '"Factual"'

1. Comparing penalization methods for linear models on large observational health data.

Author: Fridgeirsson, Egill, Williams, Ross, Rijnbeek, Peter, Suchard, Marc, and Reps, Jenna
Subjects: calibration, discrimination, electronic health records, logistic regression, regularization, Humans, Logistic Models, Depressive Disorder, Major, Electronic Health Records, Linear Models, Databases, Factual, United States
Abstract: OBJECTIVE: This study evaluates regularization variants in logistic regression (L1, L2, ElasticNet, Adaptive L1, Adaptive ElasticNet, Broken adaptive ridge [BAR], and Iterative hard thresholding [IHT]) for discrimination and calibration performance, focusing on both internal and external validation. MATERIALS AND METHODS: We use data from 5 US claims and electronic health record databases and develop models for various outcomes in a major depressive disorder patient population. We externally validate all models in the other databases. We use a train-test split of 75%/25% and evaluate performance with discrimination and calibration. Statistical analysis for difference in performance uses Friedmans test and critical difference diagrams. RESULTS: Of the 840 models we develop, L1 and ElasticNet emerge as superior in both internal and external discrimination, with a notable AUC difference. BAR and IHT show the best internal calibration, without a clear external calibration leader. ElasticNet typically has larger model sizes than L1. Methods like IHT and BAR, while slightly less discriminative, significantly reduce model complexity. CONCLUSION: L1 and ElasticNet offer the best discriminative performance in logistic regression for healthcare predictions, maintaining robustness across validations. For simpler, more interpretable models, L0-based methods (IHT and BAR) are advantageous, providing greater parsimony and calibration with fewer features. This study aids in selecting suitable regularization techniques for healthcare prediction models, balancing performance, complexity, and interpretability.
Published: 2024

2. Comprehensive assessment of TDP-43 neuropathology data in the National Alzheimer’s Coordinating Center database

Author: Woodworth, Davis C, Nguyen, Katelynn M, Sordo, Lorena, Scambray, Kiana A, Head, Elizabeth, Kawas, Claudia H, Corrada, María M, Nelson, Peter T, and Sajjadi, S Ahmad
Subjects: Biomedical and Clinical Sciences, Neurosciences, Alzheimer's Disease, Aging, Rare Diseases, Neurodegenerative, Dementia, Acquired Cognitive Impairment, Alzheimer's Disease Related Dementias (ADRD), Frontotemporal Dementia (FTD), Alzheimer's Disease including Alzheimer's Disease Related Dementias (AD/ADRD), ALS, Brain Disorders, 2.1 Biological and endogenous factors, Neurological, Humans, Female, Aged, Male, Alzheimer Disease, DNA-Binding Proteins, TDP-43 Proteinopathies, Aged, 80 and over, Databases, Factual, Frontotemporal Lobar Degeneration, Brain, Amyotrophic Lateral Sclerosis, Hippocampus, Middle Aged, TDP-43, Limbic predominant age-related TDP-43 encephalopathy neuropathologic change, Frontotemporal lobar degeneration, Amyotrophic lateral sclerosis, Hippocampal sclerosis of aging, Alzheimer's disease, National Alzheimer's coordinating center, Alzheimer’s disease, National Alzheimer’s coordinating center, Clinical Sciences, Neurology & Neurosurgery
Abstract: TDP-43 proteinopathy is a salient neuropathologic feature in a subset of frontotemporal lobar degeneration (FTLD-TDP), in amyotrophic lateral sclerosis (ALS-TDP), and in limbic-predominant age-related TDP-43 encephalopathy neuropathologic change (LATE-NC), and is associated with hippocampal sclerosis of aging (HS-A). We examined TDP-43-related pathology data in the National Alzheimer's Coordinating Center (NACC) in two parts: (I) availability of assessments, and (II) associations with clinical diagnoses and other neuropathologies in those with all TDP-43 measures available. Part I: Of 4326 participants with neuropathology data collected using forms that included TDP-43 assessments, data availability was highest for HS-A (97%) and ALS (94%), followed by FTLD-TDP (83%). Regional TDP-43 pathologic assessment was available for 77% of participants, with hippocampus the most common region. Availability for the TDP-43-related measures increased over time, and was higher in centers with high proportions of participants with clinical FTLD. Part II: In 2142 participants with all TDP-43-related assessments available, 27% of participants had LATE-NC, whereas ALS-TDP or FTLD-TDP (ALS/FTLD-TDP) was present in 9% of participants, and 2% of participants had TDP-43 related to other pathologies ("Other TDP-43"). HS-A was present in 14% of participants, of whom 55% had LATE-NC, 20% ASL/FTLD-TDP, 3% Other TDP-43, and 23% no TDP-43. LATE-NC, ALS/FTLD-TDP, and Other TDP-43, were each associated with higher odds of dementia, HS-A, and hippocampal atrophy, compared to those without TDP-43 pathology. LATE-NC was associated with higher odds for Alzheimer's disease (AD) clinical diagnosis, AD neuropathologic change (ADNC), Lewy bodies, arteriolosclerosis, and cortical atrophy. ALS/FTLD-TDP was associated with higher odds of clinical diagnoses of primary progressive aphasia and behavioral-variant frontotemporal dementia, and cortical/frontotemporal lobar atrophy. When using NACC data for TDP-43-related analyses, researchers should carefully consider the incomplete availability of the different regional TDP-43 assessments, the high frequency of participants with ALS/FTLD-TDP, and the presence of other forms of TDP-43 pathology.
Published: 2024

3. Temporal dynamics of the multi-omic response to endurance exercise training

Author: Bae, Dam, Dasari, Surendra, Dennis, Courtney, Evans, Charles R, Gaul, David A, Ilkayeva, Olga, Ivanova, Anna A, Kachman, Maureen T, Keshishian, Hasmik, Lanza, Ian R, Lira, Ana C, Muehlbauer, Michael J, Nair, Venugopalan D, Piehowski, Paul D, Rooney, Jessica L, Smith, Kevin S, Stowe, Cynthia L, Zhao, Bingqing, Clark, Natalie M, Jimenez-Morales, David, Lindholm, Malene E, Many, Gina M, Sanford, James A, Smith, Gregory R, Vetr, Nikolai G, Zhang, Tiantian, Almagro Armenteros, Jose J, Avila-Pacheco, Julian, Bararpour, Nasim, Ge, Yongchao, Hou, Zhenxin, Marwaha, Shruti, Presby, David M, Natarajan Raja, Archana, Savage, Evan M, Steep, Alec, Sun, Yifei, Wu, Si, Zhen, Jimmy, Bodine, Sue C, Esser, Karyn A, Goodyear, Laurie J, Schenk, Simon, Montgomery, Stephen B, Fernández, Facundo M, Sealfon, Stuart C, Snyder, Michael P, Adkins, Joshua N, Ashley, Euan, Burant, Charles F, Carr, Steven A, Clish, Clary B, Cutter, Gary, Gerszten, Robert E, Kraus, William E, Li, Jun Z, Miller, Michael E, Nair, K Sreekumaran, Newgard, Christopher, Ortlund, Eric A, Qian, Wei-Jun, Tracy, Russell, Walsh, Martin J, Wheeler, Matthew T, Dalton, Karen P, Hastie, Trevor, Hershman, Steven G, Samdarshi, Mihir, Teng, Christopher, Tibshirani, Rob, Cornell, Elaine, Gagne, Nicole, May, Sandy, Bouverat, Brian, Leeuwenburgh, Christiaan, Lu, Ching-ju, Pahor, Marco, Hsu, Fang-Chi, Rushing, Scott, Walkup, Michael P, Nicklas, Barbara, Rejeski, W Jack, Williams, John P, Xia, Ashley, Albertson, Brent G, Barton, Elisabeth R, Booth, Frank W, Caputo, Tiziana, Cicha, Michael, De Sousa, Luis Gustavo Oliveira, Farrar, Roger, Hevener, Andrea L, Hirshman, Michael F, Jackson, Bailey E, Ke, Benjamin G, Kramer, Kyle S, Lessard, Sarah J, Makarewicz, Nathan S, Marshall, Andrea G, and Nigro, Pasquale
Subjects: Health Sciences, Sports Science and Exercise, Prevention, Human Genome, Cardiovascular, Behavioral and Social Science, Genetics, Physical Activity, 2.1 Biological and endogenous factors, 1.1 Normal biological development and functioning, Generic health relevance, Inflammatory and immune system, Good Health and Well Being, Animals, Female, Humans, Male, Rats, Acetylation, Blood, Cardiovascular Diseases, Databases, Factual, Endurance Training, Epigenome, Inflammatory Bowel Diseases, Internet, Lipidomics, Metabolome, Mitochondria, Multiomics, Non-alcoholic Fatty Liver Disease, Organ Specificity, Phosphorylation, Physical Conditioning, Animal, Physical Endurance, Proteome, Proteomics, Time Factors, Transcriptome, Ubiquitination, Wounds and Injuries, MoTrPAC Study Group, Lead Analysts, MoTrPAC Study Group, General Science & Technology
Abstract: Regular exercise promotes whole-body health and prevents disease, but the underlying molecular mechanisms are incompletely understood1-3. Here, the Molecular Transducers of Physical Activity Consortium4 profiled the temporal transcriptome, proteome, metabolome, lipidome, phosphoproteome, acetylproteome, ubiquitylproteome, epigenome and immunome in whole blood, plasma and 18 solid tissues in male and female Rattus norvegicus over eight weeks of endurance exercise training. The resulting data compendium encompasses 9,466 assays across 19 tissues, 25 molecular platforms and 4 training time points. Thousands of shared and tissue-specific molecular alterations were identified, with sex differences found in multiple tissues. Temporal multi-omic and multi-tissue analyses revealed expansive biological insights into the adaptive responses to endurance training, including widespread regulation of immune, metabolic, stress response and mitochondrial pathways. Many changes were relevant to human health, including non-alcoholic fatty liver disease, inflammatory bowel disease, cardiovascular health and tissue injury and recovery. The data and analyses presented in this study will serve as valuable resources for understanding and exploring the multi-tissue molecular effects of endurance training and are provided in a public repository ( https://motrpac-data.org/ ).
Published: 2024

4. The effect of SARS-COV-2 variant on non-respiratory features and mortality among vaccinated and non-fully vaccinated patients

Author: Cotton, Shannon A, Subramanian, Ajan, Hughes, Thomas D, Huang, Yong, Sierra, Carmen Josefa, Pearce, Alex K, Malhotra, Atul, Rahmani, Amir M, Downs, Charles A, and Pinto, Melissa D
Subjects: Biomedical and Clinical Sciences, Clinical Sciences, Prevention, Emerging Infectious Diseases, Vaccine Related, Immunization, Coronaviruses, Infectious Diseases, Infection, Good Health and Well Being, Humans, SARS-CoV-2, COVID-19, Databases, Factual, Fever, Vaccination, Organ dysfunction, Symptoms, COVID-19 mortality, Non-respiratory, Biological Sciences, Agricultural and Veterinary Sciences, Medical and Health Sciences, Virology, Biomedical and clinical sciences, Health sciences
Abstract: ObjectiveTo determine the effect of SARS-CoV-2 variants on non-respiratory features of COVID-19 in vaccinated and not fully vaccinated patients using a University of California database.MethodsA longitudinal retrospective review of medical records (n = 63,454) from 1/1/2020-4/26/2022 using the UCCORDS database was performed to compare non-respiratory features, vaccination status, and mortality between variants. Chi-square tests were used to study the relationship between categorical variables using a contingency matrix.ResultsFever was the most common feature across all variants. Fever was significantly higher in not fully vaccinated during the Delta and Omicron waves (p = 0.001; p = 0.001). Cardiac features were statistically higher in not fully vaccinated during Omicron; tachycardia was only a feature of not fully vaccinated during Delta and Omicron; diabetes and GI reflux were features of all variants regardless of vaccine status. Odds of death were significantly increased among those not fully vaccinated in the Delta and Omicron variants (Delta OR: 1.64, p = 0.052; Omicron OR: 1.96, p
Published: 2024

5. PMechDB: A Public Database of Elementary Polar Reaction Steps

Author: Tavakoli, Mohammadamin, Miller, Ryan J, Angel, Mirana Claire, Pfeiffer, Michael A, Gutman, Eugene S, Mood, Aaron D, Van Vranken, David, and Baldi, Pierre
Subjects: Medicinal and Biomolecular Chemistry, Chemical Sciences, Networking and Information Technology R&D (NITRD), Machine Learning and Artificial Intelligence, Databases, Factual, Databases, Chemical, Theoretical and Computational Chemistry, Computation Theory and Mathematics, Medicinal & Biomolecular Chemistry, Medicinal and biomolecular chemistry, Theoretical and computational chemistry
Abstract: Most online chemical reaction databases are not publicly accessible or are fully downloadable. These databases tend to contain reactions in noncanonicalized formats and often lack comprehensive information regarding reaction pathways, intermediates, and byproducts. Within the few publicly available databases, reactions are typically stored in the form of unbalanced, overall transformations with minimal interpretability of the underlying chemistry. These limitations present significant obstacles to data-driven applications including the development of machine learning models. As an effort to overcome these challenges, we introduce PMechDB, a publicly accessible platform designed to curate, aggregate, and share polar chemical reaction data in the form of elementary reaction steps. Our initial version of PMechDB consists of over 100,000 such steps. In the PMechDB, all reactions are stored as canonicalized and balanced elementary steps, featuring accurate atom mapping and arrow-pushing mechanisms. As an online interactive database, PMechDB provides multiple interfaces that enable users to search, download, and upload chemical reactions. We anticipate that the public availability of PMechDB and its standardized data representation will prove beneficial for chemoinformatics research and education and the development of data-driven, interpretable models for predicting reactions and pathways. PMechDB platform is accessible online at https://deeprxn.ics.uci.edu/pmechdb.
Published: 2024

6. Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES): a method for populating knowledge bases using zero-shot learning

Author: Caufield, J Harry, Hegde, Harshad, Emonet, Vincent, Harris, Nomi L, Joachimiak, Marcin P, Matentzoglu, Nicolas, Kim, HyeongSik, Moxon, Sierra, Reese, Justin T, Haendel, Melissa A, Robinson, Peter N, and Mungall, Christopher J
Subjects: Data Management and Data Science, Information and Computing Sciences, Networking and Information Technology R&D (NITRD), Generic health relevance, Semantics, Knowledge Bases, Databases, Factual, Mathematical Sciences, Biological Sciences, Bioinformatics, Biological sciences, Information and computing sciences, Mathematical sciences
Abstract: MotivationCreating knowledge bases and ontologies is a time consuming task that relies on manual curation. AI/NLP approaches can assist expert curators in populating these knowledge bases, but current approaches rely on extensive training data, and are not able to populate arbitrarily complex nested knowledge schemas.ResultsHere we present Structured Prompt Interrogation and Recursive Extraction of Semantics (SPIRES), a Knowledge Extraction approach that relies on the ability of Large Language Models (LLMs) to perform zero-shot learning and general-purpose query answering from flexible prompts and return information conforming to a specified schema. Given a detailed, user-defined knowledge schema and an input text, SPIRES recursively performs prompt interrogation against an LLM to obtain a set of responses matching the provided schema. SPIRES uses existing ontologies and vocabularies to provide identifiers for matched elements. We present examples of applying SPIRES in different domains, including extraction of food recipes, multi-species cellular signaling pathways, disease treatments, multi-step drug mechanisms, and chemical to disease relationships. Current SPIRES accuracy is comparable to the mid-range of existing Relation Extraction methods, but greatly surpasses an LLM's native capability of grounding entities with unique identifiers. SPIRES has the advantage of easy customization, flexibility, and, crucially, the ability to perform new tasks in the absence of any new training data. This method supports a general strategy of leveraging the language interpreting capabilities of LLMs to assemble knowledge bases, assisting manual knowledge curation and acquisition while supporting validation with publicly-available databases and ontologies external to the LLM.Availability and implementationSPIRES is available as part of the open source OntoGPT package: https://github.com/monarch-initiative/ontogpt.
Published: 2024

7. microbeMASST: a taxonomically informed mass spectrometry search tool for microbial metabolomics data

Author: Zuffa, Simone, Schmid, Robin, Bauermeister, Anelize, P. Gomes, Paulo Wender, Caraballo-Rodriguez, Andres M, El Abiead, Yasin, Aron, Allegra T, Gentry, Emily C, Zemlin, Jasmine, Meehan, Michael J, Avalon, Nicole E, Cichewicz, Robert H, Buzun, Ekaterina, Terrazas, Marvic Carrillo, Hsu, Chia-Yun, Oles, Renee, Ayala, Adriana Vasquez, Zhao, Jiaqi, Chu, Hiutung, Kuijpers, Mirte CM, Jackrel, Sara L, Tugizimana, Fidele, Nephali, Lerato Pertunia, Dubery, Ian A, Madala, Ntakadzeni Edwin, Moreira, Eduarda Antunes, Costa-Lotufo, Leticia Veras, Lopes, Norberto Peporine, Rezende-Teixeira, Paula, Jimenez, Paula C, Rimal, Bipin, Patterson, Andrew D, Traxler, Matthew F, Pessotti, Rita de Cassia, Alvarado-Villalobos, Daniel, Tamayo-Castillo, Giselle, Chaverri, Priscila, Escudero-Leyva, Efrain, Quiros-Guerrero, Luis-Manuel, Bory, Alexandre Jean, Joubert, Juliette, Rutz, Adriano, Wolfender, Jean-Luc, Allard, Pierre-Marie, Sichert, Andreas, Pontrelli, Sammy, Pullman, Benjamin S, Bandeira, Nuno, Gerwick, William H, Gindro, Katia, Massana-Codina, Josep, Wagner, Berenike C, Forchhammer, Karl, Petras, Daniel, Aiosa, Nicole, Garg, Neha, Liebeke, Manuel, Bourceau, Patric, Kang, Kyo Bin, Gadhavi, Henna, de Carvalho, Luiz Pedro Sorio, Silva dos Santos, Mariana, Pérez-Lorente, Alicia Isabel, Molina-Santiago, Carlos, Romero, Diego, Franke, Raimo, Brönstrup, Mark, Vera Ponce de León, Arturo, Pope, Phillip Byron, La Rosa, Sabina Leanti, La Barbera, Giorgia, Roager, Henrik M, Laursen, Martin Frederik, Hammerle, Fabian, Siewert, Bianka, Peintner, Ursula, Licona-Cassani, Cuauhtemoc, Rodriguez-Orduña, Lorena, Rampler, Evelyn, Hildebrand, Felina, Koellensperger, Gunda, Schoeny, Harald, Hohenwallner, Katharina, Panzenboeck, Lisa, Gregor, Rachel, O’Neill, Ellis Charles, Roxborough, Eve Tallulah, Odoi, Jane, Bale, Nicole J, Ding, Su, Sinninghe Damsté, Jaap S, Guan, Xue Li, Cui, Jerry J, Ju, Kou-San, Silva, Denise Brentan, Silva, Fernanda Motta Ribeiro, da Silva, Gilvan Ferreira, Koolen, Hector HF, Grundmann, Carlismari, and Clement, Jason A
Subjects: Microbiology, Biological Sciences, Good Health and Well Being, Humans, Tandem Mass Spectrometry, Metabolomics, Databases, Factual, Medical Microbiology
Abstract: microbeMASST, a taxonomically informed mass spectrometry (MS) search tool, tackles limited microbial metabolite annotation in untargeted metabolomics experiments. Leveraging a curated database of >60,000 microbial monocultures, users can search known and unknown MS/MS spectra and link them to their respective microbial producers via MS/MS fragmentation patterns. Identification of microbe-derived metabolites and relative producers without a priori knowledge will vastly enhance the understanding of microorganisms' role in ecology and human health.
Published: 2024

8. Complications of Novel Radiofrequency Device Use in Rhinology: A MAUDE Analysis

Author: Torabi, Sina J, Bitner, Benjamin F, Abello, Eric H, Nguyen, Theodore V, Wong, Brian JF, and Kuan, Edward C
Subjects: Biomedical and Clinical Sciences, Clinical Sciences, Pediatric, Bioengineering, Patient Safety, Humans, Child, United States, Necrosis, Databases, Factual, United States Food and Drug Administration, chronic rhinitis, complications, epistaxis, radiofrequency devices, rhinorrhea, Otorhinolaryngology, Clinical sciences
Abstract: With the widespread adoption of intranasal radiofrequency (RF) devices, our objective was to report national adverse events (AEs) associated with their use. The Food and Drug Administration's Manufacturer and User Facility Device Experience was queried. A total of 24 device-related AEs were reported, 11 (45.8%) for Celon® (Olympus), 3 (12.5%) for Vivaer® (Aerin), 2 (8.3%) for Neuromark® (Neurent), and 8 (33.3%) for Rhinaer® (Aerin). Seven (63.6%) of the Celon®-related complications were related to tissue necrosis (largely user error-related), but 1 (9.1%) episode of pediatric ocular palsy was also reported. Vivaer® complications included synechiae formation, a mucosal perforation, and a case of empty nose syndrome. Of the posterior nasal nerve ablating devices, 9 of 10 AEs were epistaxes, of which 7 (77.8%) required operative intervention. Surgeons should exercise vigilance and tissue-appropriate device settings when utilizing RF devices. Epistaxis and tissue necrosis may occur, as well as more rare, but devastating, complications.
Published: 2024

9. IDSL.GOA: gene ontology analysis for interpreting metabolomic datasets.

Author: Mahajan, Priyanka, Fiehn, Oliver, and Barupal, Dinesh
Subjects: Female, Humans, Gene Ontology, Metabolomics, Proteins, Databases, Factual, Computational Biology
Abstract: Biological interpretation of metabolomic datasets often ends at a pathway analysis step to find the over-represented metabolic pathways in the list of statistically significant metabolites. However, definitions of biochemical pathways and metabolite coverage vary among different curated databases, leading to missed interpretations. For the lists of genes, transcripts and proteins, Gene Ontology (GO) terms over-presentation analysis has become a standardized approach for biological interpretation. But, GO analysis has not been achieved for metabolomic datasets. We present a new knowledgebase (KB) and the online tool, Gene Ontology Analysis by the Integrated Data Science Laboratory for Metabolomics and Exposomics (IDSL.GOA) to conduct GO over-representation analysis for a metabolite list. The IDSL.GOA KB covers 2393 metabolic GO terms and associated 3144 genes, 1,492 EC annotations, and 2621 metabolites. IDSL.GOA analysis of a case study of older versus young female brain cortex metabolome highlighted 82 GO terms being significantly overrepresented (FDR
Published: 2024

10. LIPID MAPS: update to databases and tools for the lipidomics community.

Author: Conroy, Matthew, Andrews, Robert, Andrews, Simon, Cockayne, Lauren, Dennis, Edward, Fahy, Eoin, Gaud, Caroline, Griffiths, William, Jukes, Geoff, Kolchin, Maksim, Mendivelso, Karla, Lopez-Clavijo, Andrea, Ready, Caroline, Subramaniam, Shankar, and ODonnell, Valerie
Subjects: Databases, Factual, Lipid Metabolism, Lipidomics, Lipids, Software
Abstract: LIPID MAPS (LIPID Metabolites and Pathways Strategy), www.lipidmaps.org, provides a systematic and standardized approach to organizing lipid structural and biochemical data. Founded 20 years ago, the LIPID MAPS nomenclature and classification has become the accepted community standard. LIPID MAPS provides databases for cataloging and identifying lipids at varying levels of characterization in addition to numerous software tools and educational resources, and became an ELIXIR-UK data resource in 2020. This paper describes the expansion of existing databases in LIPID MAPS, including richer metadata with literature provenance, taxonomic data and improved interoperability to facilitate FAIR compliance. A joint project funded by ELIXIR-UK, in collaboration with WikiPathways, curates and hosts pathway data, and annotates lipids in the context of their biochemical pathways. Updated features of the search infrastructure are described along with implementation of programmatic access via API and SPARQL. New lipid-specific databases have been developed and provision of lipidomics tools to the community has been updated. Training and engagement have been expanded with webinars, podcasts and an online training school.
Published: 2024

11. The Monarch Initiative in 2024: an analytic platform integrating phenotypes, genes and diseases across species

Author: Putman, Tim E, Schaper, Kevin, Matentzoglu, Nicolas, Rubinetti, Vincent P, Alquaddoomi, Faisal S, Cox, Corey, Caufield, J Harry, Elsarboukh, Glass, Gehrke, Sarah, Hegde, Harshad, Reese, Justin T, Braun, Ian, Bruskiewich, Richard M, Cappelletti, Luca, Carbon, Seth, Caron, Anita R, Chan, Lauren E, Chute, Christopher G, Cortes, Katherina G, De Souza, Vinícius, Fontana, Tommaso, Harris, Nomi L, Hartley, Emily L, Hurwitz, Eric, Jacobsen, Julius OB, Krishnamurthy, Madan, Laraway, Bryan J, McLaughlin, James A, McMurry, Julie A, Moxon, Sierra AT, Mullen, Kathleen R, O’Neil, Shawn T, Shefchek, Kent A, Stefancsik, Ray, Toro, Sabrina, Vasilevsky, Nicole A, Walls, Ramona L, Whetzel, Patricia L, Osumi-Sutherland, David, Smedley, Damian, Robinson, Peter N, Mungall, Christopher J, Haendel, Melissa A, and Munoz-Torres, Monica C
Subjects: Biological Sciences, Genetics, Networking and Information Technology R&D (NITRD), Data Science, Good Health and Well Being, Humans, Phenotype, Internet, Databases, Factual, Software, Genes, Disease, Environmental Sciences, Information and Computing Sciences, Developmental Biology, Biological sciences, Chemical sciences, Environmental sciences
Abstract: Bridging the gap between genetic variations, environmental determinants, and phenotypic outcomes is critical for supporting clinical diagnosis and understanding mechanisms of diseases. It requires integrating open data at a global scale. The Monarch Initiative advances these goals by developing open ontologies, semantic data models, and knowledge graphs for translational research. The Monarch App is an integrated platform combining data about genes, phenotypes, and diseases across species. Monarch's APIs enable access to carefully curated datasets and advanced analysis tools that support the understanding and diagnosis of disease for diverse applications such as variant prioritization, deep phenotyping, and patient profile-matching. We have migrated our system into a scalable, cloud-based infrastructure; simplified Monarch's data ingestion and knowledge graph integration systems; enhanced data mapping and integration standards; and developed a new user interface with novel search and graph navigation features. Furthermore, we advanced Monarch's analytic tools by developing a customized plugin for OpenAI's ChatGPT to increase the reliability of its responses about phenotypic data, allowing us to interrogate the knowledge in the Monarch graph using state-of-the-art Large Language Models. The resources of the Monarch Initiative can be found at monarchinitiative.org and its corresponding code repository at github.com/monarch-initiative/monarch-app.
Published: 2024

12. NMPFamsDB: a database of novel protein families from microbial metagenomes and metatranscriptomes

Author: Baltoumas, Fotis A, Karatzas, Evangelos, Liu, Sirui, Ovchinnikov, Sergey, Sofianatos, Yorgos, Chen, I-Min, Kyrpides, Nikos C, and Pavlopoulos, Georgios A
Subjects: Biochemistry and Cell Biology, Bioinformatics and Computational Biology, Biological Sciences, Amino Acid Sequence, Databases, Factual, Databases, Protein, Ecosystem, Metagenome, Proteins, Geography, Environmental Sciences, Information and Computing Sciences, Developmental Biology, Biological sciences, Chemical sciences, Environmental sciences
Abstract: The Novel Metagenome Protein Families Database (NMPFamsDB) is a database of metagenome- and metatranscriptome-derived protein families, whose members have no hits to proteins of reference genomes or Pfam domains. Each protein family is accompanied by multiple sequence alignments, Hidden Markov Models, taxonomic information, ecosystem and geolocation metadata, sequence and structure predictions, as well as 3D structure models predicted with AlphaFold2. In its current version, NMPFamsDB hosts over 100 000 protein families, each with at least 100 members. The reported protein families significantly expand (more than double) the number of known protein sequence clusters from reference genomes and reveal new insights into their habitat distribution, origins, functions and taxonomy. We expect NMPFamsDB to be a valuable resource for microbial proteome-wide analyses and for further discovery and characterization of novel functions. NMPFamsDB is publicly available in http://www.nmpfamsdb.org/ or https://bib.fleming.gr/NMPFamsDB.
Published: 2024

13. Real-World Complications of the SpaceOAR Hydrogel Spacer: A Review of the Manufacturer and User Facility Device Experience Database

Author: Fernandez, Adrian M, Jones, Charles P, Patel, Hiren V, Ghaffar, Umar, Hakam, Nizar, Li, Kevin D, Nabavizadeh, Behnam, and Breyer, Benjamin N
Subjects: Biomedical and Clinical Sciences, Clinical Sciences, Infectious Diseases, Patient Safety, Bioengineering, Humans, United States, Hydrogels, Intestines, Databases, Factual, United States Food and Drug Administration, prostate cancer include active surveillance, surgery, tar, dysfunction, urethral stricture disease, rectourethral fis, Urology & Nephrology, Clinical sciences
Abstract: ObjectiveTo characterize adverse events related to use of the perirectal spacing agent SpaceOAR, we examined the Manufacturer and User Facility Device Experience (MAUDE) database.MethodsThe MAUDE database was queried for "SpaceOAR" and "Augmenix" from June 2015 (when SpaceOAR was approved by the Food and Drug Administration) to October 2022. Reports were reviewed for adverse events (AEs), operative procedures performed because of the AE, and changes to the radiation plan. AEs were categorized using Common Terminology Criteria for Adverse Events (CTCAE), version 5.0.ResultsSix hundred fifty-four reports were reviewed. Eighty-four were excluded and 4 reports reviewed 2 separate cases of SpaceOAR administration. Five hundred seventy-four cases were ultimately included. Three deaths were reported (0.5% of all AEs). One point six percent of cases represented CTCAE grade 4 injuries (life-threatening consequences; urgent intervention indicated), 15.9% grade 3 (severe but not immediately life-threatening; hospitalization), 24.2% grade 2 (moderate; local/noninvasive intervention), and 57% of events were CTCAE grade 1 (mild; asymptomatic or mild symptoms). Bowel diversion occurred in 29 cases (9%).ConclusionBoth asymptomatic (n = 311) and debilitating (n = 12) complications of SpaceOAR hydrogel use were identified. Death, gel embolization, anaphylaxis, rectal ulcerations, and infections requiring bowel or urinary diversions were among the complications reviewed. Providers should consider these potential complications before perirectal spacer administration and during patient counseling.
Published: 2024

14. Outcomes of Kidney Transplants From Toxoplasma-Positive Donors: An Organ Procurement and Transplant Network Database Analysis.

Author: Butani, Lavjay and Tancredi, Daniel
Subjects: infection, kidney, outcomes, survival analysis, toxoplasma, Humans, Kidney Transplantation, Toxoplasma, Male, Graft Survival, Toxoplasmosis, Tissue Donors, Female, Middle Aged, Tissue and Organ Procurement, Adult, Databases, Factual, Graft Rejection, Treatment Outcome, Antibodies, Protozoan
Abstract: There is a need to reconsider the acceptance of organs from donors considered suboptimal, in the absence of data. Toxoplasma antibody-positive donors (TPD) constitute one such group. The objective of our study was to compare graft survival in deceased donor renal transplant (Tx) recipients, stratified by Toxoplasma IgG status, using the Organ Procurement and Transplantation Network (OPTN) database. A log-linear event history regression model for graft failure categorized by Toxoplasma IgG status, adjusting for confounders was applied to first kidney-only Tx recipients from 2018 to 2022. Of the 51,422 Tx, 4,317 (8.4%) were from TPD. Acute rejection and graft failure (5% each) were similar between groups. Crude graft failure was 7.3 failures per 100 person-years for TPD recipients compared to 6.5 failures per 100 person-years for the Toxoplasma-negative group (p 0.008). The crude failure rate ratio was 1.14 with an adjusted hazard rate ratio of 1.04 (95% CI: 0.94, 1.15, p 0.39). In renal Tx recipients, TPD graft recipients have comparable survival to Tx from Toxoplasma-negative recipients. While caution and close monitoring of recipients post-Tx for surveillance of disseminated toxoplasmosis are still warranted, our study suggests that patients can be successfully managed using TPD organs.
Published: 2024

15. Understanding the value of curation: A survey of US data repository curation practices and perceptions.

Author: Johnston, Lisa, Curty, Renata, Braxton, Susan, Carlson, Jake, Hadley, Hannah, Lafferty-Hess, Sophia, Luong, Hoa, Petters, Jonathan, and Kozlowski, Wendy
Subjects: Humans, Data Curation, Surveys and Questionnaires, United States, Information Dissemination, Data Accuracy, Databases, Factual, Reproducibility of Results
Abstract: Data curators play an important role in assessing data quality and take actions that may ultimately lead to better, more valuable data products. This study explores the curation practices of data curators working within US-based data repositories. We performed a survey in January 2021 to benchmark the levels of curation performed by repositories and assess the perceived value and impact of curation on the data sharing process. Our analysis included 95 responses from 59 unique data repositories. Respondents primarily were professionals working within repositories and examined curation performed within a repository setting. A majority 72.6% of respondents reported that data-level curation was performed by their repository and around half reported their repository took steps to ensure interoperability and reproducibility of their repositorys datasets. Curation actions most frequently reported include checking for duplicate files, reviewing documentation, reviewing metadata, minting persistent identifiers, and checking for corrupt/broken files. The most value-add curation action across generalist, institutional, and disciplinary repository respondents was related to reviewing and enhancing documentation. Respondents reported high perceived impact of curation by their repositories on specific data sharing outcomes including usability, findability, understandability, and accessibility of deposited datasets; respondents associated with disciplinary repositories tended to perceive higher impact on most outcomes. Most survey participants strongly agreed that data curation by the repository adds value to the data sharing process and that it outweighs the effort and cost. We found some differences between institutional and disciplinary repositories, both in the reported frequency of specific curation actions as well as the perceived impact of data curation. Interestingly, we also found variation in the perceptions of those working within the same repository regarding the level and frequency of curation actions performed, which exemplifies the complexity of a repository curation work. Our results suggest data curation may be better understood in terms of specific curation actions and outcomes than broadly defined curation levels and that more research is needed to understand the resource implications of performing these activities. We share these results to provide a more nuanced view of curation, and how curation impacts the broader data lifecycle and data sharing behaviors.
Published: 2024

16. Application of the Sepsis-3 criteria to describe sepsis epidemiology in the Amsterdam UMCdb intensive care dataset.

Author: Williams, Christopher, Edinburgh, Tom, Elbers, Paul, Thoral, Patrick, and Ercole, Ari
Subjects: Humans, Netherlands, Male, Female, Middle Aged, Sepsis, Aged, Intensive Care Units, Organ Dysfunction Scores, Adult, Hospital Mortality, Databases, Factual, Shock, Septic, Critical Care, Anti-Bacterial Agents
Abstract: INTRODUCTION: Sepsis is a major cause of morbidity and mortality worldwide. In the updated, 2016 Sepsis-3 criteria, sepsis is defined as life-threatening organ dysfunction caused by a dysregulated host response to infection, where organ dysfunction can be represented by an increase in the Sequential Organ Failure Assessment (SOFA) score of 2 points or more. We sought to apply the Sepsis-3 criteria to characterise the septic cohort in the Amsterdam University Medical Centres database (Amsterdam UMCdb). METHODS: We examined adult intensive care unit (ICU) admissions in the Amsterdam UMCdb, which contains de-identified data for patients admitted to a mixed surgical-medical ICU at a tertiary academic medical centre in the Netherlands. We operationalised the Sepsis-3 criteria, defining organ dysfunction as an increase in the SOFA score of 2 points or more, while infection was defined as a new course of antibiotics or an escalation in antibiotic therapy, with at least one antibiotic given intravenously. Patients with sepsis were determined to be in septic shock if they additionally required the use of vasopressors and had a lactate level >2 mmol/L. RESULTS: We identified 18,221 ICU admissions from 16,408 patients in our cohort. There were 6,312 unique sepsis episodes, of which 30.2% met the criteria for septic shock. A total of 4,911/6,312 sepsis (77.8%) episodes occurred on ICU admission. Forty-seven percent of emergency medical admissions and 36.7% of emergency surgical admissions were for sepsis. Overall, there was a 12.5% ICU mortality rate; patients with septic shock had a higher ICU mortality rate (38.4%) than those without shock (11.4%). CONCLUSIONS: We successfully operationalised the Sepsis-3 criteria to the Amsterdam UMCdb, allowing the characterization and comparison of sepsis epidemiology across different centres.
Published: 2024

17. Efficacy and Safety of IDegAsp in a Real-World Korean Population with Type 2 Diabetes Mellitus.

Author: Shinae Kang, Yu-Bae Ahn, Tae Keun Oh, Won-Young Lee, Sung Wan Chun, Boram Bae, Amine Dahaoui, Jin Sook Jeong, Sungeun Jung, and Hak Chul Jang
Subjects: *TYPE 2 diabetes, *GLYCEMIC control, *INSULIN aspart, *GLYCOSYLATED hemoglobin, *DIABETES
Abstract: Background: This study investigated the real-world efficacy and safety of insulin degludec/insulin aspart (IDegAsp) in Korean adults with type 2 diabetes mellitus (T2DM), whose insulin treatment was switched to IDegAsp. Methods: This was a multicenter, retrospective, observational study comprising two 26-week treatment periods, before and after switching to IDegAsp, respectively. Korean adults with uncontrolled T2DM treated with basal or premix insulin (±oral antidiabetic drugs) were enrolled. The primary objective was to compare the degree of glycosylated hemoglobin (HbA1c) change in each 26-week observation period. The analyses included changes in HbA1c, fasting plasma glucose (FPG), body weight, proportion of participants achieving HbA1c <7.0%, hypoglycemic events, and total daily insulin dose (ClinicalTrials.gov, number NCT04656106). Results: In total, 196 adults (mean age, 65.95 years; mean T2DM duration, 18.99 years) were analyzed. The change in both HbA1c and FPG were significantly different between the pre-switching and the post-switching period (0.28% vs. -0.51%, P<0.001; 5.21 mg/dL vs. -23.10 mg/dL, P=0.005), respectively. After switching, the rate of achieving HbA1c <7.0% was significantly improved (5.10% at baseline vs. 11.22% with IDegAsp, P=0.012). No significant differences (before vs. after switching) were observed in body weight change, and total daily insulin dose. The rates of overall and severe hypoglycemia were similar in the two periods. Conclusion: In real-world clinical practice in Korea, the change of insulin regimen to IDegAsp was associated with an improvement in glycemic control without increase of hypoglycemia, supporting the use of IDegAsp for patients with T2DM uncontrolled with basal or premix insulin. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

18. The Concise Guide to PHARMACOLOGY 2023/24: Ion channels.

Author: Alexander, Stephen, Mathie, Alistair, Peters, John, Veale, Emma, Striessnig, Jörg, Kelly, Eamonn, Armstrong, Jane, Faccenda, Elena, Harding, Simon, Davies, Jamie, Aldrich, Richard, Attali, Bernard, Baggetta, Austin, Becirovic, Elvir, Biel, Martin, Bill, Roslyn, Caceres, Ana, Catterall, William, Conner, Alex, Davies, Paul, De Clerq, Katrien, Delling, Markus, Di Virgilio, Francesco, Falzoni, Simonetta, Fenske, Stefanie, Fortuny-Gomez, Anna, Fountain, Samuel, George, Chandy, Goldstein, Steve, Grimm, Christian, Grissmer, Stephan, Ha, Kotdaji, Hammelmann, Verena, Hanukoglu, Israel, Hu, Meiqin, Ijzerman, Ad, Jabba, Sairam, Jarvis, Mike, Jensen, Anders, Jordt, Sven, Kaczmarek, Leonard, Kellenberger, Stephan, Kennedy, Charles, King, Brian, Kitchen, Philip, Liu, Qiang, Lynch, Joseph, Meades, Jessica, Mehlfeld, Verena, Nicke, Annette, Offermanns, Stefan, Perez-Reyes, Edward, Plant, Leigh, Rash, Lachlan, Ren, Dejian, Salman, Mootaz, Sieghart, Werner, Sivilotti, Lucia, Smart, Trevor, Snutch, Terrance, Tian, Jinbin, Trimmer, James, Van den Eynde, Charlotte, Vriens, Joris, Wei, Aguan, Winn, Brenda, Wulff, Heike, Xu, Haoxing, Yang, Fan, Fang, Wei, Yue, Lixia, Zhang, Xiaoli, and Zhu, Michael
Subjects: Humans, Databases, Pharmaceutical, Ion Channels, Ligands, Receptors, G-Protein-Coupled, Databases, Factual, Pharmacology
Abstract: The Concise Guide to PHARMACOLOGY 2023/24 is the sixth in this series of biennial publications. The Concise Guide provides concise overviews, mostly in tabular format, of the key properties of approximately 1800 drug targets, and over 6000 interactions with about 3900 ligands. There is an emphasis on selective pharmacology (where available), plus links to the open access knowledgebase source of drug targets and their ligands (https://www.guidetopharmacology.org/), which provides more detailed views of target and ligand properties. Although the Concise Guide constitutes almost 500 pages, the material presented is substantially reduced compared to information and links presented on the website. It provides a permanent, citable, point-in-time record that will survive database updates. The full contents of this section can be found at http://onlinelibrary.wiley.com/doi/10.1111/bph.16178. Ion channels are one of the six major pharmacological targets into which the Guide is divided, with the others being: G protein-coupled receptors, nuclear hormone receptors, catalytic receptors, enzymes and transporters. These are presented with nomenclature guidance and summary information on the best available pharmacological tools, alongside key references and suggestions for further reading. The landscape format of the Concise Guide is designed to facilitate comparison of related targets from material contemporary to mid-2023, and supersedes data presented in the 2021/22, 2019/20, 2017/18, 2015/16 and 2013/14 Concise Guides and previous Guides to Receptors and Channels. It is produced in close conjunction with the Nomenclature and Standards Committee of the International Union of Basic and Clinical Pharmacology (NC-IUPHAR), therefore, providing official IUPHAR classification and nomenclature for human drug targets, where appropriate.
Published: 2023

19. Pharmacokinetic analysis across studies to drive knowledge-integration: A tutorial on individual patient data meta-analysis (IPDMA).

Author: Solans, Belén, van Wijk, Rob, Imperial, Marjorie, and Savic, Radojka
Subjects: Humans, Databases, Factual, Pharmacokinetics, Drug Development, Meta-Analysis as Topic
Abstract: Answering challenging questions in drug development sometimes requires pharmacokinetic (PK) data analysis across different studies, for example, to characterize PKs across diverse regions or populations, or to increase statistical power for subpopulations by combining smaller size trials. Given the growing interest in data sharing and advanced computational methods, knowledge integration based on multiple data sources is increasingly applied in the context of model-informed drug discovery and development. A powerful analysis method is the individual patient data meta-analysis (IPDMA), leveraging systematic review of databases and literature, with the most detailed data type of the individual patient, and quantitative modeling of the PK processes, including capturing heterogeneity of variance between studies. The methodology that should be used in IPDMA in the context of population PK analysis is summarized in this tutorial, highlighting areas of special attention compared to standard PK modeling, including hierarchical nested variability terms for interstudy variability, and handling between-assay differences in limits of quantification within a single analysis. This tutorial is intended for any pharmacological modeler who is interested in performing an integrated analysis of PK data across different studies in a systematic and thorough manner, to answer questions that transcend individual primary studies.
Published: 2023

20. Why the Early Paleozoic was intrinsically prone to marine extinction.

Author: Pohl, Alexandre, Stockey, Richard, Dai, Xu, Yohler, Ryan, Le Hir, Guillaume, Hülse, Dominik, Brayard, Arnaud, Finnegan, Seth, and Ridgwell, Andrew
Subjects: Animals, Biodiversity, Climate, Databases, Factual, Earth, Planet, Extinction, Biological
Abstract: The geological record of marine animal biodiversity reflects the interplay between changing rates of speciation versus extinction. Compared to mass extinctions, background extinctions have received little attention. To disentangle the different contributions of global climate state, continental configuration, and atmospheric oxygen concentration (pO2) to variations in background extinction rates, we drive an animal physiological model with the environmental outputs from an Earth system model across intervals spanning the past 541 million years. We find that climate and continental configuration combined to make extinction susceptibility an order of magnitude higher during the Early Paleozoic than during the rest of the Phanerozoic, consistent with extinction rates derived from paleontological databases. The high extinction susceptibility arises in the model from the limited geographical range of marine organisms. It stands even when assuming present-day pO2, suggesting that increasing oxygenation through the Paleozoic is not necessary to explain why extinction rates apparently declined with time.
Published: 2023

21. Effect of no cost sharing for paediatric care on healthcare usage by household income levels: regression discontinuity design.

Author: Fukuma, Shingo, Kato, Hirotaka, Takaku, Reo, and Tsugawa, Yusuke
Subjects: economics, health economics, health policy, public health, quality in health care, Child, Humans, Ambulatory Care, Cost Sharing, Databases, Factual, Elasticity, Health Facilities
Abstract: OBJECTIVES: To investigate the impact of no cost sharing on paediatric care on usage and health outcomes, and whether the effect varies by household income levels. DESIGN: Regression discontinuity design. SETTING: Nationwide medical claims database in Japan. PARTICIPANTS: Children aged younger than 20 years from April 2018 to March 2022. EXPOSURE: Co-insurance rate that increases sharply from 0% to 30% at a certain age threshold (the threshold age varies between 6 and 20 years depending on region). PRIMARY OUTCOME MEASURES: The outpatient care usage (outpatient visit days and healthcare spending for outpatient care) and inpatient care (experience of any hospitalisation and healthcare spending for inpatient care). RESULTS: Of 244 549 children, 49 556 participants were in the bandwidth and thus included in our analyses. Results from the regression discontinuity analysis indicate that no cost sharing was associated with a significant increase in the number of outpatient visit days (+5.26 days; 95% CI, +4.89 to +5.82; p
Published: 2023

22. Serially Combining Epidemiological Designs Does Not Improve Overall Signal Detection in Vaccine Safety Surveillance.

Author: Arshad, Faaizah, Schuemie, Martijn, Bu, Fan, Minty, Evan, Alshammari, Thamir, Lai, Lana, Duarte-Salles, Talita, Fortin, Stephen, Nyberg, Fredrik, Ryan, Patrick, Hripcsak, George, Prieto-Alhambra, Daniel, and Suchard, Marc
Subjects: Humans, Vaccines, Sensitivity and Specificity, Research Design, Databases, Factual, Electronic Health Records
Abstract: INTRODUCTION: Vaccine safety surveillance commonly includes a serial testing approach with a sensitive method for signal generation and specific method for signal validation. The extent to which serial testing in real-world studies improves or hinders overall performance in terms of sensitivity and specificity remains unknown. METHODS: We assessed the overall performance of serial testing using three administrative claims and one electronic health record database. We compared type I and II errors before and after empirical calibration for historical comparator, self-controlled case series (SCCS), and the serial combination of those designs against six vaccine exposure groups with 93 negative control and 279 imputed positive control outcomes. RESULTS: The historical comparator design mostly had fewer type II errors than SCCS. SCCS had fewer type I errors than the historical comparator. Before empirical calibration, the serial combination increased specificity and decreased sensitivity. Type II errors mostly exceeded 50%. After empirical calibration, type I errors returned to nominal; sensitivity was lowest when the methods were combined. CONCLUSION: While serial combination produced fewer false-positive signals compared with the most specific method, it generated more false-negative signals compared with the most sensitive method. Using a historical comparator design followed by an SCCS analysis yielded decreased sensitivity in evaluating safety signals relative to a one-stage SCCS approach. While the current use of serial testing in vaccine surveillance may provide a practical paradigm for signal identification and triage, single epidemiological designs should be explored as valuable approaches to detecting signals.
Published: 2023

23. Direct medical charges of all parties in teen-involved vehicle crashes by culpability.

Author: Zhang, Ling, Hamann, Cara, ONeal, Elizabeth, Yang, Jingzhen, and Peek-Asa, Corinne
Subjects: culpability, teen drivers, vehicle crashes, Humans, Adolescent, Accidents, Traffic, Hospitalization, Databases, Factual, Emergency Service, Hospital, Automobile Driving
Abstract: BACKGROUND: Motor vehicle crashes among teen drivers often involve passengers in the teens vehicle and occupants of other vehicles, and the full cost burden for all individuals is largely unknown. This analysis estimated direct hospitalisation and emergency department charges for teen-involved crashes by teen culpability, comparing charges for the teen driver, passengers and occupants of other vehicles. METHODS: Probabilistic linkage was performed to link the Iowa police crash reports with Iowa emergency department and Iowa hospital inpatient data. Teen drivers aged 14-17 involved in a crash from 2016 through 2020 were included. Teen culpability was determined through the crash report and examined by teen and crash characteristics. Direct medical charges were estimated from charges through linkage to the Iowa hospital inpatient and the Iowa emergency department databases. RESULTS: Among the 28 062 teen drivers involved in vehicle crashes in Iowa between 2016 and 2020, 62.1% were culpable and 37.9% were not culpable. For all parties involved, the inpatient charges were $20.5 million in culpable crashes and $7.2 million in non-culpable crashes. The emergency department charges were $18.7 million in teen culpable crashes and $6.8 million in teen non-culpable crashes. Of the $20.5 million total inpatient charges in which a teen driver was culpable, charges of $9.5 million (46.3%) were for the injured teen driver and $11.0 million (53.7%) for other involved parties. CONCLUSIONS: Culpable teen-involved crashes lead to higher proportions of injury and higher medical charges, with most of these charges covering other individuals in the crash.
Published: 2023

24. A Systematic Review of Specialty Courts in the United States for Adolescents Impacted by Commercial Sexual Exploitation

Author: Godoy, Sarah M, Perris, Georgia E, Thelwell, Mikiko, Osuna-Garcia, Antonia, Barnert, Elizabeth, Bacharach, Amy, and Bath, Eraka P
Subjects: Clinical and Health Psychology, Human Society, Criminology, Psychology, Social Work, Pediatric, Peace, Justice and Strong Institutions, Female, Humans, United States, Adolescent, Substance-Related Disorders, Sexual Behavior, Recidivism, Databases, Factual, treatment, intervention, child abuse, prostitution, sex work, adolescent victims, sexual assault, prostitution/sex work, treatment/intervention, Law, Social work, Clinical and health psychology
Abstract: Nationwide efforts to enhance services for adolescents experiencing commercial sexual exploitation (CSE) in the judicial system have led to the emergence of specialty courts, including human trafficking and girls' courts. Given that prior research has documented competing stances on the effectiveness of specialty courts for CSE-impacted populations, we conducted a systematic review of the literature to identify key characteristics of programming, profiles of adolescents served, and effectiveness of these courts. To identify relevant research and information, we systematically searched scholarly databases and information sources, conducted reference harvesting, and forwarded citation chaining. Articles presenting primary data with quantitative, qualitative, or mixed methodologies or programmatic descriptions of specialty courts serving adolescents at risk or with confirmed histories of CSE that were published after 2004 were included. We identified 39 articles on 21 specialty courts serving adolescents at risk or with confirmed histories of CSE, including seven specialty courts with evaluation or outcome data. Across specialty courts, adolescents benefited from an increase in linkage to specialized services, improved residential placement stability, and reduction in recidivism-measured by new criminal charges. Specialty court participation was also associated with improved educational outcomes and decreased instances of running away. A lack of empirical data, specifically of evaluation studies, emerged as a weakness in the literature. Still, findings support that specialty courts can be an integral judicial system response to CSE. Multidisciplinary collaboration can help target and respond to the multifaceted needs of adolescents, encourage healthy behaviors, and promote their overall wellness.
Published: 2023

25. Guidelines for public database submission of uncultivated virus genome sequences for taxonomic classification

Author: Adriaenssens, Evelien M, Roux, Simon, Brister, J Rodney, Karsch-Mizrachi, Ilene, Kuhn, Jens H, Varsani, Arvind, Yigang, Tong, Reyes, Alejandro, Lood, Cédric, Lefkowitz, Elliot J, Sullivan, Matthew B, Edwards, Robert A, Simmonds, Peter, Rubino, Luisa, Sabanadzovic, Sead, Krupovic, Mart, and Dutilh, Bas E
Subjects: Microbiology, Biological Sciences, Bioinformatics and Computational Biology, Genetics, Genome, Viral, Phylogeny, Databases, Factual
Published: 2023

26. Preoperative medical assessment for adult spinal deformity surgery: a state-of-the-art review.

Author: Arora, Ayush, Cummins, Daniel D, Wague, Aboubacar, Mendelis, Joseph, Samtani, Rahul, McNeill, Ian, Theologis, Alekos A, Mummaneni, Praveen V, and Berven, Sigurd
Subjects: Spine, Humans, Postoperative Complications, Neurosurgical Procedures, Risk Factors, Databases, Factual, Adult, Complications, Deformity, Frailty, Optimization, Quality of care, Risk factors, Patient Safety, Clinical Research, Prevention, Mental Health, 7.3 Management and decision making, Management of diseases and conditions, Good Health and Well Being, Biomedical Engineering, Clinical Sciences
Abstract: IntroductionThe purpose of this study is to provide a state-of-the-art review regarding risk factors for perioperative complications in adult spinal deformity (ASD) surgery. The review includes levels of evidence for risk factors associated with complications in ASD surgery.MethodsUsing the PubMed database, we searched for complications, risk factors, and adult spinal deformity. The included publications were assessed for level of evidence as described in clinical practice guidelines published by the North American Spine Society, with summary statements generated for each risk factor (Bono et al. in Spine J 9:1046-1051, 2009).ResultsFrailty had good evidence (Grade A) as a risk for complications in ASD patients. Fair evidence (Grade B) was assigned for bone quality, smoking, hyperglycemia and diabetes, nutritional status, immunosuppression/steroid use, cardiovascular disease, pulmonary disease, and renal disease. Indeterminate evidence (Grade I) was assigned for pre-operative cognitive function, mental health, social support, and opioid utilization.ConclusionsIdentification of risk factors for perioperative complications in ASD surgery is a priority for empowering informed choices for patients and surgeons and managing patient expectations. Risk factors with grade A and B evidence should be identified prior to elective surgery and modified to reduce the risk of perioperative complications.
Published: 2023

27. Analysis of Female Participant Representation in Registered Oncology Clinical Trials in the United States from 2008 to 2020

Author: Perera, Nirosha D, Bellomo, Tiffany R, Schmidt, Walker M, Litt, Henry K, Shyu, Margaret, Stavins, MaKenna A, Wang, Max M, Bell, Alexander, Saleki, Massoud, Wolf, Katherine I, Ionescu, Ruxandra, Tao, Jacqueline J, Ji, Sunjong, O’Keefe, Ryan M, Pun, Matthew, Takasugi, Jordan M, Steinberg, Jecca R, Go, Ronald S, Turner, Brandon E, and Mahipal, Amit
Subjects: Biomedical and Clinical Sciences, Clinical Sciences, Clinical Trials and Supportive Activities, Cancer, Digestive Diseases, Clinical Research, Male, Humans, Female, United States, Neoplasms, Medical Oncology, Odds Ratio, Databases, Factual, Prevalence, oncology clinical trials, female participant representation, participation to prevalence ratio, Oncology and Carcinogenesis, Oncology & Carcinogenesis, Oncology and carcinogenesis
Abstract: BackgroundFemale underrepresentation in oncology clinical trials can result in outcome disparities. We evaluated female participant representation in US oncology trials by intervention type, cancer site, and funding.Materials and methodsData were extracted from the publicly available Aggregate Analysis of ClinicalTrials.gov database. Initially, 270,172 studies were identified. Following the exclusion of trials using Medical Subject Heading terms, manual review, those with incomplete status, non-US location, sex-specific organ cancers, or lacking participant sex data, 1650 trials consisting of 240,776 participants remained. The primary outcome was participation to prevalence ratio (PPR): percent females among trial participants divided by percent females in the disease population per US Surveillance, Epidemiology, and End Results Program data. PPRs of 0.8-1.2 reflect proportional female representation.ResultsFemales represented 46.9% of participants (95% CI, 45.4-48.4); mean PPR for all trials was 0.912. Females were underrepresented in surgical (PPR 0.74) and other invasive (PPR 0.69) oncology trials. Among cancer sites, females were underrepresented in bladder (odds ratio [OR] 0.48, 95% CI 0.26-0.91, P = .02), head/neck (OR 0.44, 95% CI 0.29-0.68, P < .01), stomach (OR 0.40, 95% CI 0.23-0.70, P < .01), and esophageal (OR 0.40 95% CI 0.22-0.74, P < .01) trials. Hematologic (OR 1.78, 95% CI 1.09-1.82, P < .01) and pancreatic (OR 2.18, 95% CI 1.46-3.26, P < .01) trials had higher odds of proportional female representation. Industry-funded trials had greater odds of proportional female representation (OR 1.41, 95% CI 1.09-1.82, P = .01) than US government and academic-funded trials.ConclusionsStakeholders should look to hematologic, pancreatic, and industry-funded cancer trials as exemplars of female participant representation and consider female representation when interpreting trial results.
Published: 2023

28. Determination of Effect Sizes for Power Analysis for Microbiome Studies Using Large Microbiome Databases.

Author: Rahman, Gibraan, McDonald, Daniel, Gonzalez, Antonio, Vázquez-Baeza, Yoshiki, Jiang, Lingjing, Casals-Pascual, Climent, Hakim, Daniel, Dilmore, Amanda Hazel, Nowinski, Brent, Peddada, Shyamal, and Knight, Rob
Subjects: Software, Databases, Factual, Microbiota, Gastrointestinal Microbiome, bioinformatics, effect size, microbiome, statistics, Human Genome, Complementary and Integrative Health, Genetics, Networking and Information Technology R&D (NITRD)
Abstract: Herein, we present a tool called Evident that can be used for deriving effect sizes for a broad spectrum of metadata variables, such as mode of birth, antibiotics, socioeconomics, etc., to provide power calculations for a new study. Evident can be used to mine existing databases of large microbiome studies (such as the American Gut Project, FINRISK, and TEDDY) to analyze the effect sizes for planning future microbiome studies via power analysis. For each metavariable, the Evident software is flexible to compute effect sizes for many commonly used measures of microbiome analyses, including α diversity, β diversity, and log-ratio analysis. In this work, we describe why effect size and power analysis are necessary for computational microbiome analysis and show how Evident can help researchers perform these procedures. Additionally, we describe how Evident is easy for researchers to use and provide an example of efficient analyses using a dataset of thousands of samples and dozens of metadata categories.
Published: 2023

29. Mapping the common gene networks that underlie related diseases

Author: Rosenthal, Sara Brin, Wright, Sarah N, Liu, Sophie, Churas, Christopher, Chilin-Fuentes, Daisy, Chen, Chi-Hua, Fisch, Kathleen M, Pratt, Dexter, Kreisberg, Jason F, and Ideker, Trey
Subjects: Biological Sciences, Bioinformatics and Computational Biology, Genetics, Biotechnology, Aetiology, 2.1 Biological and endogenous factors, Good Health and Well Being, Humans, Gene Regulatory Networks, Software, Databases, Factual, Computational Biology, Chemical Sciences, Medical and Health Sciences, Bioinformatics
Abstract: A longstanding goal of biomedicine is to understand how alterations in molecular and cellular networks give rise to the spectrum of human diseases. For diseases with shared etiology, understanding the common causes allows for improved diagnosis of each disease, development of new therapies and more comprehensive identification of disease genes. Accordingly, this protocol describes how to evaluate the extent to which two diseases, each characterized by a set of mapped genes, are colocalized in a reference gene interaction network. This procedure uses network propagation to measure the network 'distance' between gene sets. For colocalized diseases, the network can be further analyzed to extract common gene communities at progressive granularities. In particular, we show how to: (1) obtain input gene sets and a reference gene interaction network; (2) identify common subnetworks of genes that encompass or are in close proximity to all gene sets; (3) use multiscale community detection to identify systems and pathways represented by each common subnetwork to generate a network colocalized systems map; (4) validate identified genes and systems using a mouse variant database; and (5) visualize and further investigate select genes, interactions and systems for relevance to phenotype(s) of interest. We demonstrate the utility of this approach by identifying shared biological mechanisms underlying autism and congenital heart disease. However, this protocol is general and can be applied to any gene sets attributed to diseases or other phenotypes with suspected joint association. A typical NetColoc run takes less than an hour. Software and documentation are available at https://github.com/ucsd-ccbb/NetColoc .
Published: 2023

30. Perspective: A Comprehensive Evaluation of Data Quality in Nutrient Databases

Author: Li, Zhaoping, Forester, Shavawn, Jennings-Dobbs, Emily, and Heber, David
Subjects: Biomedical and Clinical Sciences, Nutrition and Dietetics, Nutrition, Zero Hunger, Humans, Data Accuracy, Nutritional Status, Nutrients, Food, Food Analysis, Databases, Factual, food composition data, nutrient data, data quality, essential nutrients, phytonutrients, omics, anthropometrics, human nutrition, precision nutrition, personalized dietary recommendations, Nutrition and dietetics
Abstract: Nutrient databases are a critical component of nutrition science and the basis of exciting new research in precision nutrition (PN). To identify the most critical components needed for improvement of nutrient databases, food composition data were analyzed for quality, with completeness being the most important measure, and for FAIRness, how well the data conformed with the data science criteria of findable, accessible, interoperable, and reusable (FAIR). Databases were judged complete if they provided data for all 15 nutrition fact panel (NFP) nutrient measures and all 40 National Academies of Sciences, Engineering, and Medicine (NASEM) essential nutrient measures for each food listed. Using the gold standard the USDA standard reference (SR) Legacy database as surrogate, it was found that SR Legacy data were not complete for either NFP or NASEM nutrient measures. In addition, phytonutrient measures in the 4 USDA Special Interest Databases were incomplete. To evaluate data FAIRness, a set of 175 food and nutrient data sources were collected from worldwide. Many opportunities were identified for improving data FAIRness, including creating persistent URLs, prioritizing usable data storage formats, providing Globally Unique Identifiers for all foods and nutrients, and implementing citation standards. This review demonstrates that despite important contributions from the USDA and others, food and nutrient databases in their current forms do not yet provide truly comprehensive food composition data. We propose that to enhance the quality and usage of food and nutrient composition data for research scientists and those fashioning various PN tools, the field of nutrition science must step out of its historical comfort zone and improve the foundational nutrient databases used in research by incorporating data science principles, the most central being data quality and data FAIRness.
Published: 2023

31. Reproducible variability: assessing investigator discordance across 9 research teams attempting to reproduce the same observational study.

Author: Ostropolets, Anna, Albogami, Yasser, Conover, Mitchell, Banda, Juan, Baumgartner, William, Blacketer, Clair, Desai, Priyamvada, DuVall, Scott, Fortin, Stephen, Gilbert, James, Golozar, Asieh, Ide, Joshua, Kanter, Andrew, Kern, David, Kim, Chungsoo, Lai, Lana, Li, Chenyu, Liu, Feifan, Lynch, Kristine, Minty, Evan, Neves, Maria, Ng, Ding, Obene, Tontel, Pera, Victor, Pratt, Nicole, Rao, Gowtham, Rappoport, Nadav, Reinecke, Ines, Saroufim, Paola, Shoaibi, Azza, Simon, Katherine, Swerdel, Joel, Voss, Erica, Weaver, James, Zhang, Linying, Hripcsak, George, Ryan, Patrick, and Suchard, Marc
Subjects: credibility, observational data, open science, reproducibility, Humans, Research Personnel, Databases, Factual
Abstract: OBJECTIVE: Observational studies can impact patient care but must be robust and reproducible. Nonreproducibility is primarily caused by unclear reporting of design choices and analytic procedures. This study aimed to: (1) assess how the study logic described in an observational study could be interpreted by independent researchers and (2) quantify the impact of interpretations variability on patient characteristics. MATERIALS AND METHODS: Nine teams of highly qualified researchers reproduced a cohort from a study by Albogami et al. The teams were provided the clinical codes and access to the tools to create cohort definitions such that the only variable part was their logic choices. We executed teams cohort definitions against the database and compared the number of subjects, patient overlap, and patient characteristics. RESULTS: On average, the teams interpretations fully aligned with the master implementation in 4 out of 10 inclusion criteria with at least 4 deviations per team. Cohorts size varied from one-third of the master cohort size to 10 times the cohort size (2159-63 619 subjects compared to 6196 subjects). Median agreement was 9.4% (interquartile range 15.3-16.2%). The teams cohorts significantly differed from the master implementation by at least 2 baseline characteristics, and most of the teams differed by at least 5. CONCLUSIONS: Independent research teams attempting to reproduce the study based on its free-text description alone produce different implementations that vary in the population size and composition. Sharing analytical code supported by a common data model and open-source tools allows reproducing a study unambiguously thereby preserving initial design choices.
Published: 2023

32. Malpractice Litigation Related to Diagnosis and Treatment of Intracranial Aneurysms.

Author: Bajaj, S, Payabvash, S, Wintermark, M, Matouk, C, Seidenwurm, D, Gandhi, D, Parizel, P, Mezrich, J, Malhotra, A, Khan, A, Khunte, M, and Wu, Xiao
Subjects: Humans, United States, Intracranial Aneurysm, Malpractice, Radiologists, Neurosurgeons, Databases, Factual
Abstract: BACKGROUND AND PURPOSE: Approaches to management of intracranial aneurysms are inconsistent, in part due to apprehension relating to potential malpractice claims. The purpose of this article was to review the causes of action underlying medical malpractice lawsuits related to the diagnosis and management of intracranial aneurysms and to identify the factors associated and their outcomes. MATERIALS AND METHODS: We consulted 2 large legal databases in the United States to search for cases in which there were jury awards and settlements related to the diagnosis and management of patients with intracranial aneurysms in the United States. Files were screened to include only those cases in which the cause of action involved negligence in the diagnosis and management of a patient with an intracranial aneurysm. RESULTS: Between 2000 and 2020, two hundred eighty-seven published case summaries were identified, of which 133 were eligible for inclusion in the analysis. Radiologists constituted 16% of 159 physicians sued in these lawsuits. Failure to diagnose was the most common medical malpractice claim referenced (100/133 cases), with the most common subgroups being failure to include cerebral aneurysm as a differential and thus perform adequate work-up (30 cases), and failure to correctly interpret aneurysm evidence on CT or MR imaging (16 cases). Only 6 of these 16 cases were adjudicated at trial, with 2 decided in favor of the plaintiff (awarded $4,000,000 and $43,000,000, respectively). CONCLUSIONS: Incorrect interpretation of imaging is relatively infrequent as a cause of malpractice litigation compared with failure to diagnose aneurysms in the clinical setting by neurosurgeons, emergency physicians, and primary care providers.
Published: 2023

33. Exploring the complexity and spectrum of racial/ethnic disparities in colon cancer management.

Author: Greenberg, Anya L, Brand, Nathan R, Zambeli-Ljepović, Alan, Barnes, Katherine E, Chiou, Sy Han, Rhoads, Kim F, Adam, Mohamed A, and Sarin, Ankit
Subjects: Humans, Colonic Neoplasms, Treatment Outcome, Databases, Factual, Health Services Accessibility, United States, Female, Male, Healthcare Disparities, Race Factors, Ethnicity, Hispanic or Latino, Native Hawaiian or Other Pacific Islander, Racial Groups, Black or African American, Asian, American Indian or Alaska Native, South Asian People, Southeast Asian People, East Asian People, Colon cancer, Colorectal surgery, Healthcare access, Racial disparities, Colo-Rectal Cancer, Prevention, Digestive Diseases, Cancer, Clinical Research, Good Health and Well Being, Public Health and Health Services, Sociology, Public Health
Abstract: BackgroundColorectal cancer is a leading cause of morbidity and mortality across U.S. racial/ethnic groups. Existing studies often focus on a particular race/ethnicity or single domain within the care continuum. Granular exploration of disparities among different racial/ethnic groups across the entire colon cancer care continuum is needed. We aimed to characterize differences in colon cancer outcomes by race/ethnicity across each stage of the care continuum.MethodsWe used the 2010-2017 National Cancer Database to examine differences in outcomes by race/ethnicity across six domains: clinical stage at presentation; timing of surgery; access to minimally invasive surgery; post-operative outcomes; utilization of chemotherapy; and cumulative incidence of death. Analysis was via multivariable logistic or median regression, with select demographics, hospital factors, and treatment details as covariates.Results326,003 patients (49.6% female, 24.0% non-White, including 12.7% Black, 6.1% Hispanic/Spanish, 1.3% East Asian, 0.9% Southeast Asian, 0.4% South Asian, 0.3% AIAE, and 0.2% NHOPI) met inclusion criteria. Relative to non-Hispanic White patients: Southeast Asian (OR 1.39, p
Published: 2023

34. NDEx IQuery: a multi-method network gene set analysis leveraging the Network Data Exchange

Author: Pillich, Rudolf T, Chen, Jing, Churas, Christopher, Fong, Dylan, Gyori, Benjamin M, Ideker, Trey, Karis, Klas, Liu, Sophie N, Ono, Keiichiro, Pico, Alexander, and Pratt, Dexter
Subjects: Biological Sciences, Bioinformatics and Computational Biology, Genetics, Biotechnology, Networking and Information Technology R&D (NITRD), Underpinning research, 1.5 Resources and infrastructure (underpinning), Generic health relevance, Computational Biology, Software, Protein Interaction Maps, Publications, Databases, Factual, Internet, Mathematical Sciences, Information and Computing Sciences, Bioinformatics, Biological sciences, Information and computing sciences, Mathematical sciences
Abstract: MotivationThe investigation of sets of genes using biological pathways is a common task for researchers and is supported by a wide variety of software tools. This type of analysis generates hypotheses about the biological processes that are active or modulated in a specific experimental context.ResultsThe Network Data Exchange Integrated Query (NDEx IQuery) is a new tool for network and pathway-based gene set interpretation that complements or extends existing resources. It combines novel sources of pathways, integration with Cytoscape, and the ability to store and share analysis results. The NDEx IQuery web application performs multiple gene set analyses based on diverse pathways and networks stored in NDEx. These include curated pathways from WikiPathways and SIGNOR, published pathway figures from the last 27 years, machine-assembled networks using the INDRA system, and the new NCI-PID v2.0, an updated version of the popular NCI Pathway Interaction Database. NDEx IQuery's integration with MSigDB and cBioPortal now provides pathway analysis in the context of these two resources.Availability and implementationNDEx IQuery is available at https://www.ndexbio.org/iquery and is implemented in Javascript and Java.
Published: 2023

35. A modular architecture for organizing, processing and sharing neurophysiology data

Author: Bonacchi, Niccolò, Chapuis, Gaelle A, Churchland, Anne K, DeWitt, Eric EJ, Faulkner, Mayo, Harris, Kenneth D, Huntenburg, Julia M, Hunter, Max, Laranjeira, Inês C, Rossant, Cyrille, Sasaki, Maho, Schartner, Michael M, Shen, Shan, Steinmetz, Nicholas A, Walker, Edgar Y, West, Steven J, Winter, Olivier, and Wells, Miles J
Subjects: Biological Sciences, Behavioral and Social Science, Basic Behavioral and Social Science, Neurophysiology, Databases, Factual, Laboratories, International Brain Laboratory, Technology, Medical and Health Sciences, Developmental Biology, Biological sciences
Abstract: We describe an architecture for organizing, integrating and sharing neurophysiology data within a single laboratory or across a group of collaborators. It comprises a database linking data files to metadata and electronic laboratory notes; a module collecting data from multiple laboratories into one location; a protocol for searching and sharing data and a module for automatic analyses that populates a website. These modules can be used together or individually, by single laboratories or worldwide collaborations.
Published: 2023

36. Generating synthetic high-resolution spinal STIR and T1w images from T2w FSE and low-resolution axial Dixon

Author: Graf, Robert, Platzek, Paul-Sören, Riedel, Evamaria Olga, Kim, Su Hwan, Lenhart, Nicolas, Ramschütz, Constanze, Paprottka, Karolin Johanna, Kertels, Olivia Ruriko, Möller, Hendrik Kristian, Atad, Matan, Bülow, Robin, Werner, Nicole, Völzke, Henry, Schmidt, Carsten Oliver, Wiestler, Benedikt, Paetzold, Johannes C., Rueckert, Daniel, and Kirschke, Jan Stefan
Published: 2024
Full Text: View/download PDF

37. ZINC-22A Free Multi-Billion-Scale Database of Tangible Compounds for Ligand Discovery

Author: Tingle, Benjamin I, Tang, Khanh G, Castanon, Mar, Gutierrez, John J, Khurelbaatar, Munkhzul, Dandarchuluun, Chinzorig, Moroz, Yurii S, and Irwin, John J
Subjects: Medicinal and Biomolecular Chemistry, Chemical Sciences, Ligands, Databases, Factual, Molecular Conformation, Molecular Docking Simulation, Zinc, Theoretical and Computational Chemistry, Computation Theory and Mathematics, Medicinal & Biomolecular Chemistry, Medicinal and biomolecular chemistry, Theoretical and computational chemistry
Abstract: Purchasable chemical space has grown rapidly into the tens of billions of molecules, providing unprecedented opportunities for ligand discovery but straining the tools that might exploit these molecules at scale. We have therefore developed ZINC-22, a database of commercially accessible small molecules derived from multi-billion-scale make-on-demand libraries. The new database and tools enable analog searching in this vast new space via a facile GUI, CartBlanche, drawing on similarity methods that scale sublinearly in the number of molecules. The new library also uses data organization methods, enabling rapid lookup of molecules and their physical properties, including conformations, partial atomic charges, c Log P values, and solvation energies, all crucial for molecule docking, which had become slow with older database organizations in previous versions of ZINC. As the libraries have continued to grow, we have been interested in finding whether molecular diversity has suffered, for instance, because certain scaffolds have come to dominate via easy analoging. This has not occurred thus far, and chemical diversity continues to grow with database size, with a log increase in Bemis-Murcko scaffolds for every two-log unit increase in database size. Most new scaffolds come from compounds with the highest heavy atom count. Finally, we consider the implications for databases like ZINC as the libraries grow toward and beyond the trillion-molecule range. ZINC is freely available to everyone and may be accessed at cartblanche22.docking.org, via Globus, and in the Amazon AWS and Oracle OCI clouds.
Published: 2023

38. RMechDB: A Public Database of Elementary Radical Reaction Steps

Author: Tavakoli, Mohammadamin, Chiu, Yin Ting T, Baldi, Pierre, Carlton, Ann Marie, and Van Vranken, David
Subjects: Organic Chemistry, Chemical Sciences, Databases, Factual, Computer Simulation, Cheminformatics, Medicinal and Biomolecular Chemistry, Theoretical and Computational Chemistry, Computation Theory and Mathematics, Medicinal & Biomolecular Chemistry, Medicinal and biomolecular chemistry, Theoretical and computational chemistry
Abstract: We introduce RMechDB, an open-access platform for aggregating, curating, and distributing reliable data about elementary radical reaction steps for computational radical reaction modeling and prediction. RMechDB contains over 5,300 elementary radical reaction steps, each with a single transition state at or around room temperature. These elementary step reactions are manually curated plausible arrow-pushing steps for organic radical reactions. The steps were taken from a variety of sources. Over 2,000 mechanistic steps were extracted from textbooks and/or constructed from research publications. Another 3,000 were taken from gas-phase atmospheric reactions of isoprene and other organic molecules on the MCM (Master Chemical Mechanism) Web site. Reactions are encoded in the SMIRKS format with accurate atom mapping and annotations for arrow-pushing mechanisms. At its core, RMechDB consists of a database schema with an online interactive search interface and a request portal for downloading the raw form of elementary step reactions with their metadata. It also offers an interface for submitting new reactions to RMechDB and expanding the data set through community contributions. Although there are several applications for RMechDB, it is primarily designed as a central platform of radical elementary steps with a unified and structured representation. We believe that this open access to this data and platform enables the extension of data-driven models for chemical reaction predictions and other chemoinformatics predictive tasks.
Published: 2023

39. The scalable precision medicine open knowledge engine (SPOKE): a massive knowledge graph of biomedical information

Author: Morris, John H, Soman, Karthik, Akbas, Rabia E, Zhou, Xiaoyuan, Smith, Brett, Meng, Elaine C, Huang, Conrad C, Cerono, Gabriel, Schenk, Gundolf, Rizk-Jackson, Angela, Harroud, Adil, Sanders, Lauren, Costes, Sylvain V, Bharat, Krish, Chakraborty, Arjun, Pico, Alexander R, Mardirossian, Taline, Keiser, Michael, Tang, Alice, Hardi, Josef, Shi, Yongmei, Musen, Mark, Israni, Sharat, Huang, Sui, Rose, Peter W, Nelson, Charlotte A, and Baranzini, Sergio E
Subjects: Good Health and Well Being, Precision Medicine, Pattern Recognition, Automated, Databases, Factual, Mathematical Sciences, Biological Sciences, Information and Computing Sciences, Bioinformatics
Abstract: MotivationKnowledge graphs (KGs) are being adopted in industry, commerce and academia. Biomedical KG presents a challenge due to the complexity, size and heterogeneity of the underlying information.ResultsIn this work, we present the Scalable Precision Medicine Open Knowledge Engine (SPOKE), a biomedical KG connecting millions of concepts via semantically meaningful relationships. SPOKE contains 27 million nodes of 21 different types and 53 million edges of 55 types downloaded from 41 databases. The graph is built on the framework of 11 ontologies that maintain its structure, enable mappings and facilitate navigation. SPOKE is built weekly by python scripts which download each resource, check for integrity and completeness, and then create a 'parent table' of nodes and edges. Graph queries are translated by a REST API and users can submit searches directly via an API or a graphical user interface. Conclusions/Significance: SPOKE enables the integration of seemingly disparate information to support precision medicine efforts.Availability and implementationThe SPOKE neighborhood explorer is available at https://spoke.rbvi.ucsf.edu.Supplementary informationSupplementary data are available at Bioinformatics online.
Published: 2023

40. The Environmental Conditions, Treatments, and Exposures Ontology (ECTO): connecting toxicology and exposure to human health and beyond

Author: Chan, Lauren E, Thessen, Anne E, Duncan, William D, Matentzoglu, Nicolas, Schmitt, Charles, Grondin, Cynthia J, Vasilevsky, Nicole, McMurry, Julie A, Robinson, Peter N, Mungall, Christopher J, and Haendel, Melissa A
Subjects: Information and Computing Sciences, Networking and Information Technology R&D (NITRD), Prevention, Generic health relevance, Good Health and Well Being, Humans, Biological Ontologies, Databases, Factual, Biomedical ontology, Environmental exposures, Environmental health, Other Biological Sciences, Artificial Intelligence and Image Processing, Information Systems, Information and computing sciences
Abstract: BackgroundEvaluating the impact of environmental exposures on organism health is a key goal of modern biomedicine and is critically important in an age of greater pollution and chemicals in our environment. Environmental health utilizes many different research methods and generates a variety of data types. However, to date, no comprehensive database represents the full spectrum of environmental health data. Due to a lack of interoperability between databases, tools for integrating these resources are needed. In this manuscript we present the Environmental Conditions, Treatments, and Exposures Ontology (ECTO), a species-agnostic ontology focused on exposure events that occur as a result of natural and experimental processes, such as diet, work, or research activities. ECTO is intended for use in harmonizing environmental health data resources to support cross-study integration and inference for mechanism discovery.Methods and findingsECTO is an ontology designed for describing organismal exposures such as toxicological research, environmental variables, dietary features, and patient-reported data from surveys. ECTO utilizes the base model established within the Exposure Ontology (ExO). ECTO is developed using a combination of manual curation and Dead Simple OWL Design Patterns (DOSDP), and contains over 2700 environmental exposure terms, and incorporates chemical and environmental ontologies. ECTO is an Open Biological and Biomedical Ontology (OBO) Foundry ontology that is designed for interoperability, reuse, and axiomatization with other ontologies. ECTO terms have been utilized in axioms within the Mondo Disease Ontology to represent diseases caused or influenced by environmental factors, as well as for survey encoding for the Personalized Environment and Genes Study (PEGS).ConclusionsWe constructed ECTO to meet Open Biological and Biomedical Ontology (OBO) Foundry principles to increase translation opportunities between environmental health and other areas of biology. ECTO has a growing community of contributors consisting of toxicologists, public health epidemiologists, and health care providers to provide the necessary expertise for areas that have been identified previously as gaps.
Published: 2023

41. The Antibody Registry: ten years of registering antibodies

Author: Bandrowski, Anita, Pairish, Mason, Eckmann, Peter, Grethe, Jeffrey, and Martone, Maryann E
Subjects: Biological Sciences, Environmental Sciences, Chemical Sciences, Biotechnology, Antibodies, Databases, Factual, Registries, Information and Computing Sciences, Developmental Biology, Biological sciences, Chemical sciences, Environmental sciences
Abstract: Antibodies are ubiquitous key biological research resources yet are tricky to use as they are prone to performance issues and represent a major source of variability across studies. Understanding what antibody was used in a published study is therefore necessary to repeat and/or interpret a given study. However, antibody reagents are still frequently not cited with sufficient detail to determine which antibody was used in experiments. The Antibody Registry is a public, open database that enables citation of antibodies by providing a persistent record for any antibody-based reagent used in a publication. The registry is the authority for antibody Research Resource Identifiers, or RRIDs, which are requested or required by hundreds of journals seeking to improve the citation of these key resources. The registry is the most comprehensive listing of persistently identified antibody reagents used in the scientific literature. Data contributors span individual authors who use antibodies to antibody companies, which provide their entire catalogs including discontinued items. Unlike many commercial antibody listing sites which tend to remove reagents no longer sold, registry records persist, providing an interface between a fast-moving commercial marketplace and the static scientific literature. The Antibody Registry (RRID:SCR_006397) https://antibodyregistry.org.
Published: 2023

42. Analysis of delay in adjuvant chemotherapy in locally advanced rectal cancer

Author: Farzaneh, CA, Pigazzi, A, Duong, WQ, Carmichael, JC, Stamos, MJ, Dekhordi-Vakil, F, Dayyani, F, Zell, JA, and Jafari, MD
Subjects: Biomedical and Clinical Sciences, Oncology and Carcinogenesis, Colo-Rectal Cancer, Clinical Research, Patient Safety, Digestive Diseases, Cancer, Rare Diseases, 6.1 Pharmaceuticals, Humans, Chemotherapy, Adjuvant, Neoadjuvant Therapy, Chemoradiotherapy, Rectal Neoplasms, Databases, Factual, Retrospective Studies, Neoplasm Staging, Locally advanced rectal cancer, Adjuvant chemotherapy, Clinical Sciences, Surgery, Clinical sciences
Abstract: BackgroundAdjuvant chemotherapy (AC) after neoadjuvant chemoradiation and surgical resection has been the standard of care for locally advanced rectal cancer. However, there are no evidence-based guidelines regarding the optimal timing of AC for rectal cancer. The objective of this study was to evaluate the effect of AC timing on overall survival for rectal cancer.MethodsThe National Cancer Database (NCDB) from 2004 to 2016 was queried for primary clinical stage II or III rectal cancer patients who had undergone neoadjuvant chemoradiation followed by surgery and AC. Patients were grouped based on AC initiation: early ≤ 4 weeks, intermediate 4-8 weeks, and delayed ≥ 8 weeks. The primary outcome was overall survival.ResultsWe identified 8722 patients, of which 905 (10.4%) received early AC, 4621 (53.0%) intermediate AC, and 3196 (36.6%) delayed AC. Pathological lymph-node metastasis (ypN +) was positive in 73% of early AC, 74% intermediate AC, and 63% delayed AC (p
Published: 2023

43. A comparison of neuroelectrophysiology databases

Author: Subash, Priyanka, Gray, Alex, Boswell, Misque, Cohen, Samantha L, Garner, Rachael, Salehi, Sana, Fisher, Calvary, Hobel, Samuel, Ghosh, Satrajit, Halchenko, Yaroslav, Dichter, Benjamin, Poldrack, Russell A, Markiewicz, Chris, Hermes, Dora, Delorme, Arnaud, Makeig, Scott, Behan, Brendan, Sparks, Alana, Arnott, Stephen R, Wang, Zhengjia, Magnotti, John, Beauchamp, Michael S, Pouratian, Nader, Toga, Arthur W, and Duncan, Dominique
Subjects: Information and Computing Sciences, Library and Information Studies, Neurosciences, Brain Disorders, Networking and Information Technology R&D (NITRD), Neurophysiology, Databases, Factual, Information Dissemination
Abstract: As data sharing has become more prevalent, three pillars - archives, standards, and analysis tools - have emerged as critical components in facilitating effective data sharing and collaboration. This paper compares four freely available intracranial neuroelectrophysiology data repositories: Data Archive for the BRAIN Initiative (DABI), Distributed Archives for Neurophysiology Data Integration (DANDI), OpenNeuro, and Brain-CODE. The aim of this review is to describe archives that provide researchers with tools to store, share, and reanalyze both human and non-human neurophysiology data based on criteria that are of interest to the neuroscientific community. The Brain Imaging Data Structure (BIDS) and Neurodata Without Borders (NWB) are utilized by these archives to make data more accessible to researchers by implementing a common standard. As the necessity for integrating large-scale analysis into data repository platforms continues to grow within the neuroscientific community, this article will highlight the various analytical and customizable tools developed within the chosen archives that may advance the field of neuroinformatics.
Published: 2023

44. Family and personal history of cancer in the All of Us research program for precision medicine

Author: Bruce, Lauryn Keeler, Paul, Paulina, Kim, Katherine K, Kim, Jihoon, Keegan, Theresa HM, Hiatt, Robert A, Ohno-Machado, Lucila, and Investigators, On behalf of the All of Us Research Program
Subjects: Health Services and Systems, Health Sciences, Precision Medicine, Burden of Illness, Cancer, Good Health and Well Being, Humans, Population Health, Neoplasms, Surveys and Questionnaires, Databases, Factual, All of Us Research Program Investigators, General Science & Technology
Abstract: The All of Us (AoU) Research Program is making available one of the largest and most diverse collections of health data in the US to researchers. Using the All of Us database, we evaluated family and personal histories of five common types of cancer in 89,453 individuals, comparing these data to 24,305 participants from the 2015 National Health Interview Survey (NHIS). Comparing datasets, we found similar family cancer history (33%) rates, but higher personal cancer history in the AoU dataset (9.2% in AoU vs. 5.11% in NHIS), Methodological (e.g. survey-versus telephone-based data collection) and demographic variability may explain these between-data differences, but more research is needed.
Published: 2023

45. Phenopacket-tools: Building and validating GA4GH Phenopackets

Author: Danis, Daniel, Jacobsen, Julius OB, Wagner, Alex H, Groza, Tudor, Beckwith, Martha A, Rekerle, Lauren, Carmody, Leigh C, Reese, Justin, Hegde, Harshad, Ladewig, Markus S, Seitz, Berthold, Munoz-Torres, Monica, Harris, Nomi L, Rambla, Jordi, Baudis, Michael, Mungall, Christopher J, Haendel, Melissa A, and Robinson, Peter N
Subjects: Information and Computing Sciences, Biological Sciences, Genetics, Precision Medicine, Biotechnology, Human Genome, 1.5 Resources and infrastructure (underpinning), 2.6 Resources and infrastructure (aetiology), Humans, Software, Genomics, Databases, Factual, Neoplasms, Gene Library, General Science & Technology
Abstract: The Global Alliance for Genomics and Health (GA4GH) is a standards-setting organization that is developing a suite of coordinated standards for genomics. The GA4GH Phenopacket Schema is a standard for sharing disease and phenotype information that characterizes an individual person or biosample. The Phenopacket Schema is flexible and can represent clinical data for any kind of human disease including rare disease, complex disease, and cancer. It also allows consortia or databases to apply additional constraints to ensure uniform data collection for specific goals. We present phenopacket-tools, an open-source Java library and command-line application for construction, conversion, and validation of phenopackets. Phenopacket-tools simplifies construction of phenopackets by providing concise builders, programmatic shortcuts, and predefined building blocks (ontology classes) for concepts such as anatomical organs, age of onset, biospecimen type, and clinical modifiers. Phenopacket-tools can be used to validate the syntax and semantics of phenopackets as well as to assess adherence to additional user-defined requirements. The documentation includes examples showing how to use the Java library and the command-line tool to create and validate phenopackets. We demonstrate how to create, convert, and validate phenopackets using the library or the command-line application. Source code, API documentation, comprehensive user guide and a tutorial can be found at https://github.com/phenopackets/phenopacket-tools. The library can be installed from the public Maven Central artifact repository and the application is available as a standalone archive. The phenopacket-tools library helps developers implement and standardize the collection and exchange of phenotypic and other clinical data for use in phenotype-driven genomic diagnostics, translational research, and precision medicine applications.
Published: 2023

46. kboolnet: a toolkit for the verification, validation, and visualization of reaction-contingency (rxncon) models

Author: Carretero Chavez, Willow, Krantz, Marcus, Klipp, Edda, and Kufareva, Irina
Subjects: Bioengineering, Networking and Information Technology R&D (NITRD), Databases, Factual, Documentation, Kinetics, Signal Transduction, Software, Computational modeling, Cell signaling, Network biology, Rxncon, Boolean networks, Mathematical Sciences, Biological Sciences, Information and Computing Sciences, Bioinformatics
Abstract: BackgroundComputational models of cell signaling networks are extremely useful tools for the exploration of underlying system behavior and prediction of response to various perturbations. By representing signaling cascades as executable Boolean networks, the previously developed rxncon ("reaction-contingency") formalism and associated Python package enable accurate and scalable modeling of signal transduction even in large (thousands of components) biological systems. The models are split into reactions, which generate states, and contingencies, that impinge on reactions; this avoids the so-called "combinatorial explosion" of system size. Boolean description of the biological system compensates for the poor availability of kinetic parameters which are necessary for quantitative models. Unfortunately, few tools are available to support rxncon model development, especially for large, intricate systems.ResultsWe present the kboolnet toolkit ( https://github.com/Kufalab-UCSD/kboolnet , complete documentation at https://github.com/Kufalab-UCSD/kboolnet/wiki ), an R package and a set of scripts that seamlessly integrate with the python-based rxncon software and collectively provide a complete workflow for the verification, validation, and visualization of rxncon models. The verification script VerifyModel.R checks for responsiveness to repeated stimulations as well as consistency of steady state behavior. The validation scripts TruthTable.R, SensitivityAnalysis.R, and ScoreNet.R provide various readouts for the comparison of model predictions to experimental data. In particular, ScoreNet.R compares model predictions to a cloud-stored MIDAS-format experimental database to provide a numerical score for tracking model accuracy. Finally, the visualization scripts allow for graphical representations of model topology and behavior. The entire kboolnet toolkit is cloud-enabled, allowing for easy collaborative development; most scripts also allow for the extraction and analysis of individual user-defined "modules".ConclusionThe kboolnet toolkit provides a modular, cloud-enabled workflow for the development of rxncon models, as well as their verification, validation, and visualization. This will enable the creation of larger, more comprehensive, and more rigorous models of cell signaling using the rxncon formalism in the future.
Published: 2023

47. MIDAS2: Metagenomic Intra-species Diversity Analysis System

Author: Zhao, Chunyu, Dimitrov, Boris, Goldman, Miriam, Nayfach, Stephen, and Pollard, Katherine S
Subjects: Human Genome, Genetics, Metagenome, Software, Metagenomics, Genotype, Databases, Factual, Mathematical Sciences, Biological Sciences, Information and Computing Sciences, Bioinformatics
Abstract: SummaryThe Metagenomic Intra-Species Diversity Analysis System (MIDAS) is a scalable metagenomic pipeline that identifies single nucleotide variants (SNVs) and gene copy number variants in microbial populations. Here, we present MIDAS2, which addresses the computational challenges presented by increasingly large reference genome databases, while adding functionality for building custom databases and leveraging paired-end reads to improve SNV accuracy. This fast and scalable reengineering of the MIDAS pipeline enables thousands of metagenomic samples to be efficiently genotyped.Availability and implementationThe source code is available at https://github.com/czbiohub/MIDAS2.Supplementary informationSupplementary data are available at Bioinformatics online.
Published: 2023

48. Detecting 3D syndromic faces as outliers using unsupervised normalizing flow models.

Author: Bannister, Jordan, Wilms, Matthias, Aponte, J, Katz, David, Bernier, Francois, Spritz, Richard, Hallgrímsson, Benedikt, Forkert, Nils, and Klein, Ophir
Subjects: 3D facial shape, Computer-assisted diagnosis, Genetic syndrome, Normalizing flow, Outlier detection, Humans, Syndrome, Databases, Factual
Abstract: Many genetic syndromes are associated with distinctive facial features. Several computer-assisted methods have been proposed that make use of facial features for syndrome diagnosis. Training supervised classifiers, the most common approach for this purpose, requires large, comprehensive, and difficult to collect databases of syndromic facial images. In this work, we use unsupervised, normalizing flow-based manifold and density estimation models trained entirely on unaffected subjects to detect syndromic 3D faces as statistical outliers. Furthermore, we demonstrate a general, user-friendly, gradient-based interpretability mechanism that enables clinicians and patients to understand model inferences. 3D facial surface scans of 2471 unaffected subjects and 1629 syndromic subjects representing 262 different genetic syndromes were used to train and evaluate the models. The flow-based models outperformed unsupervised comparison methods, with the best model achieving an ROC-AUC of 86.3% on a challenging, age and sex diverse data set. In addition to highlighting the viability of outlier-based syndrome screening tools, our methods generalize and extend previously proposed outlier scores for 3D face-based syndrome detection, resulting in improved performance for unsupervised syndrome detection.
Published: 2022

49. Comprehensive analysis of retracted journal articles in the field of veterinary medicine and animal health

Author: Christopher, Mary M
Subjects: Veterinary Sciences, Agricultural, Veterinary and Food Sciences, Animals, Biomedical Research, China, Databases, Factual, Plagiarism, Scientific Misconduct, Veterinary Medicine, Editorial policies, Publication ethics, Publication misconduct, Research misconduct, Veterinary journals, Biochemistry and Cell Biology, Microbiology, Veterinary sciences
Abstract: BackgroundRetractions are a key proxy for recognizing errors in research and publication and for reconciling misconduct in the scientific literature. The underlying factors associated with retractions can provide insight and guide policy for journal editors and authors within a discipline. The goal of this study was to systematically review and analyze retracted articles in veterinary medicine and animal health. A database search for retractions of articles with a veterinary/animal health topic, in a veterinary journal, or by veterinary institution-affiliated authors was conducted from first available records through February 2019 in MEDLINE/PubMed, Web of Science, Scopus, Retraction Watch, and Google Scholar. Annual frequency of retractions, journal and article characteristics, author affiliation and country, reasons for retraction, and retraction outcomes were recorded.ResultsTwo-hundred-forty-two articles retracted between 1993 and 2019 were included in the study. Over this period, the estimated rate of retraction increased from 0.03/1000 to 1.07/1000 veterinary articles. Median time from publication to retraction was 478 days (range 0-3653 days). Retracted articles were published in 30 (12.3%) veterinary journals and 132 (81.5%) nonveterinary journals. Veterinary journals had disproportionately more retractions than nonveterinary journals (P = .0155). Authors/groups with ≥2 retractions accounted for 37.2% of retractions. Authors from Iran and China published 19.4 and 18.2% of retracted articles respectively. Authors were affiliated with a faculty of veterinary medicine in 59.1% of retracted articles. Of 242 retractions, 204 (84.3%) were research articles, of which 6.4% were veterinary clinical research. Publication misconduct (plagiarism, duplicate publication, compromised peer review) accounted for 75.6% of retractions, compared with errors (20.6%) and research misconduct (18.2%). Journals published by societies/institutions were less likely than those from commercial publishers to indicate a reason for retraction. Thirty-one percent of HTML articles and 14% of PDFs were available online but not marked as retracted.ConclusionsThe rate of retraction in the field of veterinary and animal health has increased by ~ 10-fold per 1000 articles since 1993, resulting primarily from increased publication misconduct, often by repeat offenders. Veterinary journals and society/institutional journals could benefit from improvement in the quality of retraction notices.
Published: 2022

50. NEMAR: an open access data, tools and compute resource operating on neuroelectromagnetic data

Author: Delorme, Arnaud, Truong, Dung, Youn, Choonhan, Sivagnanam, Subhashini, Stirm, Claire, Yoshimoto, Kenneth, Poldrack, Russell A, Majumdar, Amitava, and Makeig, Scott
Subjects: Information and Computing Sciences, Biological Sciences, Bioinformatics and Computational Biology, Biomedical Imaging, Bioengineering, Networking and Information Technology R&D (NITRD), Neurosciences, 1.5 Resources and infrastructure (underpinning), Underpinning research, Neurological, Humans, Access to Information, Databases, Factual, Magnetic Resonance Imaging, Data Format, Library and Information Studies, Bioinformatics and computational biology, Data management and data science
Abstract: To preserve scientific data created by publicly and/or philanthropically funded research projects and to make it ready for exploitation using recent and ongoing advances in advanced and large-scale computational modeling methods, publicly available data must use in common, now-evolving standards for formatting, identifying and annotating should share data. The OpenNeuro.org archive, built first as a repository for magnetic resonance imaging data based on the Brain Imaging Data Structure formatting standards, aims to house and share all types of human neuroimaging data. Here, we present NEMAR.org, a web gateway to OpenNeuro data for human neuroelectromagnetic data. NEMAR allows users to search through, visually explore and assess the quality of shared electroencephalography (EEG), magnetoencephalography and intracranial EEG data and then to directly process selected data using high-performance computing resources of the San Diego Supercomputer Center via the Neuroscience Gateway (nsgportal.org, NSG), a freely available web portal to high-performance computing serving a variety of neuroscientific analysis environments and tools. Combined, OpenNeuro, NEMAR and NSG form an efficient, integrated data, tools and compute resource for human neuroimaging data analysis and meta-analysis. Database URL: https://nemar.org.
Published: 2022

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Region

Database

Publisher

4,226 results on '"Factual"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources