263 results on '"Allen, Jonathan E."'
Search Results
2. HD-Bind: Encoding of Molecular Structure with Low Precision, Hyperdimensional Binary Representations
- Author
- Jones, Derek, Allen, Jonathan E., Zhang, Xiaohua, Khaleghi, Behnam, Kang, Jaeyoung, Xu, Weihong, Moshiri, Niema, and Rosing, Tajana S.
- Subjects
- Quantitative Biology - Biomolecules, Computer Science - Machine Learning
- Abstract
Publicly available collections of drug-like molecules have grown to comprise tens of billions of possibilities in recent history due to advances in chemical synthesis. Traditional methods for identifying "hit" molecules from a large collection of potential drug-like candidates have relied on biophysical theory to compute approximations to the Gibbs free energy of the binding interaction between a drug and its protein target. A major drawback of these approaches is that they require exceptional computing capabilities even for relatively small collections of molecules. Hyperdimensional Computing (HDC) is a recently proposed learning paradigm that is able to leverage low-precision binary vector arithmetic to build efficient representations of the data that can be obtained without the need for gradient-based optimization approaches that are required in many conventional machine learning and deep learning approaches. This algorithmic simplicity allows for acceleration in hardware that has been previously demonstrated for a range of application areas. We consider existing HDC approaches for molecular property classification and introduce two novel encoding algorithms that leverage the extended connectivity fingerprint (ECFP) algorithm. We show that HDC-based inference methods are as much as 90 times more efficient than more complex representative machine learning methods and achieve an acceleration of nearly 9 orders of magnitude as compared to inference with molecular docking. We demonstrate multiple approaches for the encoding of molecular data for HDC and examine their relative performance on a range of challenging molecular property prediction and drug-protein binding classification tasks. Our work thus motivates further investigation into molecular representation learning to develop ultra-efficient pre-screening tools.
- Published
- 2023
3. Evaluating Point-Prediction Uncertainties in Neural Networks for Drug Discovery
- Author
- Fan, Ya Ju, Allen, Jonathan E., McLoughlin, Kevin S., Shi, Da, Bennion, Brian J., Zhang, Xiaohua, and Lightstone, Felice C.
- Subjects
- Computer Science - Machine Learning, Statistics - Applications
- Abstract
Neural Network (NN) models provide potential to speed up the drug discovery process and reduce its failure rates. The success of NN models requires uncertainty quantification (UQ) as drug discovery explores chemical space beyond the training data distribution. Standard NN models do not provide uncertainty information. Methods that combine Bayesian models with NN models address this issue, but are difficult to implement and more expensive to train. Some methods require changing the NN architecture or training procedure, limiting the selection of NN models. Moreover, predictive uncertainty can come from different sources. It is important to have the ability to separately model different types of predictive uncertainty, as the model can take assorted actions depending on the source of uncertainty. In this paper, we examine UQ methods that estimate different sources of predictive uncertainty for NN models aiming at drug discovery. We use our prior knowledge on chemical compounds to design the experiments. By utilizing a visualization method we create non-overlapping and chemically diverse partitions from a collection of chemical compounds. These partitions are used as training and test set splits to explore NN model uncertainty. We demonstrate how the uncertainties estimated by the selected methods describe different sources of uncertainty under different partitions and featurization schemes and the relationship to prediction error.
- Published
- 2022
4. Metagenomic Methods for Addressing NASA's Planetary Protection Policy Requirements on Future Missions: A Workshop Report.
- Author
- Green, Stefan J, Torok, Tamas, Allen, Jonathan E, Eloe-Fadrosh, Emiley, Jackson, Scott A, Jiang, Sunny C, Levine, Stuart S, Levy, Shawn, Schriml, Lynn M, Thomas, W Kelley, Wood, Jason M, and Tighe, Scott W
- Subjects
- Contamination, DNA, Metagenomics, Planetary protection, Spacecraft Assembly Facility, Astronomical and Space Sciences, Geochemistry, Geology, Astronomy & Astrophysics
- Abstract
Molecular biology methods and technologies have advanced substantially over the past decade. These new molecular methods should be incorporated among the standard tools of planetary protection (PP) and could be validated for incorporation by 2026. To address the feasibility of applying modern molecular techniques to such an application, NASA conducted a technology workshop with private industry partners, academics, and government agency stakeholders, along with NASA staff and contractors. The technical discussions and presentations of the Multi-Mission Metagenomics Technology Development Workshop focused on modernizing and supplementing the current PP assays. The goals of the workshop were to assess the state of metagenomics and other advanced molecular techniques in the context of providing a validated framework to supplement the bacterial endospore-based NASA Standard Assay and to identify knowledge and technology gaps. In particular, workshop participants were tasked with discussing metagenomics as a stand-alone technology to provide rapid and comprehensive analysis of total nucleic acids and viable microorganisms on spacecraft surfaces, thereby allowing for the development of tailored and cost-effective microbial reduction plans for each hardware item on a spacecraft. Workshop participants recommended metagenomics approaches as the only data source that can adequately feed into quantitative microbial risk assessment models for evaluating the risk of forward (exploring extraterrestrial planet) and back (Earth harmful biological) contamination. Participants were unanimous that a metagenomics workflow, in tandem with rapid targeted quantitative (digital) PCR, represents a revolutionary advance over existing methods for the assessment of microbial bioburden on spacecraft surfaces. The workshop highlighted low biomass sampling, reagent contamination, and inconsistent bioinformatics data analysis as key areas for technology development. 
Finally, it was concluded that implementing metagenomics as an additional workflow for addressing concerns of NASA's robotic mission will represent a dramatic improvement in technology advancement for PP and will benefit future missions where mission success is affected by backward and forward contamination.
- Published
- 2023
5. Multiple Mutations Associated with Emergent Variants Can Be Detected as Low-Frequency Mutations in Early SARS-CoV-2 Pandemic Clinical Samples
- Author
- Kimbrel, Jeffrey, Moon, Joseph, Avila-Herrera, Aram, Martí, Jose Manuel, Thissen, James, Mulakken, Nisha, Sandholtz, Sarah H, Ferrell, Tyshawn, Daum, Chris, Hall, Sara, Segelke, Brent, Arrildt, Kathryn T, Messenger, Sharon, Wadford, Debra A, Jaing, Crystal, Allen, Jonathan E, and Borucki, Monica K
- Subjects
- Biological Sciences, Bioinformatics and Computational Biology, Coronaviruses, Genetics, Infectious Diseases, Emerging Infectious Diseases, 2.1 Biological and endogenous factors, Infection, Good Health and Well Being, Humans, COVID-19, Pandemics, SARS-CoV-2, Mutation, Computational Biology, Spike Glycoprotein, Coronavirus, LoFreq, emergence, evolution, iSNV, mutation, quasispecies, sequence, severe acute respiratory syndrome coronavirus-2, variant, Microbiology
- Abstract
Genetic analysis of intra-host viral populations provides unique insight into pre-emergent mutations that may contribute to the genotype of future variants. Clinical samples positive for SARS-CoV-2 collected in California during the first months of the pandemic were sequenced to define the dynamics of mutation emergence as the virus became established in the state. Deep sequencing of 90 nasopharyngeal samples showed that many mutations associated with the establishment of SARS-CoV-2 globally were present at varying frequencies in a majority of the samples, even those collected as the virus was first detected in the US. A subset of mutations that emerged months later in consensus sequences were detected as subconsensus members of intra-host populations. Spike mutations P681H, H655Y, and V1104L were detected prior to emergence in variant genotypes, mutations were detected at multiple positions within the furin cleavage site, and pre-emergent mutations were identified in the nucleocapsid and the envelope genes. Because many of the samples had a very high depth of coverage, a bioinformatics pipeline, "Mappgene", was established that uses both iVar and LoFreq variant calling to enable identification of very low-frequency variants. This enabled detection of a spike protein deletion present in many samples at low frequency and associated with a variant of concern.
- Published
- 2022
6. MACAW: An Accessible Tool for Molecular Embedding and Inverse Molecular Design
- Author
- Blay, Vincent, Radivojevic, Tijana, Allen, Jonathan E, Hudson, Corey M, and Martin, Hector Garcia
- Subjects
- Medicinal and Biomolecular Chemistry, Chemical Sciences, Algorithms, Octanes, Protein Binding, Receptors, Histamine H1, Theoretical and Computational Chemistry, Computation Theory and Mathematics, Medicinal & Biomolecular Chemistry, Medicinal and biomolecular chemistry, Theoretical and computational chemistry
- Abstract
The growing capabilities of synthetic biology and organic chemistry demand tools to guide syntheses toward useful molecules. Here, we present Molecular AutoenCoding Auto-Workaround (MACAW), a tool that uses a novel approach to generate molecules predicted to meet a desired property specification (e.g., a binding affinity of 50 nM or an octane number of 90). MACAW describes molecules by embedding them into a smooth multidimensional numerical space, avoiding uninformative dimensions that previous methods often introduce. The coordinates in this embedding provide a natural choice of features for accurately predicting molecular properties, which we demonstrate with examples for cetane and octane numbers, flash points, and histamine H1 receptor binding affinity. The approach is computationally efficient and well-suited to the small- and medium-size datasets commonly used in biosciences. We showcase the utility of MACAW for virtual screening by identifying molecules with high predicted binding affinity to the histamine H1 receptor and limited affinity to the muscarinic M2 receptor, which are targets of medicinal relevance. Combining these predictive capabilities with a novel generative algorithm for molecules allows us to recommend molecules with a desired property value (i.e., inverse molecular design). We demonstrate this capability by recommending molecules with predicted octane numbers of 40, 80, and 120, which is an important characteristic of biofuels. Thus, MACAW augments classical retrosynthesis tools by providing recommendations for molecules on specification.
- Published
- 2022
7. Accelerators for Classical Molecular Dynamics Simulations of Biomolecules
- Author
- Jones, Derek, Allen, Jonathan E, Yang, Yue, Bennett, William F Drew, Gokhale, Maya, Moshiri, Niema, and Rosing, Tajana S
- Subjects
- Chemical Sciences, Theoretical and Computational Chemistry, Bioengineering, Algorithms, Computers, Molecular Dynamics Simulation, Biochemistry and Cell Biology, Computer Software, Chemical Physics, Physical chemistry, Theoretical and computational chemistry
- Abstract
Atomistic Molecular Dynamics (MD) simulations provide researchers the ability to model biomolecular structures such as proteins and their interactions with drug-like small molecules with greater spatiotemporal resolution than is otherwise possible using experimental methods. MD simulations are notoriously expensive computational endeavors that have traditionally required massive investment in specialized hardware to access biologically relevant spatiotemporal scales. Our goal is to summarize the fundamental algorithms that are employed in the literature to then highlight the challenges that have affected accelerator implementations in practice. We consider three broad categories of accelerators: Graphics Processing Units (GPUs), Field-Programmable Gate Arrays (FPGAs), and Application Specific Integrated Circuits (ASICs). These categories are comparatively studied to facilitate discussion of their relative trade-offs and to gain context for the current state of the art. We conclude by providing insights into the potential of emerging hardware platforms and algorithms for MD.
- Published
- 2022
8. Pose Classification Using Three-Dimensional Atomic Structure-Based Neural Networks Applied to Ion Channel–Ligand Docking
- Author
- Shim, Heesung, Kim, Hyojin, Allen, Jonathan E, and Wulff, Heike
- Subjects
- Medicinal and Biomolecular Chemistry, Chemical Sciences, 5.1 Pharmaceuticals, Development of treatments and therapeutic interventions, Generic health relevance, Ion Channels, Ligands, Molecular Docking Simulation, Neural Networks, Computer, Protein Binding, Theoretical and Computational Chemistry, Computation Theory and Mathematics, Medicinal & Biomolecular Chemistry, Medicinal and biomolecular chemistry, Theoretical and computational chemistry
- Abstract
The identification of promising lead compounds showing pharmacological activities toward a biological target is essential in early stage drug discovery. With the recent increase in available small-molecule databases, virtual high-throughput screening using physics-based molecular docking has emerged as an essential tool in assisting fast and cost-efficient lead discovery and optimization. However, the best scored docking poses are often suboptimal, resulting in incorrect screening and chemical property calculation. We address the pose classification problem by leveraging data-driven machine learning approaches to identify correct docking poses from AutoDock Vina and Glide screens. To enable effective classification of docking poses, we present two convolutional neural network approaches: a three-dimensional convolutional neural network (3D-CNN) and an attention-based point cloud network (PCN) trained on the PDBbind refined set. We demonstrate the effectiveness of our proposed classifiers on multiple evaluation data sets including the standard PDBbind CASF-2016 benchmark data set and various compound libraries with structurally different protein targets including an ion channel data set extracted from Protein Data Bank (PDB) and an in-house KCa3.1 inhibitor data set. Our experiments show that excluding false positive docking poses using the proposed classifiers improves virtual high-throughput screening to identify novel molecules against each target protein compared to the initial screen based on the docking scores.
- Published
- 2022
9. High-Throughput Virtual Screening of Small Molecule Inhibitors for SARS-CoV-2 Protein Targets with Deep Fusion Models
- Author
- Stevenson, Garrett A., Jones, Derek, Kim, Hyojin, Bennett, W. F. Drew, Bennion, Brian J., Borucki, Monica, Bourguet, Feliza, Epstein, Aidan, Franco, Magdalena, Harmon, Brooke, He, Stewart, Katz, Max P., Kirshner, Daniel, Lao, Victoria, Lau, Edmond Y., Lo, Jacky, McLoughlin, Kevin, Mosesso, Richard, Murugesh, Deepa K., Negrete, Oscar A., Saada, Edwin A., Segelke, Brent, Stefan, Maxwell, Torres, Marisa W., Weilhammer, Dina, Wong, Sergio, Yang, Yue, Zemla, Adam, Zhang, Xiaohua, Zhu, Fangqiang, Lightstone, Felice C., and Allen, Jonathan E.
- Subjects
- Computer Science - Machine Learning, Quantitative Biology - Biomolecules
- Abstract
Structure-based Deep Fusion models were recently shown to outperform several physics- and machine learning-based protein-ligand binding affinity prediction methods. As part of a multi-institutional COVID-19 pandemic response, over 500 million small molecules were computationally screened against four protein structures from the novel coronavirus (SARS-CoV-2), which causes COVID-19. Three enhancements to Deep Fusion were made in order to evaluate more than 5 billion docked poses on SARS-CoV-2 protein targets. First, the Deep Fusion concept was refined by formulating the architecture as one, coherently backpropagated model (Coherent Fusion) to improve binding-affinity prediction accuracy. Secondly, the model was trained using a distributed, genetic hyper-parameter optimization. Finally, a scalable, high-throughput screening capability was developed to maximize the number of ligands evaluated and expedite the path to experimental evaluation. In this work, we present both the methods developed for machine learning-based high-throughput screening and results from using our computational pipeline to find SARS-CoV-2 inhibitors.
- Published
- 2021
10. Improved Protein-ligand Binding Affinity Prediction with Structure-Based Deep Fusion Inference
- Author
- Jones, Derek, Kim, Hyojin, Zhang, Xiaohua, Zemla, Adam, Stevenson, Garrett, Bennett, William D., Kirshner, Dan, Wong, Sergio, Lightstone, Felice, and Allen, Jonathan E.
- Subjects
- Quantitative Biology - Biomolecules, Computer Science - Machine Learning
- Abstract
Predicting accurate protein-ligand binding affinity is important in drug discovery but remains a challenge even with computationally expensive biophysics-based energy scoring methods and state-of-the-art deep learning approaches. Despite the recent advances in the deep convolutional and graph neural network based approaches, the model performance depends on the input data representation and suffers from distinct limitations. It is natural to combine complementary features and their inference from the individual models for better predictions. We present fusion models to benefit from different feature representations of two neural network models to improve the binding affinity prediction. We demonstrate effectiveness of the proposed approach by performing experiments with the PDBBind 2016 dataset and its docking pose complexes. The results show that the proposed approach improves the overall prediction compared to the individual neural network models with greater computational efficiency than related biophysics based energy scoring functions. We also discuss the benefit of the proposed fusion inference with several example complexes. The software is made available as open source at https://github.com/llnl/fast.
- Published
- 2020
11. Machine Learning Models to Predict Inhibition of the Bile Salt Export Pump
- Author
- McLoughlin, Kevin S., Jeong, Claire G., Sweitzer, Thomas D., Minnich, Amanda J., Tse, Margaret J., Bennion, Brian J., Allen, Jonathan E., Calad-Thomson, Stacie, Rush, Thomas S., and Brase, James M.
- Subjects
- Quantitative Biology - Quantitative Methods
- Abstract
Drug-induced liver injury (DILI) is the most common cause of acute liver failure and a frequent reason for withdrawal of candidate drugs during preclinical and clinical testing. An important type of DILI is cholestatic liver injury, caused by buildup of bile salts within hepatocytes; it is frequently associated with inhibition of bile salt transporters, such as the bile salt export pump (BSEP). Reliable in silico models to predict BSEP inhibition directly from chemical structures would significantly reduce costs during drug discovery and could help avoid injury to patients. Unfortunately, models published to date have been insufficiently accurate to encourage wide adoption. We report our development of classification and regression models for BSEP inhibition with substantially improved performance over previously published models. Our model development leveraged the ATOM Modeling PipeLine (AMPL) developed by the ATOM Consortium, which enabled us to train and evaluate thousands of candidate models. In the course of model development, we assessed a variety of schemes for chemical featurization, dataset partitioning and class labeling, and identified those producing models that generalized best to novel chemical entities. Our best performing classification model was a neural network with ROC AUC = 0.88 on our internal test dataset and 0.89 on an independent external compound set. Our best regression model, the first ever reported for predicting BSEP IC50s, yielded a test set $R^2 = 0.56$ and mean absolute error 0.37, corresponding to a mean 2.3-fold error in predicted IC50s, comparable to experimental variation. These models will thus be useful as inputs to mechanistic predictions of DILI and as part of computational pipelines for drug discovery.
- Published
- 2020
12. AMPL: A Data-Driven Modeling Pipeline for Drug Discovery
- Author
- Minnich, Amanda J., McLoughlin, Kevin, Tse, Margaret, Deng, Jason, Weber, Andrew, Murad, Neha, Madej, Benjamin D., Ramsundar, Bharath, Rush, Tom, Calad-Thomson, Stacie, Brase, Jim, and Allen, Jonathan E.
- Subjects
- Quantitative Biology - Quantitative Methods, Computer Science - Machine Learning, Statistics - Machine Learning
- Abstract
One of the key requirements for incorporating machine learning into the drug discovery process is complete reproducibility and traceability of the model building and evaluation process. With this in mind, we have developed an end-to-end modular and extensible software pipeline for building and sharing machine learning models that predict key pharma-relevant parameters. The ATOM Modeling PipeLine, or AMPL, extends the functionality of the open source library DeepChem and supports an array of machine learning and molecular featurization tools. We have benchmarked AMPL on a large collection of pharmaceutical datasets covering a wide range of parameters. As a result of these comprehensive experiments, we have found that physicochemical descriptors and deep learning-based graph representations significantly outperform traditional fingerprints in the characterization of molecular features. We have also found that dataset size is directly correlated to prediction performance, and that single-task deep learning models only outperform shallow learners if there is sufficient data. Likewise, dataset size has a direct impact on model predictivity, independent of comprehensive hyperparameter model tuning. Our findings point to the need for public dataset integration or multi-task/transfer learning approaches. Lastly, we found that uncertainty quantification (UQ) analysis may help identify model error; however, efficacy of UQ to filter predictions varies considerably between datasets and featurization/model types. AMPL is open source and available for download at http://github.com/ATOMconsortium/AMPL.
- Published
- 2019
13. Evaluating point-prediction uncertainties in neural networks for protein-ligand binding prediction
- Author
- Fan, Ya Ju, Allen, Jonathan E., McLoughlin, Kevin S., Shi, Da, Bennion, Brian J., Zhang, Xiaohua, and Lightstone, Felice C.
- Published
- 2023
14. Distinguishing between Normal and Cancer Cells Using Autoencoder Node Saliency
- Author
- Fan, Ya Ju, Allen, Jonathan E., Jacobs, Sam Ade, and Van Essen, Brian C.
- Subjects
- Computer Science - Machine Learning, Statistics - Machine Learning
- Abstract
Gene expression profiles have been widely used to characterize patterns of cellular responses to diseases. As data becomes available, scalable learning toolkits become essential to processing large datasets using deep learning models to model complex biological processes. We present an autoencoder to capture nonlinear relationships recovered from gene expression profiles. The autoencoder is a nonlinear dimension reduction technique using an artificial neural network, which learns hidden representations of unlabeled data. We train the autoencoder on a large collection of tumor samples from the National Cancer Institute Genomic Data Commons, and obtain a generalized and unsupervised latent representation. We leverage an HPC-focused deep learning toolkit, Livermore Big Artificial Neural Network (LBANN), to efficiently parallelize the training algorithm, reducing computation times from several hours to a few minutes. With the trained autoencoder, we generate latent representations of a small dataset, containing pairs of normal and cancer cells of various tumor types. A novel measure called autoencoder node saliency (ANS) is introduced to identify the hidden nodes that best differentiate various pairs of cells. We compare our findings of the best classifying nodes with principal component analysis and the visualization of t-distributed stochastic neighbor embedding. We demonstrate that the autoencoder effectively extracts distinct gene features for multiple learning tasks in the dataset. Comment: Second Workshop on HPC Applications in Precision Medicine, June 2018
- Published
- 2019
15. Multiscale analysis for patterns of Zika virus genotype emergence, spread, and consequence
- Author
- Borucki, Monica K, Collette, Nicole M, Coffey, Lark L, Van Rompay, Koen KA, Hwang, Mona H, Thissen, James B, Allen, Jonathan E, and Zemla, Adam T
- Subjects
- Biological Sciences, Bioinformatics and Computational Biology, Biomedical and Clinical Sciences, Genetics, Biodefense, Infectious Diseases, Vector-Borne Diseases, Biotechnology, Rare Diseases, Emerging Infectious Diseases, 2.2 Factors relating to the physical environment, 2.1 Biological and endogenous factors, 2.5 Research design and methodologies (aetiology), 4.1 Discovery and preclinical testing of markers and technologies, Infection, Good Health and Well Being, Databases, Genetic, Datasets as Topic, Disease Outbreaks, Genome, Viral, Genotype, Geography, High-Throughput Nucleotide Sequencing, Humans, Models, Molecular, Mutation Rate, Phylogeny, RNA, Viral, Spatio-Temporal Analysis, Viral Nonstructural Proteins, Viral Structural Proteins, Zika Virus, Zika Virus Infection, General Science & Technology
- Abstract
The question of how Zika virus (ZIKV) changed from a seemingly mild virus to a human pathogen capable of microcephaly and sexual transmission remains unanswered. The unexpected emergence of ZIKV's pathogenicity and capacity for sexual transmission may be due to genetic changes, and future changes in phenotype may continue to occur as the virus expands its geographic range. Alternatively, the sheer size of the 2015-16 epidemic may have brought attention to a pre-existing virulent ZIKV phenotype in a highly susceptible population. Thus, it is important to identify patterns of genetic change that may yield a better understanding of ZIKV emergence and evolution. However, because ZIKV has an RNA genome and a polymerase incapable of proofreading, it undergoes rapid mutation which makes it difficult to identify combinations of mutations associated with viral emergence. As next generation sequencing technology has allowed whole genome consensus and variant sequence data to be generated for numerous virus samples, the task of analyzing these genomes for patterns of mutation has become more complex. However, understanding which combinations of mutations spread widely and become established in new geographic regions versus those that disappear relatively quickly is essential for defining the trajectory of an ongoing epidemic. In this study, multiscale analysis of the wealth of genomic data generated over the course of the epidemic combined with in vivo laboratory data allowed trends in mutations and outbreak trajectory to be assessed. Mutations were detected throughout the genome via deep sequencing, and many variants appeared in multiple samples and in some cases became consensus. Similarly, amino acids that were previously consensus in pre-outbreak samples were detected as low frequency variants in epidemic strains. Protein structural models indicate that most of the mutations associated with the epidemic transmission occur on the exposed surface of viral proteins.
At the macroscale level, consensus data was organized into large and interactive databases to allow the spread of individual mutations and combinations of mutations to be visualized and assessed for temporal and geographical patterns. Thus, the use of multiscale modeling for identifying mutations or combinations of mutations that impact epidemic transmission and phenotypic impact can aid the formation of hypotheses which can then be tested using reverse genetics.
- Published
- 2019
16. Using populations of human and microbial genomes for organism detection in metagenomes
- Author
- Ames, Sasha K, Gardner, Shea N, Marti, Jose Manuel, Slezak, Tom R, Gokhale, Maya B, and Allen, Jonathan E
- Subjects
- Microbiology, Biological Sciences, Bioinformatics and Computational Biology, Genetics, Human Genome, Infection, Computational Biology, Databases, Nucleic Acid, Genome, Microbial, Humans, Metagenome, Metagenomics, Microbiota, ROC Curve, Medical and Health Sciences, Bioinformatics
- Abstract
Identifying causative disease agents in human patients from shotgun metagenomic sequencing (SMS) presents a powerful tool to apply when other targeted diagnostics fail. Numerous technical challenges remain, however, before SMS can move beyond the role of research tool. Accurately separating the known and unknown organism content remains difficult, particularly when SMS is applied as a last resort. The true amount of human DNA that remains in a sample after screening against the human reference genome and filtering nonbiological components left from library preparation has previously been underreported. In this study, we create the most comprehensive collection of microbial and reference-free human genetic variation available in a database optimized for efficient metagenomic search by extracting sequences from GenBank and the 1000 Genomes Project. The results reveal new human sequences found in individual Human Microbiome Project (HMP) samples. Individual samples contain up to 95% human sequence, and 4% of the individual HMP samples contain 10% or more human reads. Left unidentified, human reads can complicate and slow down further analysis and lead to inaccurately labeled microbial taxa and ultimately lead to privacy concerns as more human genome data is collected.
- Published
- 2015
17. Microbial Profiling of Combat Wound Infection through Detection Microarray and Next-Generation Sequencing
- Author
- Be, Nicholas A, Allen, Jonathan E, Brown, Trevor S, Gardner, Shea N, McLoughlin, Kevin S, Forsberg, Jonathan A, Kirkup, Benjamin C, Chromy, Brett A, Luciw, Paul A, Elster, Eric A, and Jaing, Crystal J
- Subjects
- Medical Microbiology, Biomedical and Clinical Sciences, Clinical Sciences, Human Genome, Biotechnology, Infectious Diseases, Genetics, Infection, Adult, Bacteria, Bacterial Load, Biota, High-Throughput Nucleotide Sequencing, Humans, Microarray Analysis, Military Personnel, Wound Healing, Wound Infection, Young Adult, Biological Sciences, Agricultural and Veterinary Sciences, Medical and Health Sciences, Microbiology, Clinical sciences, Medical microbiology
- Abstract
Combat wound healing and resolution are highly affected by the resident microbial flora. We therefore sought to achieve comprehensive detection of microbial populations in wounds using novel genomic technologies and bioinformatics analyses. We employed a microarray capable of detecting all sequenced pathogens for interrogation of 124 wound samples from extremity injuries in combat-injured U.S. service members. A subset of samples was also processed via next-generation sequencing and metagenomic analysis. Array analysis detected microbial targets in 51% of all wound samples, with Acinetobacter baumannii being the most frequently detected species. Multiple Pseudomonas species were also detected in tissue biopsy specimens. Detection of the Acinetobacter plasmid pRAY correlated significantly with wound failure, while detection of enteric-associated bacteria was associated significantly with successful healing. Whole-genome sequencing revealed broad microbial biodiversity between samples. The total wound bioburden did not associate significantly with wound outcome, although temporal shifts were observed over the course of treatment. Given that standard microbiological methods do not detect the full range of microbes in each wound, these data emphasize the importance of supplementation with molecular techniques for thorough characterization of wound-associated microbes. Future application of genomic protocols for assessing microbial content could allow application of specialized care through early and rapid identification and management of critical patterns in wound bioburden.
- Published
- 2014
18. GENTANGLE: integrated computational design of gene entanglements.
- Author
-
Martí, Jose Manuel, Hsu, Chloe, Rochereau, Charlotte, Xu, Chenling, Blazejewski, Tomasz, Nisonoff, Hunter, Leonard, Sean P, Kang-Yun, Christina S, Chlebek, Jennifer, Ricci, Dante P, Park, Dan, Wang, Harris, Listgarten, Jennifer, Jiao, Yongqin, and Allen, Jonathan E
- Subjects
MICROBIAL genomes ,MICROBIAL genes ,SOURCE code ,INTEGRATED software ,TEST design
- Abstract
Summary: The design of two overlapping genes in a microbial genome is an emerging technique for adding more reliable control mechanisms in engineered organisms for increased stability. The design of functional overlapping gene pairs is a challenging procedure, and computational design tools are used to improve the efficiency to deploy successful designs in genetically engineered systems. GENTANGLE (Gene Tuples ArraNGed in overLapping Elements) is a high-performance containerized pipeline for the computational design of two overlapping genes translated in different reading frames of the genome. This new software package can be used to design and test gene entanglements for microbial engineering projects using arbitrary sets of user-specified gene pairs. Availability and implementation: The GENTANGLE source code and its submodules are freely available on GitHub at https://github.com/BiosecSFA/gentangle. The DATANGLE (DATA for genTANGLE) repository contains related data and results and is freely available on GitHub at https://github.com/BiosecSFA/datangle. The GENTANGLE container is freely available on Singularity Cloud Library at https://cloud.sylabs.io/library/khyox/gentangle/gentangle.sif. The GENTANGLE repository wiki (https://github.com/BiosecSFA/gentangle/wiki), website (https://biosecsfa.github.io/gentangle/), and user manual contain detailed instructions on how to use the different components of software and data, including examples and reproducing the results. The code is licensed under the GNU Affero General Public License version 3 (https://www.gnu.org/licenses/agpl.html). [ABSTRACT FROM AUTHOR]
- Published
- 2024
19. GENTANGLE: integrated computational design of gene entanglements
- Author
-
Martí, Jose Manuel, primary, Hsu, Chloe, additional, Rochereau, Charlotte, additional, Blazejewski, Tomasz, additional, Nisonoff, Hunter, additional, Leonard, Sean Patrick, additional, Kang-Yun, Christina Sora, additional, Chlebek, Jennifer L, additional, Ricci, Dante Paul, additional, Park, Dan Mcfarland, additional, Wang, Harris H., additional, Listgarten, Jennifer, additional, Jiao, Yongqin, additional, and Allen, Jonathan E, additional
- Published
- 2023
20. The Genome of the Basidiomycetous Yeast and Human Pathogen Cryptococcus neoformans
- Author
-
Loftus, Brendan J., Fung, Eula, Roncaglia, Paola, Rowley, Don, Amedeo, Paolo, Bruno, Dan, Vamathevan, Jessica, Miranda, Molly, Anderson, Iain J., Fraser, James A., Allen, Jonathan E., Bosdet, Ian E., Brent, Michael R., Chiu, Readman, Doering, Tamara L., Donlin, Maureen J., D'Souza, Cletus A., Fox, Deborah S., Grinberg, Viktoriya, Fu, Jianmin, Fukushima, Marilyn, Haas, Brian J., Huang, James C., Janbon, Guilhem, Koo, Hean L., Krzywinski, Martin I., Kwon-Chung, June K., Lengeler, Klaus B., Maiti, Rama, Marra, Marco A., Marra, Robert E., Mathewson, Carrie A., Mitchell, Thomas G., Pertea, Mihaela, Riggs, Florenta R., Salzberg, Steven L., Schein, Jacqueline E., Shvartsbeyn, Alla, Shin, Heesun, Shumway, Martin, Specht, Charles A., Suh, Bernard B., Tenney, Aaron, Utterback, Terry R., Wickes, Brian L., Wortman, Jennifer R., Wye, Natasja H., Kronstad, James W., Lodge, Jennifer K., Heitman, Joseph, Davis, Ronald W., Fraser, Claire M., and Hyman, Richard W.
- Published
- 2005
21. Clustering Protein Binding Pockets and Identifying Potential Drug Interactions: A Novel Ligand-Based Featurization Method
- Author
-
Stevenson, Garrett A., primary, Kirshner, Dan, additional, Bennion, Brian J., additional, Yang, Yue, additional, Zhang, Xiaohua, additional, Zemla, Adam, additional, Torres, Marisa W., additional, Epstein, Aidan, additional, Jones, Derek, additional, Kim, Hyojin, additional, Bennett, W. F. Drew, additional, Wong, Sergio E., additional, Allen, Jonathan E., additional, and Lightstone, Felice C., additional
- Published
- 2023
22. Enhancing Docking Accuracy with PECAN2, a 3D Atomic Neural Network Trained without Co-Complex Crystal Structures.
- Author
-
Shim, Heesung, Allen, Jonathan E., and Bennett, W. F. Drew
- Subjects
OPIOID receptors ,CRYSTAL structure ,MOLECULAR docking ,DIGITAL libraries ,POINT cloud ,DRUG development
- Abstract
Decades of drug development research have explored a vast chemical space for highly active compounds. The exponential growth of virtual libraries enables easy access to billions of synthesizable molecules. Computational modeling, particularly molecular docking, utilizes physics-based calculations to prioritize molecules for synthesis and testing. Nevertheless, the molecular docking process often yields docking poses with favorable scores that prove to be inaccurate with experimental testing. To address these issues, several approaches using machine learning (ML) have been proposed to filter incorrect poses based on the crystal structures. However, most of the methods are limited by the availability of structure data. Here, we propose a new pose classification approach, PECAN2 (Pose Classification with 3D Atomic Network 2), without the need for crystal structures, based on a 3D atomic neural network with Point Cloud Network (PCN). The new approach uses the correlation between docking scores and experimental data to assign labels, instead of relying on the crystal structures. We validate the proposed classifier on multiple datasets including human mu, delta, and kappa opioid receptors and SARS-CoV-2 Mpro. Our results demonstrate that leveraging the correlation between docking scores and experimental data alone enhances molecular docking performance by filtering out false positives and false negatives. [ABSTRACT FROM AUTHOR]
- Published
- 2024
23. Differential laboratory passaging of SARS-CoV-2 viral stocks impacts the in vitro assessment of neutralizing antibodies
- Author
-
Avila-Herrera, Aram, primary, Kimbrel, Jeffrey A., additional, Marti, Jose Manuel, additional, Thissen, James, additional, Saada, Edwin A., additional, Weisenberger, Tracy, additional, Arrildt, Kathryn T., additional, Segelke, Brent, additional, Allen, Jonathan E., additional, Zemla, Adam, additional, and Borucki, Monica K., additional
- Published
- 2023
24. A Computational Pipeline to Identify and Characterize Binding Sites and Interacting Chemotypes in SARS-CoV-2
- Author
-
Sandholtz, Sarah H., primary, Drocco, Jeffrey A., additional, Zemla, Adam T., additional, Torres, Marisa W., additional, Silva, Mary S., additional, and Allen, Jonathan E., additional
- Published
- 2023
25. Clustering Protein Binding Pockets and Identifying Potential Drug Interactions: A Novel Ligand-based Featurization Method
- Author
-
Stevenson, Garrett A, primary, Kirshner, Dan, additional, Bennion, Brian J, additional, Yang, Yue, additional, Zhang, Xiaohua, additional, Zemla, Adam, additional, Torres, Marisa W, additional, Epstein, Aidan, additional, Jones, Derek, additional, Kim, Hyojin, additional, Bennett, W.F. D., additional, Wong, Sergio, additional, Allen, Jonathan E., additional, and Lightstone, Felice C., additional
- Published
- 2023
26. Differential laboratory passaging of SARS-CoV-2 viral stocks impacts the in vitro assessment of neutralizing antibodies.
- Author
-
Avila-Herrera, Aram, Kimbrel, Jeffrey A., Martí, Jose Manuel, Thissen, James, Saada, Edwin A., Weisenberger, Tracy, Arrildt, Kathryn T., Segelke, Brent W., Allen, Jonathan E., Zemla, Adam, and Borucki, Monica K.
- Subjects
MONOCLONAL antibodies ,VIRAL antibodies ,IMMUNOGLOBULINS ,SARS-CoV-2 ,IMMUNE serums ,SARS-CoV-2 Omicron variant ,RECOMBINANT viruses
- Abstract
Viral populations in natural infections can have a high degree of sequence diversity, which can directly impact immune escape. However, antibody potency is often tested in vitro with relatively clonal viral populations, such as laboratory virus or pseudotyped virus stocks, which may not accurately represent the genetic diversity of circulating viral genotypes. This can affect the validity of viral phenotype assays, such as antibody neutralization assays. To address this issue, we tested whether recombinant virus carrying SARS-CoV-2 spike (VSV-SARS-CoV-2-S) stocks could be made more genetically diverse by passage, and if a stock passaged under selective pressure was more capable of escaping monoclonal antibody (mAb) neutralization than unpassaged stock or than viral stock passaged without selective pressures. We passaged VSV-SARS-CoV-2-S four times concurrently in three cell lines and then six times with or without polyclonal antiserum selection pressure. All three of the monoclonal antibodies tested neutralized the viral population present in the unpassaged stock. The viral inoculum derived from serial passage without antiserum selection pressure was neutralized by two of the three mAbs. However, the viral inoculum derived from serial passage under antiserum selection pressure escaped neutralization by all three mAbs. Deep sequencing revealed the rapid acquisition of multiple mutations associated with antibody escape in the VSV-SARS-CoV-2-S that had been passaged in the presence of antiserum, including key mutations present in currently circulating Omicron subvariants. These data indicate that viral stock that was generated under polyclonal antiserum selection pressure better reflects the natural environment of the circulating virus and may yield more biologically relevant outcomes in phenotypic assays. Thus, mAb assessment assays that utilize a more genetically diverse, biologically relevant, virus stock may yield data that are relevant for prediction of mAb efficacy and for enhancing biosurveillance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
27. Metagenomic Methods for Addressing NASA's Planetary Protection Policy Requirements on Future Missions: A Workshop Report
- Author
-
Green, Stefan J., primary, Torok, Tamas, additional, Allen, Jonathan E., additional, Eloe-Fadrosh, Emiley, additional, Jackson, Scott A., additional, Jiang, Sunny C., additional, Levine, Stuart S., additional, Levy, Shawn, additional, Schriml, Lynn M., additional, Thomas, W. Kelley, additional, Wood, Jason M., additional, and Tighe, Scott W., additional
- Published
- 2023
28. Metagenomic Analysis of the Airborne Environment in Urban Spaces
- Author
-
Be, Nicholas A., Thissen, James B., Fofanov, Viacheslav Y., Allen, Jonathan E., Rojas, Mark, Golovko, George, Fofanov, Yuriy, Koshinsky, Heather, and Jaing, Crystal J.
- Published
- 2015
29. PDBspheres: a method for finding 3D similarities in local regions in proteins
- Author
-
Zemla, Adam T, primary, Allen, Jonathan E, additional, Kirshner, Dan, additional, and Lightstone, Felice C, additional
- Published
- 2022
30. Model Choice Metrics to Optimize Profile-QSAR Performance
- Author
-
He, Stewart, primary, Kim, Sookyung, additional, McLoughlin, Kevin S., additional, Ranganathan, Hiranmayi, additional, Shi, Da, additional, and Allen, Jonathan E., additional
- Published
- 2022
31. Microbial Characteristics of ISS Environmental Surfaces
- Author
-
Venkateswaran, Kasthuri, Checinska, Aleksandra, Singh, Nitin, Mohan, Ganesh B. M, Urbaniak, Camilla, Blachowicz, Adriana, Fox, George E, Jaing, Crystal, Allen, Jonathan E, Frey, Kenneth, Smith, David, Mehta, Satish, Bergman, Nicholas, Karouia, Fathi, Wang, Clay, Keller, Nancy, Pierson, Duane L, and Perry, Jay
- Published
- 2017
32. Microbial Characteristics of ISS Environmental Surfaces
- Author
-
Perry, Jay, Pierson, Duane L, Keller, Nancy, Wang, Clay, Karouia, Fathi, Bergman, Nicholas, Mehta, Satish, Smith, David, Frey, Kenneth, Allen, Jonathan E, Jaing, Crystal, Fox, George E, Blachowicz, Adriana, Urbaniak, Camilla, Mohan, Ganesh B. M, Singh, Nitin, Checinska, Aleksandra, and Venkateswaran, Kasthuri
- Abstract
The microbiome of environmental surfaces from the International Space Station was characterized in order to examine the relationship to crew and hardware maintenance. The Microbial Observatory (ISS-MO) experiment generated a microbial census of ISS environments using advanced molecular microbial community analyses along with traditional culture-based methods. Since the “omics” methodologies generated an extensive microbial census, significant insights into spaceflight-induced changes in the populations of beneficial and/or potentially harmful microbes were gained. Surface samples were collected from several ISS surface locations from three flight opportunities, and were returned to Earth via the Soyuz TMA-14M or the Space X Dragon capsule. In addition to cultivation methods, viable microbial burden, iTag-based sequencing, and metagenome analyses were carried out. The cultivable microbial bioburden differed by location and sampling event. Exploring the ISS environmental microbiome revealed the presence of opportunistic pathogens and antibiotic resistant microbes. Genes involved in ATP binding cassette transporters, two component systems, and beta-lactam resistance were among a diverse set of metabolic and genetic information processing pathways. Whole genome sequencing (WGS) of 50 ISS strains exhibiting resistance to various antibiotics was carried out. The antibiotic resistant genes deduced from the WGS were compared with the resistomes generated directly from the gene pool of the environmental samples. Two unique Aspergillus fumigatus strains isolated from the ISS were characterized and compared to the experimentally established clinical isolates Af293 and CEA10. A virulence assessment in a neutrophil-deficient larval zebrafish model of invasive aspergillosis indicated that both ISSFT-021 and IF1SW-F4 were significantly more lethal compared to Af293 and CEA10. The findings from this Environmental “Omics” project should be exploited to enhance human health and well-being of a closed system. In other words, the ISS-MO research aims to "translate" findings in fundamental research into medical practice (pathogen detection) and meaningful health outcomes (countermeasure development).
- Published
- 2017
33. Draft Genome of the Filarial Nematode Parasite Brugia malayi
- Author
-
Ghedin, Elodie, Wang, Shiliang, Spiro, David, Caler, Elisabet, Zhao, Qi, Crabtree, Jonathan, Allen, Jonathan E., Delcher, Arthur L., Guiliano, David B., Miranda-Saavedra, Diego, Angiuoli, Samuel V., Creasy, Todd, Amedeo, Paolo, Haas, Brian, El-Sayed, Najib M., Wortman, Jennifer R., Feldblyum, Tamara, Tallon, Luke, Schatz, Michael, Shumway, Martin, Koo, Hean, Salzberg, Steven L., Schobel, Seth, Pertea, Mihaela, Pop, Mihai, White, Owen, Barton, Geoffrey J., Carlow, Clotilde K. S., Crawford, Michael J., Daub, Jennifer, Dimmic, Matthew W., Estes, Chris F., Foster, Jeremy M., Ganatra, Mehul, Gregory, William F., Johnson, Nicholas M., Jin, Jinming, Komuniecki, Richard, Korf, Ian, Kumar, Sanjay, Laney, Sandra, Li, Ben-Wen, Li, Wen, Lindblom, Tim H., Lustigman, Sara, Ma, Dong, Maina, Claude V., Martin, David M. A., McCarter, James P., McReynolds, Larry, Mitreva, Makedonka, Nutman, Thomas B., Parkinson, John, Peregrín-Alvarez, José M., Poole, Catherine, Ren, Qinghu, Saunders, Lori, Sluder, Ann E., Smith, Katherine, Stanke, Mario, Unnasch, Thomas R., Ware, Jenna, Wei, Aguan D., Weil, Gary, Williams, Deryck J., Zhang, Yinhua, Williams, Steven A., Fraser-Liggett, Claire, Slatko, Barton, Blaxter, Mark L., and Scott, Alan L.
- Published
- 2007
34. A Computational Pipeline to Identify and Characterize Binding Sites and Interacting Chemotypes in SARS-CoV-2
- Author
-
Sandholtz, Sarah H., primary, Drocco, Jeffrey A., additional, Zemla, Adam T., additional, Torres, Marisa W., additional, Silva, Mary S., additional, and Allen, Jonathan E., additional
- Published
- 2022
35. MACAW: an accessible tool for molecular embedding and inverse molecular design
- Author
-
Blay, Vincent, primary, Radivojevic, Tijana, additional, Allen, Jonathan E., additional, Hudson, Corey M., additional, and Garcia-Martin, Hector, additional
- Published
- 2022
36. PDBspheres - a method for finding 3D similarities in local regions in proteins
- Author
-
Zemla, Adam T., primary, Allen, Jonathan E., additional, Kirshner, Dan, additional, and Lightstone, Felice C., additional
- Published
- 2022
37. High-throughput virtual screening of small molecule inhibitors for SARS-CoV-2 protein targets with deep fusion models
- Author
-
Stevenson, Garrett A., primary, Jones, Derek, additional, Kim, Hyojin, additional, Bennett, W. F. Drew, additional, Bennion, Brian J., additional, Borucki, Monica, additional, Bourguet, Feliza, additional, Epstein, Aidan, additional, Franco, Magdalena, additional, Harmon, Brooke, additional, He, Stewart, additional, Katz, Max P., additional, Kirshner, Daniel, additional, Lao, Victoria, additional, Lau, Edmond Y., additional, Lo, Jacky, additional, McLoughlin, Kevin, additional, Mosesso, Richard, additional, Murugesh, Deepa K., additional, Negrete, Oscar A., additional, Saada, Edwin A., additional, Segelke, Brent, additional, Stefan, Maxwell, additional, Torres, Marisa W., additional, Weilhammer, Dina, additional, Wong, Sergio, additional, Yang, Yue, additional, Zemla, Adam, additional, Zhang, Xiaohua, additional, Zhu, Fangqiang, additional, Lightstone, Felice C., additional, and Allen, Jonathan E., additional
- Published
- 2021
38. Discovery of Small-Molecule Inhibitors of SARS-CoV-2 Proteins Using a Computational and Experimental Pipeline
- Author
-
Lau, Edmond Y., primary, Negrete, Oscar A., additional, Bennett, W. F. Drew, additional, Bennion, Brian J., additional, Borucki, Monica, additional, Bourguet, Feliza, additional, Epstein, Aidan, additional, Franco, Magdalena, additional, Harmon, Brooke, additional, He, Stewart, additional, Jones, Derek, additional, Kim, Hyojin, additional, Kirshner, Daniel, additional, Lao, Victoria, additional, Lo, Jacky, additional, McLoughlin, Kevin, additional, Mosesso, Richard, additional, Murugesh, Deepa K., additional, Saada, Edwin A., additional, Segelke, Brent, additional, Stefan, Maxwell A., additional, Stevenson, Garrett A., additional, Torres, Marisa W., additional, Weilhammer, Dina R., additional, Wong, Sergio, additional, Yang, Yue, additional, Zemla, Adam, additional, Zhang, Xiaohua, additional, Zhu, Fangqiang, additional, Allen, Jonathan E., additional, and Lightstone, Felice C., additional
- Published
- 2021
39. Enabling rapid COVID-19 small molecule drug design through scalable deep learning of generative models
- Author
-
Jacobs, Sam Ade, primary, Moon, Tim, additional, McLoughlin, Kevin, additional, Jones, Derek, additional, Hysom, David, additional, Ahn, Dong H, additional, Gyllenhaal, John, additional, Watson, Pythagoras, additional, Lightstone, Felice C, additional, Allen, Jonathan E, additional, Karlin, Ian, additional, and Van Essen, Brian, additional
- Published
- 2021
40. Improved Protein–Ligand Binding Affinity Prediction with Structure-Based Deep Fusion Inference
- Author
-
Jones, Derek, primary, Kim, Hyojin, additional, Zhang, Xiaohua, additional, Zemla, Adam, additional, Stevenson, Garrett, additional, Bennett, W. F. Drew, additional, Kirshner, Daniel, additional, Wong, Sergio E., additional, Lightstone, Felice C., additional, and Allen, Jonathan E., additional
- Published
- 2021
41. Scalable metagenomic taxonomy classification using a reference genome database
- Author
-
Ames, Sasha K., Hysom, David A., Gardner, Shea N., Lloyd, Scott G., Gokhale, Maya B., and Allen, Jonathan E.
- Published
- 2013
42. Genome sequence and comparative analysis of the model rodent malaria parasite Plasmodium yoelii yoelii
- Author
-
Carlton, Jane M., Angiuoli, Samuel V., Suh, Bernard B., Kooij, Taco W., Pertea, Mihaela, Silva, Joana C., Ermolaeva, Maria D., Allen, Jonathan E., Selengut, Jeremy D., Koo, Hean L., Peterson, Jeremy D., Pop, Mihai, Kosack, Daniel S., Shumway, Martin F., Bidwell, Shelby L., Shallom, Shamira J., van Aken, Susan E., Riedmuller, Steven B., Feldblyum, Tamara V., Cho, Jennifer K., Quackenbush, John, Sedegah, Martha, Shoaibi, Azadeh, Cummings, Leda M., Florens, Laurence, Yates, John R., Raine, J. Dale, Sinden, Robert E., Harris, Michael A., Cunningham, Deirdre A., Preiser, Peter R., Bergman, Lawrence W., Vaidya, Akhil B., van Lin, Leo H., Janse, Chris J., Waters, Andrew P., Smith, Hamilton O., White, Owen R., Salzberg, Steven L., Venter, J. Craig, Fraser, Claire M., Hoffman, Stephen L., Gardner, Malcolm J., and Carucci, Daniel J.
- Published
- 2002
43. Machine Learning Models to Predict Inhibition of the Bile Salt Export Pump
- Author
-
McLoughlin, Kevin S., primary, Jeong, Claire G., additional, Sweitzer, Thomas D., additional, Minnich, Amanda J., additional, Tse, Margaret J., additional, Bennion, Brian J., additional, Allen, Jonathan E., additional, Calad-Thomson, Stacie, additional, Rush, Thomas S., additional, and Brase, James M., additional
- Published
- 2021
44. Predicting Small Molecule Transfer Free Energies by Combining Molecular Dynamics Simulations and Deep Learning
- Author
-
Bennett, W. F. Drew, primary, He, Stewart, additional, Bilodeau, Camille L., additional, Jones, Derek, additional, Sun, Delin, additional, Kim, Hyojin, additional, Allen, Jonathan E., additional, Lightstone, Felice C., additional, and Ingólfsson, Helgi I., additional
- Published
- 2020
45. Binding Affinity Prediction by Pairwise Function Based on Neural Network
- Author
-
Zhu, Fangqiang, primary, Zhang, Xiaohua, additional, Allen, Jonathan E., additional, Jones, Derek, additional, and Lightstone, Felice C., additional
- Published
- 2020
46. AMPL: A Data-Driven Modeling Pipeline for Drug Discovery
- Author
-
Minnich, Amanda J., primary, McLoughlin, Kevin, additional, Tse, Margaret, additional, Deng, Jason, additional, Weber, Andrew, additional, Murad, Neha, additional, Madej, Benjamin D., additional, Ramsundar, Bharath, additional, Rush, Tom, additional, Calad-Thomson, Stacie, additional, Brase, Jim, additional, and Allen, Jonathan E., additional
- Published
- 2020
47. JIGSAW: integration of multiple sources of evidence for gene prediction
- Author
-
Allen, Jonathan E. and Salzberg, Steven L.
- Published
- 2005
48. Navigating rough waters: the VOIP opportunity for real estate
- Author
-
Allen, Jonathan E.
- Subjects
VoIP (Network protocol) -- Usage ,Real estate industry -- Technology application ,Voice-over-IP gateway ,Voice over IP ,Technology application ,Banking, finance and accounting industries ,Business ,Real estate industry
- Abstract
The advantages and challenges that Voice over Internet Protocol (VOIP) technology presents for the real estate sector are discussed.
- Published
- 2004
49. GovMath
- Author
-
Allen, Jonathan E., primary and Jaing, Crystal, additional
- Published
- 2019
50. Conserved amino acid markers from past influenza pandemic strains
- Author
-
Vitalis, Elizabeth A, Gardner, Shea N, Allen, Jonathan E, and Slezak, Tom R
- Subjects
Microbiology ,QR1-502
- Abstract
Background: Finding the amino acid mutations that affect the severity of influenza infections remains an open and challenging problem. Of special interest is better understanding how current circulating influenza strains could evolve into a new pandemic strain. Influenza proteomes from distinct viral phenotype classes were searched for class specific amino acid mutations conserved in past pandemics, using reverse engineered linear classifiers. Results: Thirty-four amino acid markers associated with host specificity and high mortality rate were found. Some markers had little impact on distinguishing the functional classes by themselves; however, in combination with other mutations they improved class prediction. Pairwise combinations of influenza genomes were checked for reassortment and mutation events needed to acquire the pandemic conserved markers. Evolutionary pathways involving H1N1 human and swine strains mixed with avian strains show the potential to acquire the pandemic markers with a double reassortment and one or two amino acid mutations. Conclusion: The small mutation combinations found at multiple protein positions associated with viral phenotype indicate that surveillance tools could monitor genetic variation beyond single point mutations to track influenza strains. Finding that certain strain combinations have the potential to acquire pandemic conserved markers through a limited number of reassortment and mutation events illustrates the potential for reassortment and mutation events to lead to new circulating influenza strains.
- Published
- 2009