1. Machine Learning Models and Pathway Genome Data Base for Trypanosoma cruzi Drug Discovery
- Author
- James H. McKerrow, Steven Chen, Danielle Kellar, Jair L. Siqueira-Neto, Michelle R. Arkin, Laura-Isobel McCall, Maneesh K. Yadav, E. Adam Kallel, Carolyn L. Talcott, Barry A. Bunin, Elizabeth L. Ponder, Sean Ekins, and Malabika Sarker
- Subjects
Chagas disease; lcsh:Arctic medicine. Tropical medicine; lcsh:RC955-962; Trypanosoma cruzi; Drug target; Computational biology; Bioinformatics; Genomic databases; Genome; Cell Line; Machine Learning; Mice; parasitic diseases; Drug Discovery; medicine; Animals; Humans; Chagas Disease; Mice, Inbred BALB C; biology; Drug discovery; Extramural; lcsh:Public aspects of medicine; Public Health, Environmental and Occupational Health; Tropical disease; Computational Biology; lcsh:RA1-1270; Bayes Theorem; biology.organism_classification; medicine.disease; Trypanocidal Agents; 3. Good health; High-Throughput Screening Assays; Disease Models, Animal; Infectious Diseases; Female; Genome, Protozoan; Metabolic Networks and Pathways; Research Article
- Abstract
Background: Chagas disease is a neglected tropical disease (NTD) caused by the eukaryotic parasite Trypanosoma cruzi. The current clinical and preclinical pipeline for T. cruzi is extremely sparse and lacks drug target diversity.
Methodology/Principal Findings: In the present study we developed a computational approach that used data from several public whole-cell, phenotypic high-throughput screens completed for T. cruzi by the Broad Institute, including a single screen of over 300,000 molecules in the search for chemical probes as part of the NIH Molecular Libraries program. We also compiled and curated relevant biological and chemical compound screening data, including (i) compounds and biological activity data from the literature, (ii) high-throughput screening datasets, and (iii) predicted metabolites of T. cruzi metabolic pathways. This information was used to help identify compounds and their potential targets. We constructed a Pathway Genome Data Base for T. cruzi. In addition, we developed Bayesian machine learning models that were used to virtually screen libraries of compounds. Ninety-seven compounds were selected for in vitro testing, and 11 of these were found to have EC50 < 10 μM. We progressed five compounds to an in vivo mouse efficacy model of Chagas disease and validated that the machine learning model could identify in vitro active compounds not in the training set, as well as known positive controls. The antimalarial pyronaridine showed 85.2% efficacy in the acute Chagas mouse model. We also proposed potential targets (for future verification) for this compound based on structural similarity to known compounds with targets in T. cruzi.
Conclusions/Significance: We have demonstrated how combining chemoinformatics and bioinformatics for T. cruzi drug discovery can bring to light interesting in vivo active molecules that may have been overlooked. The approach we have taken is broadly applicable to other NTDs.
Author Summary: Chagas disease is a neglected tropical disease (NTD) caused by the eukaryotic parasite Trypanosoma cruzi. The disease is endemic to Latin America but is increasingly found in North America and Europe, primarily through immigration, and its spread is bringing new attention to the need for novel, safe, and effective therapeutics to treat T. cruzi infection. We used data from a phenotypic screen to build Bayesian models that predict anti-parasitic activity against T. cruzi in vitro. These models were used to score various small libraries of molecules. We selected fewer than 100 compounds for testing and found in vitro actives, some of which were tested in an in vivo efficacy model. We identified the antimalarial pyronaridine as having in vivo efficacy, which provides a new starting point for further investigation and optimization.
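The virtual-screening step described in the abstract (train a Bayesian model on screening actives and inactives, then score a compound library) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which descriptors or software were used, so the sketch assumes a Laplace-smoothed Bernoulli naive Bayes classifier over hypothetical binary substructure fingerprints, and all compound names and training data are invented for illustration.

```python
import math

def train_bernoulli_nb(fingerprints, labels, alpha=1.0):
    """Fit a Bernoulli naive Bayes model for classes 0 (inactive) and 1 (active).

    Assumes both classes occur at least once in `labels`.
    """
    n_features = len(fingerprints[0])
    model = {}
    for cls in (0, 1):
        rows = [fp for fp, y in zip(fingerprints, labels) if y == cls]
        # Laplace-smoothed probability that each fingerprint bit is set in this class
        p_on = [(sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
                for j in range(n_features)]
        model[cls] = {"log_prior": math.log(len(rows) / len(labels)), "p_on": p_on}
    return model

def log_odds_active(model, fp):
    """Log-odds of active versus inactive for one fingerprint; higher is better."""
    def class_score(cls):
        params = model[cls]
        score = params["log_prior"]
        for bit, p in zip(fp, params["p_on"]):
            score += math.log(p if bit else 1.0 - p)
        return score
    return class_score(1) - class_score(0)

def rank_library(model, library):
    """Return compound names sorted from most to least likely active."""
    return sorted(library, key=lambda name: log_odds_active(model, library[name]),
                  reverse=True)

# Hypothetical toy training set: 3-bit fingerprints, label 1 = active in the screen
train_fps = [[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 1]]
train_labels = [1, 1, 0, 0]
model = train_bernoulli_nb(train_fps, train_labels)

# Hypothetical library to be virtually screened
library = {"cmpd_A": [1, 0, 0], "cmpd_B": [0, 0, 1], "cmpd_C": [1, 0, 1]}
ranking = rank_library(model, library)
print(ranking)
```

In a workflow like the one the abstract outlines, the top of such a ranking would be the shortlist sent for in vitro testing; the study reports that 11 of its 97 selected compounds (about 11%) had EC50 < 10 μM.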
- Published
- 2015