12 results on '"Margolin, Adam A."'
Search Results
2. Improving breast cancer survival analysis through competition-based multidimensional modeling.
- Author
-
Bilal E, Dutkowski J, Guinney J, Jang IS, Logsdon BA, Pandey G, Sauerwine BA, Shimoni Y, Moen Vollan HK, Mecham BH, Rueda OM, Tost J, Curtis C, Alvarez MJ, Kristensen VN, Aparicio S, Børresen-Dale AL, Caldas C, Califano A, Friend SH, Ideker T, Schadt EE, Stolovitzky GA, and Margolin AA
- Subjects
- Algorithms, Cluster Analysis, Databases, Factual, Female, Gene Expression Profiling, Humans, Prognosis, Breast Neoplasms, Computational Biology methods, Models, Biological, Models, Statistical, Survival Analysis
- Abstract
Breast cancer is the most common malignancy in women and is responsible for hundreds of thousands of deaths annually. As with most cancers, it is a heterogeneous disease and different breast cancer subtypes are treated differently. Understanding the difference in prognosis for breast cancer based on its molecular and phenotypic features is one avenue for improving treatment by matching the proper treatment with molecular subtypes of the disease. In this work, we employed a competition-based approach to modeling breast cancer prognosis using large datasets containing genomic and clinical information and an online real-time leaderboard program used to speed feedback to the modeling team and to encourage each modeler to work towards achieving a higher ranked submission. We find that machine learning methods combined with molecular features selected based on expert prior knowledge can improve survival predictions compared to current best-in-class methodologies and that ensemble models trained across multiple user submissions systematically outperform individual models within the ensemble. We also find that model scores are highly consistent across multiple independent evaluations. This study serves as the pilot phase of a much larger competition open to the whole research community, with the goal of understanding general strategies for model optimization using clinical and molecular profiling data and providing an objective, transparent system for assessing prognostic models.
- Published
- 2013
3. ARACNE: an algorithm for the reconstruction of gene regulatory networks in a mammalian cellular context.
- Author
-
Margolin AA, Nemenman I, Basso K, Wiggins C, Stolovitzky G, Dalla Favera R, and Califano A
- Subjects
- Algorithms, Animals, B-Lymphocytes metabolism, Computer Simulation, Gene Expression Profiling, Humans, Models, Statistical, Neural Networks, Computer, Oligonucleotide Array Sequence Analysis, Phenotype, Proto-Oncogene Mas, Reproducibility of Results, Software, Transcription, Genetic, Computational Biology methods, Gene Expression Regulation
- Abstract
Background: Elucidating gene regulatory networks is crucial for understanding normal cell physiology and complex pathologic phenotypes. Existing computational methods for the genome-wide "reverse engineering" of such networks have been successful only for lower eukaryotes with simple genomes. Here we present ARACNE, a novel algorithm, using microarray expression profiles, specifically designed to scale up to the complexity of regulatory networks in mammalian cells, yet general enough to address a wider range of network deconvolution problems. This method uses an information theoretic approach to eliminate the majority of indirect interactions inferred by co-expression methods. Results: We prove that ARACNE reconstructs the network exactly (asymptotically) if the effect of loops in the network topology is negligible, and we show that the algorithm works well in practice, even in the presence of numerous loops and complex topologies. We assess ARACNE's ability to reconstruct transcriptional regulatory networks using both a realistic synthetic dataset and a microarray dataset from human B cells. On synthetic datasets ARACNE achieves very low error rates and outperforms established methods, such as Relevance Networks and Bayesian Networks. Application to the deconvolution of genetic networks in human B cells demonstrates ARACNE's ability to infer validated transcriptional targets of the cMYC proto-oncogene. We also study the effects of misestimation of mutual information on network reconstruction, and show that algorithms based on mutual information ranking are more resilient to estimation errors. Conclusion: ARACNE shows promise in identifying direct transcriptional interactions in mammalian cellular networks, a problem that has challenged existing reverse engineering algorithms. This approach should enhance our ability to use microarray data to elucidate functional mechanisms that underlie cellular processes and to identify molecular targets of pharmacological compounds in mammalian cellular networks.
- Published
- 2006
4. Reverse engineering cellular networks.
- Author
-
Margolin AA, Wang K, Lim WK, Kustagi M, Nemenman I, and Califano A
- Subjects
- Algorithms, B-Lymphocytes metabolism, Gene Expression Regulation, Humans, Proto-Oncogene Proteins c-myc genetics, Proto-Oncogene Proteins c-myc metabolism, Software, Transcription, Genetic, Computational Biology methods, Gene Expression Profiling methods, Oligonucleotide Array Sequence Analysis methods
- Abstract
We describe a computational protocol for the ARACNE algorithm, an information-theoretic method for identifying transcriptional interactions between gene products using microarray expression profile data. Similar to other algorithms, ARACNE predicts potential functional associations among genes, or novel functions for uncharacterized genes, by identifying statistical dependencies between gene products. However, based on biochemical validation, literature searches and DNA binding site enrichment analysis, ARACNE has also proven effective in identifying bona fide transcriptional targets, even in complex mammalian networks. Thus we envision that predictions made by ARACNE, especially when supplemented with prior knowledge or additional data sources, can provide appropriate hypotheses for the further investigation of cellular networks. While the examples in this protocol use only gene expression profile data, the algorithm's theoretical basis readily extends to a variety of other high-throughput measurements, such as pathway-specific or genome-wide proteomics, microRNA and metabolomics data. As these data become readily available, we expect that ARACNE might prove increasingly useful in elucidating the underlying interaction models. For a microarray data set containing approximately 10,000 probes, reconstructing the network around a single probe completes in several minutes using a desktop computer with a Pentium 4 processor. Reconstructing a genome-wide network generally requires a computational cluster, especially if the recommended bootstrapping procedure is used.
- Published
- 2006
5. Boolean calculations made easy (for ribozymes).
- Author
-
Margolin AA and Stojanovic MN
- Subjects
- Algorithms, Amino Acid Motifs, Animals, Automation, Computer Simulation, Humans, Mathematical Computing, Mice, Oligonucleotides chemistry, Programming Languages, Software, Computational Biology methods, RNA, Catalytic chemistry
- Published
- 2005
6. The NIH BD2K center for big data in translational genomics
- Author
-
Paten, Benedict, Diekhans, Mark, Druker, Brian J, Friend, Stephen, Guinney, Justin, Gassner, Nadine, Guttman, Mitchell, Kent, W James, Mantey, Patrick, Margolin, Adam A, Massie, Matt, Novak, Adam M, Nothaft, Frank, Pachter, Lior, Patterson, David, Smuga-Otto, Maciej, Stuart, Joshua M, Veer, Laura Van’t, Wold, Barbara, and Haussler, David
- Subjects
- Distributed Computing and Systems Software, Information and Computing Sciences, Human Genome, Networking and Information Technology R&D (NITRD), Genetics, Biotechnology, Generic health relevance, Good Health and Well Being, Computational Biology, Datasets as Topic, Genomics, Humans, Knowledge Bases, National Institutes of Health (U.S.), Translational Research, Biomedical, United States, computational genomics, genomics, big data, APIs, genome informatics, Engineering, Medical and Health Sciences, Medical Informatics, Biomedical and clinical sciences, Health sciences, Information and computing sciences
- Abstract
The world's genomics data will never be stored in a single repository - rather, it will be distributed among many sites in many countries. No one site will have enough data to explain genotype to phenotype relationships in rare diseases; therefore, sites must share data. To accomplish this, the genetics community must forge common standards and protocols to make sharing and computing data among many sites a seamless activity. Through the Global Alliance for Genomics and Health, we are pioneering the development of shared application programming interfaces (APIs) to connect the world's genome repositories. In parallel, we are developing an open source software stack (ADAM) that uses these APIs. This combination will create a cohesive genome informatics ecosystem. Using containers, we are facilitating the deployment of this software in a diverse array of environments. Through benchmarking efforts and big data driver projects, we are ensuring ADAM's performance and utility.
- Published
- 2015
7. A harmonized meta-knowledgebase of clinical interpretations of cancer genomic variants
- Author
-
Wagner, Alex H, Walsh, Brian, Mayfield, Georgia, Tamborero, David, Sonkin, Dmitriy, Krysiak, Kilannin, Pons, Jordi Deu, Duren, Ryan P, Gao, Jianjiong, McMurry, Julie, Patterson, Sara, Del Vecchio Fitz, Catherine, Sezerman, Ozman U, Warner, Jeremy L, Rieke, Damian T, Aittokallio, Tero, Cerami, Ethan, Ritter, Deborah, Schriml, Lynn M, Freimuth, Robert R, Haendel, Melissa, Raca, Gordana, Madhavan, Subha, Baudis, Michael, Beckmann, Jacques S, Dienstmann, Rodrigo, Chakravarty, Debyani, Li, Xuan Shirley, Mockus, Susan, Elemento, Olivier, Schultz, Nikolaus, Lopez-Bigas, Nuria, Lawler, Mark, Goecks, Jeremy, Griffith, Malachi, Griffith, Obi L, and Margolin, Adam A
- Subjects
- Structure (mathematical logic), 0303 health sciences, Matching (statistics), Interpretation (philosophy), Cancer, Genomics, Computational biology, Biology, medicine.disease, 3. Good health, 03 medical and health sciences, 0302 clinical medicine, Precision oncology, 030220 oncology & carcinogenesis, medicine, Relevance (information retrieval), 030304 developmental biology
- Abstract
Precision oncology relies on the accurate discovery and interpretation of genomic variants to enable individualized diagnosis, prognosis, and therapy selection. We found that knowledgebases containing clinical interpretations of somatic cancer variants are highly disparate in interpretation content, structure, and supporting primary literature, impeding consensus when evaluating variants and their relevance in a clinical setting. With the cooperation of experts of the Global Alliance for Genomics and Health (GA4GH) and six prominent cancer variant knowledgebases, we developed a framework for aggregating and harmonizing variant interpretations to produce a meta-knowledgebase of 12,856 aggregate interpretations covering 3,437 unique variants in 415 genes, 357 diseases, and 791 drugs. We demonstrated large gains in overlap between resources across variants, diseases, and drugs as a result of this harmonization. We subsequently demonstrated improved matching between a patient cohort and harmonized interpretations of potential clinical significance, observing an increase from an average of 33% per individual knowledgebase to 56% in aggregate. Our analyses illuminate the need for open, interoperable sharing of variant interpretation data. We also provide an open and freely available web interface (search.cancervariants.org) for exploring the harmonized interpretations from these six knowledgebases.
- Published
- 2018
8. Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin
- Author
-
Hoadley, Katherine A., Yau, Christina, Wolf, Denise M., Cherniack, Andrew D., Tamborero, David, Sam, Ng, Leiserson, Max D. M., Niu, Beifang, Mclellan, Michael D., Uzunangelov, Vladislav, Zhang, Jiashan, Kandoth, Cyriac, Akbani, Rehan, Shen, Hui, Omberg, Larsson, Chu, Andy, Margolin, Adam A., Van'T Veer, Laura J., Lopez Bigas, Nuria, Laird, Peter W., Raphael, Benjamin J., Ding, Li, Robertson, A. Gordon, Byers, Lauren A., Mills, Gordon B., Weinstein, John N., Van Waes, Carter, Chen, Zhong, Collisson, Eric A., Benz, Christopher C, Perou, Charles M., Stuart, Joshua M., Rachel, Abbott, Scott, Abbott, Arman Aksoy, B., Kenneth, Aldape, Adrian, Ally, Samirkumar Amin, Dimitris, Anastassiou, Todd Auman, J., Baggerly, Keith A., Miruna, Balasundaram, Saianand, Balu, Baylin, Stephen B., Benz, Stephen C., Berman, Benjamin P., Brady, Bernard, Bhatt, Ami S., Inanc, Birol, Black, Aaron D., Tom, Bodenheimer, Bootwalla, Moiz S., Jay, Bowen, Ryan, Bressler, Bristow, Christopher A., Brooks, Angela N., Bradley, Broom, Elizabeth, Buda, Robert, Burton, Butterfield, Yaron S. N., Daniel, Carlin, Carter, Scott L., Casasent, Tod D., Kyle, Chang, Stephen, Chanock, Lynda, Chin, Dong Yeon Cho, Juok, Cho, Eric, Chuah, Chun, Hye Jung E., Kristian, Cibulskis, Giovanni, Ciriello, James Cleland, Melissa, Cline, Brian, Craft, Creighton, Chad J., Ludmila, Danilova, Tanja, Davidsen, Caleb, Davis, Dees, Nathan D., Kim, Delehaunty, Demchok, John A., Noreen, Dhalla, Daniel, Dicara, Huyen, Dinh, Dobson, Jason R., Deepti, Dodda, Harshavardhan, Doddapaneni, Lawrence, Donehower, Dooling, David J., Gideon, Dresdner, Jennifer, Drummond, Andrea, Eakin, Mary, Edgerton, Eldred, Jim M., Greg, Eley, Kyle, Ellrott, Cheng, Fan, Suzanne, Fei, Ina, Felau, Scott, Frazer, Freeman, Samuel S., Jessica, Frick, Fronick, Catrina C., Fulton, Lucinda L., Robert, Fulton, Gabriel, Stacey B., Jianjiong, Gao, Gastier Foster, Julie M., Nils, Gehlenborg, Myra, George, Gad, Getz, Richard, Gibbs, Mary, Goldman, Abel Gonzalez Perez, Benjamin, Gross, Ranabir, Guin, Preethi, Gunaratne, Angela, Hadjipanayis, Hamilton, Mark P., Hamilton, Stanley R., Leng, Han, Han, Yi, Harper, Hollie A., Psalm, Haseley, David, Haussler, Neil Hayes, D., Heiman, David I., Elena, Helman, Carmen, Helsel, Herbrich, Shelley M., Herman, James G., Toshinori, Hinoue, Carrie, Hirst, Martin, Hirst, Holt, Robert A., Hoyle, Alan P., Lisa, Iype, Anders, Jacobsen, Jeffreys, Stuart R., Jensen, Mark A., Jones, Corbin D., Jones, Steven J. M., Zhenlin, Ju, Joonil, Jung, Andre, Kahles, Ari, Kahn, Joelle Kalicki Veizer, Divya, Kalra, Krishna Latha Kanchi, Kane, David W., Hoon, Kim, Jaegil, Kim, Theo, Knijnenburg, Koboldt, Daniel C., Christie, Kovar, Roger, Kramer, Richard, Kreisberg, Raju, Kucherlapati, Marc, Ladanyi, Lander, Eric S., Larson, David E., Lawrence, Michael S., Darlene, Lee, Eunjung, Lee, Semin, Lee, William, Lee, Kjong Van Lehmann, Kalle, Leinonen, Leraas, Kristen M., Seth, Lerner, Levine, Douglas A., Lora, Lewis, Ley, Timothy J., Haiyan I., Li, Jun, Li, Wei, Li, Han, Liang, Lichtenberg, Tara M., Jake, Lin, Ling, Lin, Pei, Lin, Wenbin Liu, Yingchun, Liu, Yuexin, Liu, Lorenzi, Philip L., Charles, Lu, Yiling, Lu, Luquette, Lovelace J., Singer, Ma, Magrini, Vincent J., Mahadeshwar, Harshad S., Mardis, Elaine R., Adam, Margolin, Marra, Marco A., Michael, Mayo, Cynthia, Mcallister, Mcguire, Sean E., Mcmichael, Joshua F., James, Melott, Shaowu, Meng, Matthew, Meyerson, Mieczkowski, Piotr A., Miller, Christopher A., Miller, Martin L., Michael, Miller, Moore, Richard A., Margaret, Morgan, Donna, Morton, Mose, Lisle E., Mungall, Andrew J., Donna, Muzny, Lam, Nguyen, Noble, Michael S., Houtan, Noushmehr, Michelle, O’Laughlin, Ojesina, Akinyemi I., Tai Hsien Ou Yang, Brad, Ozenberger, Angeliki, Pantazi, Michael, Parfenov, Park, Peter J., Parker, Joel S., Evan, Paull, Chandra Sekhar Pedamallu, Todd, Pihl, Craig, Pohl, David, Pot, Alexei, Protopopov, Teresa, Przytycka, Amie Radenbaugh, Ramirez, Nilsa C., Ricardo, Ramirez, Gunnar Rätsch, Jeffrey, Reid, Xiaojia Ren, Boris, Reva, Reynolds, Sheila M., Rhie, Suhn K., Jeffrey, Roach, Hector, Rovira, Michael, Ryan, Gordon, Saksena, Sofie, Salama, Chris, Sander, Netty, Santoso, Schein, Jacqueline E., Heather, Schmidt, Nikolaus, Schultz, Schumacher, Steven E., Jonathan, Seidman, Yasin, Senbabaoglu, Sahil, Seth, Samantha Sharpe, Ronglai, Shen, Margi, Sheth, Yan, Shi, Ilya, Shmulevich, Silva, Grace O., Simons, Janae V., Rileen, Sinha, Payal, Sipahimalani, Smith, Scott M., Sofia, Heidi J., Artem, Sokolov, Soloway, Mathew G., Xingzhi, Song, Carrie Sougnez, Paul, Spellman, Louis, Staudt, Chip, Stewart, Petar, Stojanov, Xiaoping, Su, Onur Sumer, S., Yichao, Sun, Teresa, Swatloski, Barbara, Tabak, Angela, Tam, Donghui, Tan, Jiabin, Tang, Roy, Tarnuzzer, Taylor, Barry S., Nina, Thiessen, Vesteinn Thorsson, Timothy Triche, Jr., Van Den Berg, David J., Vandin, Fabio, Varhol, Richard J., Vaske, Charles J., Umadevi, Veluvolu, Roeland, Verhaak, Doug, Voet, Jason, Walker, Wallis, John W., Peter, Waltman, Yunhu, Wan, Min, Wang, Wenyi, Wang, Zhining, Wang, Scot, Waring, Nils, Weinhold, Weisenberger, Daniel J., Wendl, Michael C., David, Wheeler, Wilkerson, Matthew D., Wilson, Richard K., Lisa, Wise, Andrew, Wong, Chang Jiun Wu, Chia Chin Wu, Hsin Ta Wu, Junyuan, Wu, Todd, Wylie, Liu, Xi, Ruibin, Xi, Zheng, Xia, Andrew W., Xu, Yang, Da, Liming, Yang, Lixing, Yang, Yang, Yang, Jun, Yao, Rong, Yao, Kai, Ye, Kosuke Yoshihara, Yuan, Yuan, Yung, Alfred K., Travis, Zack, Dong, Zeng, Jean Claude Zenklusen, Hailei, Zhang, Jianhua, Zhang, Nianxiang, Zhang, Qunyuan, Zhang, Wei, Zhang, Wei, Zhao, Siyuan, Zheng, Jing, Zhu, Erik, Zmuda, and Lihua, Zou
- Subjects
- Genetics and Molecular Biology (all), Cluster Analysis, Humans, Neoplasms, Transcriptome, Biochemistry, Genetics and Molecular Biology (all), Extramural, Biochemistry, Genetics and Molecular Biology (all), Cancer, Computational biology, Disease, Biology, medicine.disease, Bioinformatics, Biochemistry, General Biochemistry, Genetics and Molecular Biology, Article, 3. Good health, Molecular classification, TP63, CLUSTERS (ANÁLISE), medicine, Head and neck, Gene
- Abstract
Summary: Recent genomic analyses of pathologically defined tumor types identify "within-a-tissue" disease subtypes. However, the extent to which genomic signatures are shared across tissues is still unclear. We performed an integrative analysis using five genome-wide platforms and one proteomic platform on 3,527 specimens from 12 cancer types, revealing a unified classification into 11 major subtypes. Five subtypes were nearly identical to their tissue-of-origin counterparts, but several distinct cancer types were found to converge into common subtypes. Lung squamous, head and neck, and a subset of bladder cancers coalesced into one subtype typified by TP53 alterations, TP63 amplifications, and high expression of immune and proliferation pathway genes. Of note, bladder cancers split into three pan-cancer subtypes. The multiplatform classification, while correlated with tissue-of-origin, provides independent information for predicting clinical outcomes. All data sets are available for data-mining from a unified resource to support further biological discoveries and insights into novel therapeutic strategies.
- Published
- 2014
9. Simulation Studies as Designed Experiments: The Comparison of Penalized Regression Models in the “Large p, Small n” Setting.
- Author
-
Chaibub Neto, Elias, Bare, J. Christopher, and Margolin, Adam A.
- Subjects
- SIMULATION methods & models, REGRESSION analysis, COMPUTATIONAL biology, ALGORITHMS, SIGNAL-to-noise ratio, GENE expression
- Abstract
New algorithms are continuously proposed in computational biology. Performance evaluation of novel methods is important in practice. Nonetheless, the field experiences a lack of rigorous methodology aimed to systematically and objectively evaluate competing approaches. Simulation studies are frequently used to show that a particular method outperforms another. Often times, however, simulation studies are not well designed, and it is hard to characterize the particular conditions under which different methods perform better. In this paper we propose the adoption of well established techniques in the design of computer and physical experiments for developing effective simulation studies. By following best practices in planning of experiments we are better able to understand the strengths and weaknesses of competing algorithms leading to more informed decisions about which method to use for a particular task. We illustrate the application of our proposed simulation framework with a detailed comparison of the ridge-regression, lasso and elastic-net algorithms in a large scale study investigating the effects on predictive performance of sample size, number of features, true model sparsity, signal-to-noise ratio, and feature correlation, in situations where the number of covariates is usually much larger than sample size. Analysis of data sets containing tens of thousands of features but only a few hundred samples is nowadays routine in computational biology, where “omics” features such as gene expression, copy number variation and sequence data are frequently used in the predictive modeling of complex phenotypes such as anticancer drug response. The penalized regression approaches investigated in this study are popular choices in this setting and our simulations corroborate well established results concerning the conditions under which each one of these methods is expected to perform best while providing several novel insights. [ABSTRACT FROM AUTHOR]
- Published
- 2014
10. Empirical Bayes Analysis of Quantitative Proteomics Experiments
- Author
-
Margolin, Adam A., Ong, Shao-En, Schenone, Monica, Gould, Robert, Carr, Steven A., Schreiber, Stuart L., and Golub, Todd R.
- Subjects
- computational biology, biochemistry, drug discovery, chemical biology, protein chemistry and proteomics, genetics and genomics, bioinformatics, mathematics, algorithms, statistics, molecular biology, translational regulation, pharmacology, drug development
- Abstract
Background: Advances in mass spectrometry-based proteomics have enabled the incorporation of proteomic data into systems approaches to biology. However, development of analytical methods has lagged behind. Here we describe an empirical Bayes framework for quantitative proteomics data analysis. The method provides a statistical description of each experiment, including the number of proteins that differ in abundance between 2 samples, the experiment's statistical power to detect them, and the false-positive probability of each protein. Methodology/Principal Findings: We analyzed 2 types of mass spectrometric experiments. First, we showed that the method identified the protein targets of small-molecules in affinity purification experiments with high precision. Second, we re-analyzed a mass spectrometric data set designed to identify proteins regulated by microRNAs. Our results were supported by sequence analysis of the 3′ UTR regions of predicted target genes, and we found that the previously reported conclusion that a large fraction of the proteome is regulated by microRNAs was not supported by our statistical analysis of the data. Conclusions/Significance: Our results highlight the importance of rigorous statistical analysis of proteomic data, and the method described here provides a statistical framework to robustly and reliably interpret such data.
Chemistry and Chemical Biology
- Published
- 2009
11. CANCER PANOMICS: COMPUTATIONAL METHODS AND INFRASTRUCTURE FOR INTEGRATIVE ANALYSIS OF CANCER HIGH-THROUGHPUT "OMICS" DATA.
- Author
-
BRUNAK, SØREN, DE LA VEGA, FRANCISCO M., MARGOLIN, ADAM, RAPHAEL, BENJAMIN J., RÄTSCH, GUNNAR, and STUART, JOSHUA M.
- Subjects
- ONCOLOGY, CANCER treatment, HIGH throughput screening (Drug development), COMPUTATIONAL biology, BIOINFORMATICS
- Published
- 2014
12. Drug susceptibility prediction against a panel of drugs using kernelized Bayesian multitask learning.
- Author
-
Gönen, Mehmet and Margolin, Adam A.
- Subjects
- *HIV, *COMPUTATIONAL biology, *PHARMACOGENOMICS, *HIGH throughput screening (Drug development), *BAYESIAN analysis
- Abstract
Motivation: Human immunodeficiency virus (HIV) and cancer require personalized therapies owing to their inherent heterogeneous nature. For both diseases, large-scale pharmacogenomic screens of molecularly characterized samples have been generated with the hope of identifying genetic predictors of drug susceptibility. Thus, computational algorithms capable of inferring robust predictors of drug responses from genomic information are of great practical importance. Most of the existing computational studies that consider drug susceptibility prediction against a panel of drugs formulate a separate learning problem for each drug, which cannot make use of commonalities between subsets of drugs. Results: In this study, we propose to solve the problem of drug susceptibility prediction against a panel of drugs in a multitask learning framework by formulating a novel Bayesian algorithm that combines kernel-based non-linear dimensionality reduction and binary classification (or regression). The main novelty of our method is the joint Bayesian formulation of projecting data points into a shared subspace and learning predictive models for all drugs in this subspace, which helps us to eliminate off-target effects and drug-specific experimental noise. Another novelty of our method is the ability of handling missing phenotype values owing to experimental conditions and quality control reasons. We demonstrate the performance of our algorithm via cross-validation experiments on two benchmark drug susceptibility datasets of HIV and cancer. Our method obtains statistically significantly better predictive performance on most of the drugs compared with baseline single-task algorithms that learn drug-specific models. These results show that predicting drug susceptibility against a panel of drugs simultaneously within a multitask learning framework improves overall predictive performance over single-task learning approaches. Availability and implementation: Our Matlab implementations for binary classification and regression are available at https://github.com/mehmetgonen/kbmtl. Contact: mehmet.gonen@sagebase.org. Supplementary Information: Supplementary data are available at Bioinformatics online. [ABSTRACT FROM AUTHOR]
- Published
- 2014