852 results on '"Brenner, Steven E."'
Search Results
2. Assessing predictions on fitness effects of missense variants in HMBS in CAGI6
- Author
-
Zhang, Jing, Kinch, Lisa, Katsonis, Panagiotis, Lichtarge, Olivier, Jagota, Milind, Song, Yun S., Sun, Yuanfei, Shen, Yang, Kuru, Nurdan, Dereli, Onur, Adebali, Ogun, Alladin, Muttaqi Ahmad, Pal, Debnath, Capriotti, Emidio, Turina, Maria Paola, Savojardo, Castrense, Martelli, Pier Luigi, Babbi, Giulia, Casadio, Rita, Pucci, Fabrizio, Rooman, Marianne, Cia, Gabriel, Tsishyn, Matsvei, Strokach, Alexey, Hu, Zhiqiang, van Loggerenberg, Warren, Roth, Frederick P., Radivojac, Predrag, Brenner, Steven E., Cong, Qian, and Grishin, Nick V.
- Published
- 2024
- Full Text
- View/download PDF
3. Variant Impact Predictor database (VIPdb), version 2: trends from three decades of genetic variant impact predictors
- Author
-
Lin, Yu-Jen, Menon, Arul S., Hu, Zhiqiang, and Brenner, Steven E.
- Published
- 2024
- Full Text
- View/download PDF
4. Critical assessment of variant prioritization methods for rare disease diagnosis within the rare genomes project
- Author
-
Stenton, Sarah L., O’Leary, Melanie C., Lemire, Gabrielle, VanNoy, Grace E., DiTroia, Stephanie, Ganesh, Vijay S., Groopman, Emily, O’Heir, Emily, Mangilog, Brian, Osei-Owusu, Ikeoluwa, Pais, Lynn S., Serrano, Jillian, Singer-Berk, Moriel, Weisburd, Ben, Wilson, Michael W., Austin-Tse, Christina, Abdelhakim, Marwa, Althagafi, Azza, Babbi, Giulia, Bellazzi, Riccardo, Bovo, Samuele, Carta, Maria Giulia, Casadio, Rita, Coenen, Pieter-Jan, De Paoli, Federica, Floris, Matteo, Gajapathy, Manavalan, Hoehndorf, Robert, Jacobsen, Julius O. B., Joseph, Thomas, Kamandula, Akash, Katsonis, Panagiotis, Kint, Cyrielle, Lichtarge, Olivier, Limongelli, Ivan, Lu, Yulan, Magni, Paolo, Mamidi, Tarun Karthik Kumar, Martelli, Pier Luigi, Mulargia, Marta, Nicora, Giovanna, Nykamp, Keith, Pejaver, Vikas, Peng, Yisu, Pham, Thi Hong Cam, Podda, Maurizio S., Rao, Aditya, Rizzo, Ettore, Saipradeep, Vangala G., Savojardo, Castrense, Schols, Peter, Shen, Yang, Sivadasan, Naveen, Smedley, Damian, Soru, Dorian, Srinivasan, Rajgopal, Sun, Yuanfei, Sunderam, Uma, Tan, Wuwei, Tiwari, Naina, Wang, Xiao, Wang, Yaqiong, Williams, Amanda, Worthey, Elizabeth A., Yin, Rujie, You, Yuning, Zeiberg, Daniel, Zucca, Susanna, Bakolitsa, Constantina, Brenner, Steven E., Fullerton, Stephanie M., Radivojac, Predrag, Rehm, Heidi L., and O’Donnell-Luria, Anne
- Published
- 2024
- Full Text
- View/download PDF
5. CAGI, the Critical Assessment of Genome Interpretation, establishes progress and prospects for computational genetic variant interpretation methods
- Author
-
Jain, Shantanu, Bakolitsa, Constantina, Brenner, Steven E, Radivojac, Predrag, Moult, John, Repo, Susanna, Hoskins, Roger A, Andreoletti, Gaia, Barsky, Daniel, Chellapan, Ajithavalli, Chu, Hoyin, Dabbiru, Navya, Kollipara, Naveen K, Ly, Melissa, Neumann, Andrew J, Pal, Lipika R, Odell, Eric, Pandey, Gaurav, Peters-Petrulewicz, Robin C, Srinivasan, Rajgopal, Yee, Stephen F, Yeleswarapu, Sri Jyothsna, Zuhl, Maya, Adebali, Ogun, Patra, Ayoti, Beer, Michael A, Hosur, Raghavendra, Peng, Jian, Bernard, Brady M, Berry, Michael, Dong, Shengcheng, Boyle, Alan P, Adhikari, Aashish, Chen, Jingqi, Hu, Zhiqiang, Wang, Robert, Wang, Yaqiong, Miller, Maximilian, Wang, Yanran, Bromberg, Yana, Turina, Paola, Capriotti, Emidio, Han, James J, Ozturk, Kivilcim, Carter, Hannah, Babbi, Giulia, Bovo, Samuele, Di Lena, Pietro, Martelli, Pier Luigi, Savojardo, Castrense, Casadio, Rita, Cline, Melissa S, De Baets, Greet, Bonache, Sandra, Díez, Orland, Gutiérrez-Enríquez, Sara, Fernández, Alejandro, Montalban, Gemma, Ootes, Lars, Özkan, Selen, Padilla, Natàlia, Riera, Casandra, De la Cruz, Xavier, Diekhans, Mark, Huwe, Peter J, Wei, Qiong, Xu, Qifang, Dunbrack, Roland L, Gotea, Valer, Elnitski, Laura, Margolin, Gennady, Fariselli, Piero, Kulakovskiy, Ivan V, Makeev, Vsevolod J, Penzar, Dmitry D, Vorontsov, Ilya E, Favorov, Alexander V, Forman, Julia R, Hasenahuer, Marcia, Fornasari, Maria S, Parisi, Gustavo, Avsec, Ziga, Çelik, Muhammed H, Nguyen, Thi Yen Duong, Gagneur, Julien, Shi, Fang-Yuan, Edwards, Matthew D, Guo, Yuchun, Tian, Kevin, Zeng, Haoyang, Gifford, David K, Göke, Jonathan, Zaucha, Jan, Gough, Julian, Ritchie, Graham RS, Frankish, Adam, Mudge, Jonathan M, Harrow, Jennifer, Young, Erin L, and Yu, Yao
- Subjects
Biological Sciences, Genetics, Human Genome, 2.1 Biological and endogenous factors, Good Health and Well Being, Humans, Computational Biology, Mutation, Missense, Phenotype, Critical Assessment of Genome Interpretation Consortium, Environmental Sciences, Information and Computing Sciences, Bioinformatics
- Abstract
Background: The Critical Assessment of Genome Interpretation (CAGI) aims to advance the state-of-the-art for computational prediction of genetic variant impact, particularly where relevant to disease. The five complete editions of the CAGI community experiment comprised 50 challenges, in which participants made blind predictions of phenotypes from genetic data, and these were evaluated by independent assessors. Results: Performance was particularly strong for clinical pathogenic variants, including some difficult-to-diagnose cases, and extends to interpretation of cancer-related variants. Missense variant interpretation methods were able to estimate biochemical effects with increasing accuracy. Assessment of methods for regulatory variants and complex trait disease risk was less definitive and indicates performance potentially suitable for auxiliary use in the clinic. Conclusions: Results show that while current methods are imperfect, they have major utility for research and clinical applications. Emerging methods and increasingly large, robust datasets for training and assessment promise further progress ahead.
- Published
- 2024
6. Newborn screening for neurodevelopmental diseases: Are we there yet?
- Author
-
Chung, Wendy K, Berg, Jonathan S, Botkin, Jeffrey R, Brenner, Steven E, Brosco, Jeffrey P, Brothers, Kyle B, Currier, Robert J, Gaviglio, Amy, Kowtoniuk, Walter E, Olson, Colleen, Lloyd‐Puryear, Michele, Saarinen, Annamarie, Sahin, Mustafa, Shen, Yufeng, Sherr, Elliott H, Watson, Michael S, and Hu, Zhanzhi
- Subjects
Biological Sciences, Biomedical and Clinical Sciences, Genetics, Clinical Research, Brain Disorders, Infant Mortality, Prevention, Intellectual and Developmental Disabilities (IDD), Pediatric Research Initiative, Perinatal Period - Conditions Originating in Perinatal Period, Pediatric, Reproductive health and childbirth, Good Health and Well Being, Infant, Infant, Newborn, Humans, Neonatal Screening, Pilot Projects, Neurodevelopmental Disorders, Parents, Clinical Sciences, Genetics & Heredity, Clinical sciences
- Abstract
In the US, newborn screening (NBS) is a unique health program that supports health equity and screens virtually every baby after birth, and has brought timely treatments to babies since the 1960's. With the decreasing cost of sequencing and the improving methods to interpret genetic data, there is an opportunity to add DNA sequencing as a screening method to facilitate the identification of babies with treatable conditions that cannot be identified in any other scalable way, including highly penetrant genetic neurodevelopmental disorders (NDD). However, the lack of effective dietary or drug-based treatments has made it nearly impossible to consider NDDs in the current NBS framework, yet it is anticipated that any treatment will be maximally effective if started early. Hence there is a critical need for large scale pilot studies to assess if and how NDDs can be effectively screened at birth, if parents desire that information, and what impact early diagnosis may have. Here we attempt to provide an overview of the recent advances in NDD treatments, explore the possible framework of setting up a pilot study to genetically screen for NDDs, highlight key technical, practical, and ethical considerations and challenges, and examine the policy and health system implications.
- Published
- 2022
7. SCOPe: improvements to the structural classification of proteins – extended database to facilitate variant interpretation and machine learning
- Author
-
Chandonia, John-Marc, Guan, Lindsey, Lin, Shiangyi, Yu, Changhua, Fox, Naomi K, and Brenner, Steven E
- Subjects
Biochemistry and Cell Biology, Bioinformatics and Computational Biology, Biological Sciences, Machine Learning and Artificial Intelligence, Biotechnology, Algorithms, Computational Biology, Databases, Chemical, Databases, Protein, Gene Expression Regulation, Machine Learning, Proteins, Environmental Sciences, Information and Computing Sciences, Developmental Biology, Biological sciences, Chemical sciences, Environmental sciences
- Abstract
The Structural Classification of Proteins-extended (SCOPe, https://scop.berkeley.edu) knowledgebase aims to provide an accurate, detailed, and comprehensive description of the structural and evolutionary relationships amongst the majority of proteins of known structure, along with resources for analyzing the protein structures and their sequences. Structures from the PDB are divided into domains and classified using a combination of manual curation and highly precise automated methods. In the current release of SCOPe, 2.08, we have developed search and display tools for analysis of genetic variants we mapped to structures classified in SCOPe. In order to improve the utility of SCOPe to automated methods such as deep learning classifiers that rely on multiple alignment of sequences of homologous proteins, we have introduced new machine-parseable annotations that indicate aberrant structures as well as domains that are distinguished by a smaller repeat unit. We also classified structures from 74 of the largest Pfam families not previously classified in SCOPe, and we improved our algorithm to remove N- and C-terminal cloning, expression and purification sequences from SCOPe domains. SCOPe 2.08-stable classifies 106 976 PDB entries (about 60% of PDB entries).
- Published
- 2022
8. Investigation of the causal etiology in a patient with T-B+NK+ immunodeficiency
- Author
-
Sertori, Robert, Lin, Jian-Xin, Martinez, Esteban, Rana, Sadhna, Sharo, Andrew, Kazemian, Majid, Sunderam, Uma, Andrake, Mark, Shinton, Susan, Truong, Billy, Dunbrack, Roland M, Liu, Chengyu, Srinivasan, Rajgopol, Brenner, Steven E, Seroogy, Christine M, Puck, Jennifer M, Leonard, Warren J, and Wiest, David L
- Subjects
Biological Sciences, Biomedical and Clinical Sciences, Cardiovascular Medicine and Haematology, Immunology, Stem Cell Research, Pediatric, Rare Diseases, Transplantation, Genetics, 2.1 Biological and endogenous factors, Animals, Humans, Infant, Infant, Newborn, Lymphopenia, Male, Mice, Neonatal Screening, Severe Combined Immunodeficiency, T-Lymphocytes, Zebrafish, immunodeficiency, newborn screening, zebrafish, thymus, MED14, T cell lymphopenia, severe combined immunodeficiency, Medical Microbiology, Biochemistry and cell biology
- Abstract
Newborn screening for severe combined immunodeficiency (SCID) has not only accelerated diagnosis and improved treatment for affected infants, but also led to identification of novel genes required for human T cell development. A male proband had SCID newborn screening showing very low T cell receptor excision circles (TRECs), a biomarker for thymic output of nascent T cells. He had persistent profound T lymphopenia, but normal numbers of B and natural killer (NK) cells. Despite an allogeneic hematopoietic stem cell transplant from his brother, he failed to develop normal T cells. Targeted resequencing excluded known SCID genes; however, whole exome sequencing (WES) of the proband and parents revealed a maternally inherited X-linked missense mutation in MED14 (MED14V763A), a component of the mediator complex. Morpholino (MO)-mediated loss of MED14 function attenuated T cell development in zebrafish. Moreover, this arrest was rescued by ectopic expression of cDNA encoding the wild type human MED14 ortholog, but not by MED14V763A, suggesting that the variant impaired MED14 function. Modeling of the equivalent mutation in mouse (Med14V769A) did not disrupt T cell development at baseline. However, repopulation of peripheral T cells upon competitive bone marrow transplantation was compromised, consistent with the incomplete T cell reconstitution experienced by the proband upon transplantation with bone marrow from his healthy male sibling, who was found to have the same MED14V763A variant. Suspecting that the variable phenotypic expression between the siblings was influenced by further mutation(s), we sought to identify genetic variants present only in the affected proband. Indeed, WES revealed a mutation in the L1 cell adhesion molecule (L1CAMQ498H); however, introducing that mutation in vivo in mice did not disrupt T cell development. Consequently, immunodeficiency in the proband may depend upon additional, unidentified gene variants.
- Published
- 2022
9. ClinVar and HGMD genomic variant classification accuracy has improved over time, as measured by implied disease burden
- Author
-
Sharo, Andrew G., Zou, Yangyun, Adhikari, Aashish N., and Brenner, Steven E.
- Published
- 2023
- Full Text
- View/download PDF
10. Application of full-genome analysis to diagnose rare monogenic disorders.
- Author
-
Shieh, Joseph T, Penon-Portmann, Monica, Wong, Karen HY, Levy-Sakin, Michal, Verghese, Michelle, Slavotinek, Anne, Gallagher, Renata C, Mendelsohn, Bryce A, Tenney, Jessica, Beleford, Daniah, Perry, Hazel, Chow, Stephen K, Sharo, Andrew G, Brenner, Steven E, Qi, Zhongxia, Yu, Jingwei, Klein, Ophir D, Martin, David, Kwok, Pui-Yan, and Boffelli, Dario
- Subjects
Human Genome, Genetics, Genetic Testing, Prevention, 4.1 Discovery and preclinical testing of markers and technologies, 4.2 Evaluation of markers and technologies
- Abstract
Current genetic tests for rare diseases provide a diagnosis in only a modest proportion of cases. The Full-Genome Analysis method, FGA, combines long-range assembly and whole-genome sequencing to detect small variants, structural variants with breakpoint resolution, and phasing. We built a variant prioritization pipeline and tested FGA's utility for diagnosis of rare diseases in a clinical setting. FGA identified structural variants and small variants with an overall diagnostic yield of 40% (20 of 50 cases) and 35% in exome-negative cases (8 of 23 cases); 4 of these were structural variants. FGA detected and mapped structural variants that are missed by short reads, including non-coding duplication, and phased variants across long distances of more than 180 kb. With the prioritization algorithm, longer DNA technologies could replace multiple tests for monogenic disorders and expand the range of variants detected. Our study suggests that genomes produced from technologies like FGA can improve variant detection and provide higher resolution genome maps for future application.
- Published
- 2021
11. Revealing molecular pathways for cancer cell fitness through a genetic screen of the cancer translatome
- Author
-
Kuzuoglu-Ozturk, Duygu, Hu, Zhiqiang, Rama, Martina, Devericks, Emily, Weiss, Jacob, Chiang, Gary G, Worland, Stephen T, Brenner, Steven E, Goodarzi, Hani, Gilbert, Luke A, and Ruggero, Davide
- Subjects
Biochemistry and Cell Biology, Genetics, Biological Sciences, Cancer, 1.1 Normal biological development and functioning, Aetiology, Underpinning research, 2.1 Biological and endogenous factors, Generic health relevance, 5' Untranslated Regions, Animals, Apoptosis, Autophagy, Basic Helix-Loop-Helix Leucine Zipper Transcription Factors, CRISPR-Cas Systems, Cell Line, Tumor, Cell Movement, Cell Proliferation, Eukaryotic Initiation Factor-4E, Exons, Genetic Testing, Genome, Human, Humans, Male, Metalloendopeptidases, Mice, Mitochondria, Mitochondrial Proteins, Neoplasms, Peptide Hydrolases, Protein Biosynthesis, Signal Transduction, Stress, Physiological, bcl-X Protein, Bcl-xL, CRISPRi, EJC, Tfeb, UPR(mt)-like stress response, autophagy, cancer, eIF4E, mitochondria, translation control, Pmpcb, mitochondrial UPR, Medical Physiology, Biological sciences
- Abstract
The major cap-binding protein eukaryotic translation initiation factor 4E (eIF4E), an ancient protein required for translation of all eukaryotic genomes, is a surprising yet potent oncogenic driver. The genetic interactions that maintain the oncogenic activity of this key translation factor remain unknown. In this study, we carry out a genome-wide CRISPRi screen wherein we identify more than 600 genetic interactions that sustain eIF4E oncogenic activity. Our data show that eIF4E controls the translation of Tfeb, a key executer of the autophagy response. This autophagy survival response is triggered by mitochondrial proteotoxic stress, which allows cancer cell survival. Our screen also reveals a functional interaction between eIF4E and a single anti-apoptotic factor, Bcl-xL, in tumor growth. Furthermore, we show that eIF4E and the exon-junction complex (EJC), which is involved in many steps of RNA metabolism, interact to control the migratory properties of cancer cells. Overall, we uncover several cancer-specific vulnerabilities that provide further resolution of the cancer translatome.
- Published
- 2021
12. Opportunities and challenges for the computational interpretation of rare variation in clinically important genes
- Author
-
McInnes, Gregory, Sharo, Andrew G, Koleske, Megan L, Brown, Julia EH, Norstad, Matthew, Adhikari, Aashish N, Wang, Sheng, Brenner, Steven E, Halpern, Jodi, Koenig, Barbara A, Magnus, David C, Gallagher, Renata C, Giacomini, Kathleen M, and Altman, Russ B
- Subjects
Pharmacology and Pharmaceutical Sciences, Biological Sciences, Biomedical and Clinical Sciences, Genetics, Machine Learning and Artificial Intelligence, Cancer, Biotechnology, Precision Medicine, Clinical Research, Human Genome, Patient Safety, Cancer Genomics, 4.2 Evaluation of markers and technologies, Generic health relevance, Good Health and Well Being, Genetic Variation, Genetics, Medical, Genome, Human, Genomics, Humans, Infant, Newborn, Machine Learning, Metabolism, Inborn Errors, Pharmacogenetics, Medical and Health Sciences, Genetics & Heredity, Biological sciences, Biomedical and clinical sciences, Health sciences
- Abstract
Genome sequencing is enabling precision medicine: tailoring treatment to the unique constellation of variants in an individual's genome. The impact of recurrent pathogenic variants is often understood; however, there is a long tail of rare genetic variants that are uncharacterized. The problem of uncharacterized rare variation is especially acute when it occurs in genes of known clinical importance with functionally consequential variants and associated mechanisms. Variants of uncertain significance (VUSs) in these genes are discovered at a rate that outpaces current ability to classify them with databases of previous cases, experimental evaluation, and computational predictors. Clinicians are thus left without guidance about the significance of variants that may have actionable consequences. Computational prediction of the impact of rare genetic variation is increasingly becoming an important capability. In this paper, we review the technical and ethical challenges of interpreting the function of rare variants in two settings: inborn errors of metabolism in newborns and pharmacogenomics. We propose a framework for a genomic learning healthcare system with an initial focus on early-onset treatable disease in newborns and actionable pharmacogenomics. We argue that (1) a genomic learning healthcare system must allow for continuous collection and assessment of rare variants, (2) emerging machine learning methods will enable algorithms to predict the clinical impact of rare variants on protein function, and (3) ethical considerations must inform the construction and deployment of all rare-variation triage strategies, particularly with respect to health disparities arising from unbalanced ancestry representation.
- Published
- 2021
13. The role of exome sequencing in newborn screening for inborn errors of metabolism
- Author
-
Adhikari, Aashish N, Gallagher, Renata C, Wang, Yaqiong, Currier, Robert J, Amatuni, George, Bassaganyas, Laia, Chen, Flavia, Kundu, Kunal, Kvale, Mark, Mooney, Sean D, Nussbaum, Robert L, Randi, Savanna S, Sanford, Jeremy, Shieh, Joseph T, Srinivasan, Rajgopal, Sunderam, Uma, Tang, Hao, Vaka, Dedeepya, Zou, Yangyun, Koenig, Barbara A, Kwok, Pui-Yan, Risch, Neil, Puck, Jennifer M, and Brenner, Steven E
- Subjects
Paediatrics, Biomedical and Clinical Sciences, Pediatric, Perinatal Period - Conditions Originating in Perinatal Period, Genetics, Infant Mortality, Preterm, Low Birth Weight and Health of the Newborn, Rare Diseases, Prevention, Health Services, Clinical Research, Genetic Testing, 4.4 Population screening, Neurological, Exome, Humans, Infant, Newborn, Metabolism, Inborn Errors, Neonatal Screening, Tandem Mass Spectrometry, Exome Sequencing, Medical and Health Sciences, Immunology, Biomedical and clinical sciences, Health sciences
- Abstract
Public health newborn screening (NBS) programs provide population-scale ascertainment of rare, treatable conditions that require urgent intervention. Tandem mass spectrometry (MS/MS) is currently used to screen newborns for a panel of rare inborn errors of metabolism (IEMs). The NBSeq project evaluated whole-exome sequencing (WES) as an innovative methodology for NBS. We obtained archived residual dried blood spots and data for nearly all IEM cases from the 4.5 million infants born in California between mid-2005 and 2013 and from some infants who screened positive by MS/MS, but were unaffected upon follow-up testing. WES had an overall sensitivity of 88% and specificity of 98.4%, compared to 99.0% and 99.8%, respectively for MS/MS, although effectiveness varied among individual IEMs. Thus, WES alone was insufficiently sensitive or specific to be a primary screen for most NBS IEMs. However, as a secondary test for infants with abnormal MS/MS screens, WES could reduce false-positive results, facilitate timely case resolution and in some instances even suggest more appropriate or specific diagnosis than that initially obtained. This study represents the largest, to date, sequencing effort of an entire population of IEM-affected cases, allowing unbiased assessment of current capabilities of WES as a tool for population screening.
- Published
- 2020
14. Genomic Analysis of Historical Cases with Positive Newborn Screens for Short-Chain Acyl-CoA Dehydrogenase Deficiency Shows That a Validated Second-Tier Biochemical Test Can Replace Future Sequencing
- Author
-
Adhikari, Aashish N, Currier, Robert J, Tang, Hao, Turgeon, Coleman T, Nussbaum, Robert L, Srinivasan, Rajgopal, Sunderam, Uma, Kwok, Pui-Yan, Brenner, Steven E, Gavrilov, Dimitar, Puck, Jennifer M, and Gallagher, Renata
- Subjects
Biological Sciences, Biomedical and Clinical Sciences, Genetics, Infant Mortality, Genetic Testing, Rare Diseases, Clinical Research, Perinatal Period - Conditions Originating in Perinatal Period, Pediatric, Prevention, Biotechnology, 2.1 Biological and endogenous factors, 4.2 Evaluation of markers and technologies, newborn screening, short-chain acyl-CoA dehydrogenase deficiency, SCADD, ACADS, second-tier testing, ethylmalonic acid, exome sequence, butyrylcarnitine
- Abstract
Short-chain acyl-CoA dehydrogenase deficiency (SCADD) is a rare autosomal recessive disorder of β-oxidation caused by pathogenic variants in the ACADS gene. Analyte testing for SCADD in blood and urine, including newborn screening (NBS) using tandem mass spectrometry (MS/MS) on dried blood spots (DBSs), is complicated by the presence of two relatively common ACADS variants (c.625G>A and c.511C>T). Individuals homozygous for these variants or compound heterozygous do not have clinical disease but do have reduced short-chain acyl-CoA dehydrogenase (SCAD) activity, resulting in elevated blood and urine metabolites. As part of a larger study of the potential role of exome sequencing in NBS in California, we reviewed ACADS sequence and MS/MS data from DBSs from a cohort of 74 patients identified to have SCADD. Of this cohort, approximately 60% had one or more of the common variants and did not have the two rare variants, and thus would need no further testing. Retrospective analysis of ethylmalonic acid, glutaric acid, 2-hydroxyglutaric acid, 3-hydroxyglutaric acid, and methylsuccinic acid demonstrated that second-tier testing applied before the release of the newborn screening result could reduce referrals by over 50% and improve the positive predictive value for SCADD to above 75%.
- Published
- 2020
15. Using the ACMG/AMP framework to capture evidence related to predicted and observed impact on splicing: Recommendations from the ClinGen SVI Splicing Subgroup
- Author
-
Biesecker, Leslie G., Harrison, Steven M., Tayoun, Ahmad A., Berg, Jonathan S., Brenner, Steven E., Cutting, Garry R., Ellard, Sian, Greenblatt, Marc S., Kang, Peter, Karbassi, Izabela, Karchin, Rachel, Mester, Jessica, O’Donnell-Luria, Anne, Pesaran, Tina, Plon, Sharon E., Rehm, Heidi L., Strande, Natasha T., Tavtigian, Sean V., Topper, Scott, Walker, Logan C., Hoya, Miguel de la, Wiggins, George A.R., Lindy, Amanda, Vincent, Lisa M., Parsons, Michael T., Canson, Daffodil M., Bis-Brewer, Dana, Cass, Ashley, Tchourbanov, Alexander, Zimmermann, Heather, Byrne, Alicia B., Karam, Rachid, and Spurdle, Amanda B.
- Published
- 2023
- Full Text
- View/download PDF
16. Performance of computational methods for the evaluation of pericentriolar material 1 missense variants in CAGI‐5
- Author
-
Monzon, Alexander Miguel, Carraro, Marco, Chiricosta, Luigi, Reggiani, Francesco, Han, James, Ozturk, Kivilcim, Wang, Yanran, Miller, Maximilian, Bromberg, Yana, Capriotti, Emidio, Savojardo, Castrense, Babbi, Giulia, Martelli, Pier L, Casadio, Rita, Katsonis, Panagiotis, Lichtarge, Olivier, Carter, Hannah, Kousi, Maria, Katsanis, Nicholas, Andreoletti, Gaia, Moult, John, Brenner, Steven E, Ferrari, Carlo, Leonardi, Emanuela, and Tosatto, Silvio CE
- Subjects
Biological Sciences, Bioinformatics and Computational Biology, Schizophrenia, Genetics, Brain Disorders, Mental Health, Aetiology, 2.1 Biological and endogenous factors, Mental health, Autoantigens, Cell Cycle Proteins, Computational Biology, Databases, Genetic, Genetic Predisposition to Disease, Humans, Mutation, Missense, Neural Networks, Computer, Phenotype, Polymorphism, Single Nucleotide, bioinformatics tools, community challenge, critical assessment, effect prediction, missense mutations, variant interpretation, Clinical Sciences, Genetics & Heredity, Clinical sciences
- Abstract
The CAGI-5 pericentriolar material 1 (PCM1) challenge aimed to predict the effect of 38 transgenic human missense mutations in the PCM1 protein implicated in schizophrenia. Participants were provided with 16 benign variants (negative controls), 10 hypomorphic, and 12 loss of function variants. Six groups participated and were asked to predict the probability of effect and standard deviation associated with each mutation. Here, we present the challenge assessment. Prediction performance was evaluated using different measures to conclude in a final ranking which highlights the strengths and weaknesses of each group. The results show a great variety of predictions where some methods performed significantly better than others. Benign variants played an important role as negative controls, highlighting predictors biased to identify disease phenotypes. The best predictor, Bromberg lab, used a neural-network-based method able to discriminate between neutral and non-neutral single nucleotide polymorphisms. The CAGI-5 PCM1 challenge allowed us to evaluate the state-of-the-art techniques for interpreting the effect of novel variants for a difficult target protein.
- Published
- 2019
17. Assessing computational predictions of the phenotypic effect of cystathionine‐beta‐synthase variants
- Author
-
Kasak, Laura, Bakolitsa, Constantina, Hu, Zhiqiang, Yu, Changhua, Rine, Jasper, Dimster‐Denk, Dago F, Pandey, Gaurav, Baets, Greet, Bromberg, Yana, Cao, Chen, Capriotti, Emidio, Casadio, Rita, Durme, Joost, Giollo, Manuel, Karchin, Rachel, Katsonis, Panagiotis, Leonardi, Emanuela, Lichtarge, Olivier, Martelli, Pier Luigi, Masica, David, Mooney, Sean D, Olatubosun, Ayodeji, Radivojac, Predrag, Rousseau, Frederic, Pal, Lipika R, Savojardo, Castrense, Schymkowitz, Joost, Thusberg, Janita, Tosatto, Silvio CE, Vihinen, Mauno, Väliaho, Jouni, Repo, Susanna, Moult, John, Brenner, Steven E, and Friedberg, Iddo
- Subjects
Biological Sciences, Bioinformatics and Computational Biology, Genetics, Networking and Information Technology R&D (NITRD), Aetiology, 2.1 Biological and endogenous factors, Generic health relevance, Good Health and Well Being, Amino Acid Substitution, Computational Biology, Cystathionine, Cystathionine beta-Synthase, Homocysteine, Humans, Phenotype, Precision Medicine, CAGI challenge, critical assessment, cystathionine-beta-synthase, machine learning, phenotype prediction, single amino acid substitution, Clinical Sciences, Genetics & Heredity, Clinical sciences
- Abstract
Accurate prediction of the impact of genomic variation on phenotype is a major goal of computational biology and an important contributor to personalized medicine. Computational predictions can lead to a better understanding of the mechanisms underlying genetic diseases, including cancer, but their adoption requires thorough and unbiased assessment. Cystathionine-beta-synthase (CBS) is an enzyme that catalyzes the first step of the transsulfuration pathway, from homocysteine to cystathionine, and in which variations are associated with human hyperhomocysteinemia and homocystinuria. We have created a computational challenge under the CAGI framework to evaluate how well different methods can predict the phenotypic effect(s) of CBS single amino acid substitutions using a blinded experimental data set. CAGI participants were asked to predict yeast growth based on the identity of the mutations. The performance of the methods was evaluated using several metrics. The CBS challenge highlighted the difficulty of predicting the phenotype of an ex vivo system in a model organism when classification models were trained on human disease data. We also discuss the variations in difficulty of prediction for known benign and deleterious variants, as well as identify methodological and experimental constraints with lessons to be learned for future challenges.
- Published
- 2019
18. Calibration of computational tools for missense variant pathogenicity classification and ClinGen recommendations for PP3/BP4 criteria
- Author
-
Biesecker, Leslie G., Harrison, Steven M., Tayoun, Ahmad A., Berg, Jonathan S., Brenner, Steven E., Cutting, Garry R., Ellard, Sian, Greenblatt, Marc S., Kang, Peter, Karbassi, Izabela, Karchin, Rachel, Mester, Jessica, O’Donnell-Luria, Anne, Pesaran, Tina, Plon, Sharon E., Rehm, Heidi L., Strande, Natasha T., Tavtigian, Sean V., Topper, Scott, Pejaver, Vikas, Byrne, Alicia B., Feng, Bing-Jian, Pagel, Kymberleigh A., Mooney, Sean D., and Radivojac, Predrag
- Published
- 2022
- Full Text
- View/download PDF
19. StrVCTVRE: A supervised learning method to predict the pathogenicity of human genome structural variants
- Author
-
Sharo, Andrew G., Hu, Zhiqiang, Sunyaev, Shamil R., and Brenner, Steven E.
- Published
- 2022
- Full Text
- View/download PDF
20. SCOPe: classification of large macromolecular structures in the structural classification of proteins—extended database
- Author
-
Chandonia, John-Marc, Fox, Naomi K, and Brenner, Steven E
- Subjects
Biochemistry and Cell Biology, Bioinformatics and Computational Biology, Biological Sciences, Generic health relevance, Databases, Protein, Multiprotein Complexes, Proteasome Endopeptidase Complex, Protein Domains, Spliceosomes, Environmental Sciences, Information and Computing Sciences, Developmental Biology, Biological sciences, Chemical sciences, Environmental sciences
- Abstract
The SCOPe (Structural Classification of Proteins-extended, https://scop.berkeley.edu) database hierarchically classifies domains from the majority of proteins of known structure according to their structural and evolutionary relationships. SCOPe also incorporates and updates the ASTRAL compendium, which provides multiple databases and tools to aid in the analysis of the sequences and structures of proteins classified in SCOPe. Protein structures are classified using a combination of manual curation and highly precise automated methods. In the current release of SCOPe, 2.07, we have focused our manual curation efforts on larger protein structures, including the spliceosome, proteasome and RNA polymerase I, as well as many other Pfam families that had not previously been classified. Domains from these large protein complexes are distinctive in several ways: novel non-globular folds are more common, and domains from previously observed protein families often have N- or C-terminal extensions that were disordered or not present in previous structures. The current monthly release update, SCOPe 2.07-2018-10-18, classifies 90 992 PDB entries (about two thirds of PDB entries).
- Published
- 2019
21. Registered access: authorizing data access
- Author
-
Dyke, Stephanie OM, Linden, Mikael, Lappalainen, Ilkka, De Argila, Jordi Rambla, Carey, Knox, Lloyd, David, Spalding, J Dylan, Cabili, Moran N, Kerry, Giselle, Foreman, Julia, Cutts, Tim, Shabani, Mahsa, Rodriguez, Laura L, Haeussler, Maximilian, Walsh, Brian, Jiang, Xiaoqian, Wang, Shuang, Perrett, Daniel, Boughtwood, Tiffany, Matern, Andreas, Brookes, Anthony J, Cupak, Miro, Fiume, Marc, Pandya, Ravi, Tulchinsky, Ilia, Scollen, Serena, Törnroos, Juha, Das, Samir, Evans, Alan C, Malin, Bradley A, Beck, Stephan, Brenner, Steven E, Nyrönen, Tommi, Blomberg, Niklas, Firth, Helen V, Hurles, Matthew, Philippakis, Anthony A, Rätsch, Gunnar, Brudno, Michael, Boycott, Kym M, Rehm, Heidi L, Baudis, Michael, Sherry, Stephen T, Kato, Kazuto, Knoppers, Bartha M, Baker, Dixie, and Flicek, Paul
- Subjects
Biological Sciences, Biomedical and Clinical Sciences, Clinical Sciences, Genetics, 8.3 Policy, ethics, and research governance, Health and social care services research, Generic health relevance, Good Health and Well Being, Access to Information, Genetics, Medical, Genomics, Humans, Information Dissemination, Licensure, Practice Guidelines as Topic, Genetics & Heredity, Clinical sciences
- Abstract
The Global Alliance for Genomics and Health (GA4GH) proposes a data access policy model, "registered access", to increase and improve access to data requiring an agreement to basic terms and conditions, such as the use of DNA sequence and health data in research. A registered access policy would enable a range of categories of users to gain access, starting with researchers and clinical care professionals. It would also facilitate general use and reuse of data, but within the bounds of consent restrictions and other ethical obligations. In piloting registered access with the Scientific Demonstration data sharing projects of GA4GH, we provide additional ethics, policy and technical guidance to facilitate the implementation of this access model in an international setting.
- Published
- 2018
22. KBase: The United States Department of Energy Systems Biology Knowledgebase
- Author
-
Arkin, Adam P, Cottingham, Robert W, Henry, Christopher S, Harris, Nomi L, Stevens, Rick L, Maslov, Sergei, Dehal, Paramvir, Ware, Doreen, Perez, Fernando, Canon, Shane, Sneddon, Michael W, Henderson, Matthew L, Riehl, William J, Murphy-Olson, Dan, Chan, Stephen Y, Kamimura, Roy T, Kumari, Sunita, Drake, Meghan M, Brettin, Thomas S, Glass, Elizabeth M, Chivian, Dylan, Gunter, Dan, Weston, David J, Allen, Benjamin H, Baumohl, Jason, Best, Aaron A, Bowen, Ben, Brenner, Steven E, Bun, Christopher C, Chandonia, John-Marc, Chia, Jer-Ming, Colasanti, Ric, Conrad, Neal, Davis, James J, Davison, Brian H, DeJongh, Matthew, Devoid, Scott, Dietrich, Emily, Dubchak, Inna, Edirisinghe, Janaka N, Fang, Gang, Faria, José P, Frybarger, Paul M, Gerlach, Wolfgang, Gerstein, Mark, Greiner, Annette, Gurtowski, James, Haun, Holly L, He, Fei, Jain, Rashmi, Joachimiak, Marcin P, Keegan, Kevin P, Kondo, Shinnosuke, Kumar, Vivek, Land, Miriam L, Meyer, Folker, Mills, Marissa, Novichkov, Pavel S, Oh, Taeyun, Olsen, Gary J, Olson, Robert, Parrello, Bruce, Pasternak, Shiran, Pearson, Erik, Poon, Sarah S, Price, Gavin A, Ramakrishnan, Srividya, Ranjan, Priya, Ronald, Pamela C, Schatz, Michael C, Seaver, Samuel MD, Shukla, Maulik, Sutormin, Roman A, Syed, Mustafa H, Thomason, James, Tintle, Nathan L, Wang, Daifeng, Xia, Fangfang, Yoo, Hyunseung, Yoo, Shinjae, and Yu, Dantong
- Subjects
Computational Biology, Database Management Systems, Humans, Knowledge Bases, Systems Biology, United States
- Published
- 2018
23. A novel PRRT2 pathogenic variant in a family with paroxysmal kinesigenic dyskinesia and benign familial infantile seizures.
- Author
-
Lu, Jacqueline G, Bishop, Juliet, Cheyette, Sarah, Zhulin, Igor B, Guo, Su, Sobreira, Nara, and Brenner, Steven E
- Subjects
Humans, Epilepsy, Benign Neonatal, Dystonia, Membrane Proteins, Nerve Tissue Proteins, Pedigree, Family, Evolution, Molecular, Chromosome Segregation, Amino Acid Sequence, Base Sequence, Conserved Sequence, Inheritance Patterns, Mutation, Child, Infant, Female, Male, ataxia, autism, extrapyramidal dyskinesia, focal autonomic seizures without altered responsiveness, infantile spasms, intellectual disability, profound, migraine without aura, paroxysmal dyskinesia
- Abstract
Paroxysmal kinesigenic dyskinesia (PKD) is a rare neurological disorder characterized by recurrent attacks of dyskinetic movements without alteration of consciousness that are often triggered by the initiation of voluntary movements. Whole-exome sequencing has revealed a cluster of pathogenic variants in PRRT2 (proline-rich transmembrane protein), a gene with a function in synaptic regulation that remains poorly understood. Here, we report the discovery of a novel PRRT2 pathogenic variant inherited in an autosomal dominant pattern in a family with PKD and benign familial infantile seizures (BFIS). After targeted Sanger sequencing did not identify the presence of previously described PRRT2 pathogenic variants, we carried out whole-exome sequencing in the proband and her affected paternal grandfather. This led to the discovery of a novel PRRT2 variant, NM_001256442:exon3:c.C959T/NP_660282.2:p.A320V, altering an evolutionarily conserved alanine at the amino acid position 320 located in the M2 transmembrane region. Sanger sequencing further confirmed the presence of this variant in four affected family members (paternal grandfather, father, brother, and proband) and its absence in two unaffected ones (paternal grandmother and mother). This newly found variant further reinforces the importance of PRRT2 in PKD, BFIS, and possibly other movement disorders. Future functional studies using animal models and human pluripotent stem cell models will provide new insights into the role of PRRT2 and the significance of this variant in regulating neural development and/or function.
- Published
- 2018
24. Critical assessment of missense variant effect predictors on disease-relevant variant data
- Author
-
Rastogi, Ruchir, Chung, Ryan, Li, Sindy, Li, Chang, Lee, Kyoungyeul, Woo, Junwoo, Kim, Dong-Wook, Keum, Changwon, Babbi, Giulia, Martelli, Pier Luigi, Savojardo, Castrense, Casadio, Rita, Chennen, Kirsley, Weber, Thomas, Poch, Olivier, Ancien, Francois, Cia, Gabriel, Pucci, Fabrizio, Raimondi, Daniele, Vranken, Wim, Rooman, Marianne, Marquet, Celine, Olenyi, Tobias, Rost, Burkhard, Andreoletti, Gaia, Kamandula, Akash, Peng, Yisu, Bakolitsa, Constantina, Mort, Matthew, Cooper, David N., Bergquist, Timothy, Pejaver, Vikas, Liu, Xiaoming, Radivojac, Predrag, Brenner, Steven E., and Ioannidis, Nilah M.
- Published
- 2024
25. Matching phenotypes to whole genomes: Lessons learned from four iterations of the personal genome project community challenges
- Author
-
Cai, Binghuang, Li, Biao, Kiga, Nikki, Thusberg, Janita, Bergquist, Timothy, Chen, Yun‐Ching, Niknafs, Noushin, Carter, Hannah, Tokheim, Collin, Beleva‐Guthrie, Violeta, Douville, Christopher, Bhattacharya, Rohit, Yeo, Hui Ting Grace, Fan, Jean, Sengupta, Sohini, Kim, Dewey, Cline, Melissa, Turner, Tychele, Diekhans, Mark, Zaucha, Jan, Pal, Lipika R, Cao, Chen, Yu, Chen‐Hsin, Yin, Yizhou, Carraro, Marco, Giollo, Manuel, Ferrari, Carlo, Leonardi, Emanuela, Tosatto, Silvio CE, Bobe, Jason, Ball, Madeleine, Hoskins, Roger A, Repo, Susanna, Church, George, Brenner, Steven E, Moult, John, Gough, Julian, Stanke, Mario, Karchin, Rachel, and Mooney, Sean D
- Subjects
Biological Sciences, Bioinformatics and Computational Biology, Genetics, Human Genome, Biotechnology, Good Health and Well Being, Area Under Curve, Genetic Predisposition to Disease, High-Throughput Nucleotide Sequencing, Human Genome Project, Humans, Phenotype, Quantitative Trait Loci, Whole Genome Sequencing, biomedical informatics, community challenge, critical assessment, genome, genome interpretation, open consent, personal genome project, phenotype, Clinical Sciences, Genetics & Heredity, Clinical sciences
- Abstract
The advent of next-generation sequencing has dramatically decreased the cost for whole-genome sequencing and increased the viability for its application in research and clinical care. The Personal Genome Project (PGP) provides unrestricted access to genomes of individuals and their associated phenotypes. This resource enabled the Critical Assessment of Genome Interpretation (CAGI) to create a community challenge to assess the bioinformatics community's ability to predict traits from whole genomes. In the CAGI PGP challenge, researchers were asked to predict whether an individual had a particular trait or profile based on their whole genome. Several approaches were used to assess submissions, including ROC AUC (area under receiver operating characteristic curve), probability rankings, the number of correct predictions, and statistical significance simulations. Overall, we found that prediction of individual traits is difficult, relying on a strong knowledge of trait frequency within the general population, whereas matching genomes to trait profiles relies heavily upon a small number of common traits including ancestry, blood type, and eye color. When a rare genetic disorder is present, profiles can be matched when one or more pathogenic variants are identified. Prediction accuracy has improved substantially over the last 6 years due to improved methodology and a better understanding of features.
- Published
- 2017
26. SCOPe: Manual Curation and Artifact Removal in the Structural Classification of Proteins – extended Database
- Author
-
Chandonia, John-Marc, Fox, Naomi K, and Brenner, Steven E
- Subjects
Biochemistry and Cell Biology, Bioinformatics and Computational Biology, Biological Sciences, 1.5 Resources and infrastructure (underpinning), Artifacts, Cloning, Molecular, Computational Biology, Databases, Protein, Mutation, Protein Structure, Tertiary, Proteins, structure classification, protein evolution, database, Medicinal and Biomolecular Chemistry, Microbiology, Biochemistry & Molecular Biology, Biochemistry and cell biology
- Abstract
SCOPe (Structural Classification of Proteins-extended, http://scop.berkeley.edu) is a database of relationships between protein structures that extends the Structural Classification of Proteins (SCOP) database. SCOP is an expert-curated ordering of domains from the majority of proteins of known structure in a hierarchy according to structural and evolutionary relationships. SCOPe classifies the majority of protein structures released since SCOP development concluded in 2009, using a combination of manual curation and highly precise automated tools, aiming to have the same accuracy as fully hand-curated SCOP releases. SCOPe also incorporates and updates the ASTRAL compendium, which provides several databases and tools to aid in the analysis of the sequences and structures of proteins classified in SCOPe. SCOPe continues high-quality manual classification of new superfamilies, a key feature of SCOP. Artifacts such as expression tags are now separated into their own class, in order to distinguish them from the homology-based annotations in the remainder of the SCOPe hierarchy. SCOPe 2.06 contains 77,439 Protein Data Bank entries, double the 38,221 structures classified in SCOP.
- Published
- 2017
27. Newborn Sequencing in Genomic Medicine and Public Health
- Author
-
Berg, Jonathan S, Agrawal, Pankaj B, Bailey, Donald B, Beggs, Alan H, Brenner, Steven E, Brower, Amy M, Cakici, Julie A, Ceyhan-Birsoy, Ozge, Chan, Kee, Chen, Flavia, Currier, Robert J, Dukhovny, Dmitry, Green, Robert C, Harris-Wai, Julie, Holm, Ingrid A, Iglesias, Brenda, Joseph, Galen, Kingsmore, Stephen F, Koenig, Barbara A, Kwok, Pui-Yan, Lantos, John, Leeder, Steven J, Lewis, Megan A, McGuire, Amy L, Milko, Laura V, Mooney, Sean D, Parad, Richard B, Pereira, Stacey, Petrikin, Joshua, Powell, Bradford C, Powell, Cynthia M, Puck, Jennifer M, Rehm, Heidi L, Risch, Neil, Roche, Myra, Shieh, Joseph T, Veeraraghavan, Narayanan, Watson, Michael S, Willig, Laurel, Yu, Timothy W, Urv, Tiina, and Wise, Anastasia L
- Subjects
Health Services and Systems, Health Sciences, Health Services, Clinical Research, Human Genome, Genetics, Biotechnology, Pediatric, 4.2 Evaluation of markers and technologies, 4.1 Discovery and preclinical testing of markers and technologies, 8.3 Policy, ethics, and research governance, Generic health relevance, Good Health and Well Being, Exome, Genetic Carrier Screening, Genetic Research, Genetic Testing, Genome-Wide Association Study, Genomic Structural Variation, Humans, Infant, Newborn, Intensive Care Units, Neonatal, Neonatal Screening, Predictive Value of Tests, Prospective Studies, Public Health, Sequence Analysis, DNA, United States, Medical and Health Sciences, Psychology and Cognitive Sciences, Pediatrics, Biomedical and clinical sciences, Health sciences, Psychology
- Abstract
The rapid development of genomic sequencing technologies has decreased the cost of genetic analysis to the extent that it seems plausible that genome-scale sequencing could have widespread availability in pediatric care. Genomic sequencing provides a powerful diagnostic modality for patients who manifest symptoms of monogenic disease and an opportunity to detect health conditions before their development. However, many technical, clinical, ethical, and societal challenges should be addressed before such technology is widely deployed in pediatric practice. This article provides an overview of the Newborn Sequencing in Genomic Medicine and Public Health Consortium, which is investigating the application of genome-scale sequencing in newborns for both diagnosis and screening.
- Published
- 2017
28. An expanded evaluation of protein function prediction methods shows an improvement in accuracy
- Author
-
Jiang, Yuxiang, Oron, Tal Ronnen, Clark, Wyatt T, Bankapur, Asma R, D'Andrea, Daniel, Lepore, Rosalba, Funk, Christopher S, Kahanda, Indika, Verspoor, Karin M, Ben-Hur, Asa, Koo, Emily, Penfold-Brown, Duncan, Shasha, Dennis, Youngs, Noah, Bonneau, Richard, Lin, Alexandra, Sahraeian, Sayed ME, Martelli, Pier Luigi, Profiti, Giuseppe, Casadio, Rita, Cao, Renzhi, Zhong, Zhaolong, Cheng, Jianlin, Altenhoff, Adrian, Skunca, Nives, Dessimoz, Christophe, Dogan, Tunca, Hakala, Kai, Kaewphan, Suwisa, Mehryary, Farrokh, Salakoski, Tapio, Ginter, Filip, Fang, Hai, Smithers, Ben, Oates, Matt, Gough, Julian, Törönen, Petri, Koskinen, Patrik, Holm, Liisa, Chen, Ching-Tai, Hsu, Wen-Lian, Bryson, Kevin, Cozzetto, Domenico, Minneci, Federico, Jones, David T, Chapman, Samuel, C., Dukka B K., Khan, Ishita K, Kihara, Daisuke, Ofer, Dan, Rappoport, Nadav, Stern, Amos, Cibrian-Uhalte, Elena, Denny, Paul, Foulger, Rebecca E, Hieta, Reija, Legge, Duncan, Lovering, Ruth C, Magrane, Michele, Melidoni, Anna N, Mutowo-Meullenet, Prudence, Pichler, Klemens, Shypitsyna, Aleksandra, Li, Biao, Zakeri, Pooya, ElShal, Sarah, Tranchevent, Léon-Charles, Das, Sayoni, Dawson, Natalie L, Lee, David, Lees, Jonathan G, Sillitoe, Ian, Bhat, Prajwal, Nepusz, Tamás, Romero, Alfonso E, Sasidharan, Rajkumar, Yang, Haixuan, Paccanaro, Alberto, Gillis, Jesse, Sedeño-Cortés, Adriana E, Pavlidis, Paul, Feng, Shou, Cejuela, Juan M, Goldberg, Tatyana, Hamp, Tobias, Richter, Lothar, Salamov, Asaf, Gabaldon, Toni, Marcet-Houben, Marina, Supek, Fran, Gong, Qingtian, Ning, Wei, Zhou, Yuanpeng, Tian, Weidong, Falda, Marco, Fontana, Paolo, Lavezzo, Enrico, Toppo, Stefano, Ferrari, Carlo, Giollo, Manuel, Piovesan, Damiano, Tosatto, Silvio, del Pozo, Angela, Fernández, José M, Maietta, Paolo, Valencia, Alfonso, Tress, Michael L, Benso, Alfredo, Di Carlo, Stefano, Politano, Gianfranco, Savino, Alessandro, Rehman, Hafeez Ur, Re, Matteo, Mesiti, Marco, Valentini, Giorgio, Bargsten, Joachim W, van Dijk, Aalt DJ, Gemovic, Branislava, Glisic, Sanja, Perovic, Vladmir, Veljkovic, Veljko, Veljkovic, Nevena, Almeida-e-Silva, Danillo C, Vencio, Ricardo ZN, Sharan, Malvika, Vogel, Jörg, Kansakar, Lakesh, Zhang, Shanshan, Vucetic, Slobodan, Wang, Zheng, Sternberg, Michael JE, Wass, Mark N, Huntley, Rachael P, Martin, Maria J, O'Donovan, Claire, Robinson, Peter N, Moreau, Yves, Tramontano, Anna, Babbitt, Patricia C, Brenner, Steven E, Linial, Michal, Orengo, Christine A, Rost, Burkhard, Greene, Casey S, Mooney, Sean D, Friedberg, Iddo, and Radivojac, Predrag
- Subjects
Quantitative Biology - Quantitative Methods
- Abstract
Background: The increasing volume and variety of genotypic and phenotypic data is a major defining characteristic of modern biomedical sciences. At the same time, the limitations in technology for generating data and the inherently stochastic nature of biomolecular events have led to the discrepancy between the volume of data and the amount of knowledge gleaned from it. A major bottleneck in our ability to understand the molecular underpinnings of life is the assignment of function to biological macromolecules, especially proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, accurately assessing methods for protein function prediction and tracking progress in the field remain challenging. Methodology: We have conducted the second Critical Assessment of Functional Annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. One hundred twenty-six methods from 56 research groups were evaluated for their ability to predict biological functions using the Gene Ontology and gene-disease associations using the Human Phenotype Ontology on a set of 3,681 proteins from 18 species. CAFA2 featured significantly expanded analysis compared with CAFA1, with regard to data set size, variety, and assessment metrics. To review progress in the field, the analysis also compared the best methods participating in CAFA1 to those of CAFA2. Conclusions: The top performing methods in CAFA2 outperformed the best methods from CAFA1, demonstrating that computational function prediction is improving. This increased accuracy can be attributed to the combined effect of the growing number of experimental annotations and improved methods for function prediction. Comment: Submitted to Genome Biology
- Published
- 2016
29. Multisystem Anomalies in Severe Combined Immunodeficiency with Mutant BCL11B.
- Author
-
Punwani, Divya, Zhang, Yong, Yu, Jason, Cowan, Morton J, Rana, Sadhna, Kwan, Antonia, Adhikari, Aashish N, Lizama, Carlos O, Mendelsohn, Bryce A, Fahl, Shawn P, Chellappan, Ajithavalli, Srinivasan, Rajgopal, Brenner, Steven E, Wiest, David L, and Puck, Jennifer M
- Subjects
Brain, Hematopoietic Stem Cells, Animals, Zebrafish, Humans, Abnormalities, Multiple, Severe Combined Immunodeficiency, Disease Models, Animal, Receptors, Antigen, T-Cell, Tumor Suppressor Proteins, Repressor Proteins, Magnetic Resonance Imaging, Neonatal Screening, Hematopoietic Stem Cell Transplantation, Cell Movement, Gene Expression Regulation, Mutation, Missense, Infant, Newborn, Male, In Vitro Techniques, Genetics, Stem Cell Research - Nonembryonic - Human, Transplantation, Regenerative Medicine, Stem Cell Research, Pediatric, Rare Diseases, Clinical Research, Infant Mortality, Pediatric Research Initiative, Inflammatory and immune system, Blood, Medical and Health Sciences, General & Internal Medicine
- Abstract
Background: Severe combined immunodeficiency (SCID) is characterized by arrested T-lymphocyte production and by B-lymphocyte dysfunction, which result in life-threatening infections. Early diagnosis of SCID through population-based screening of newborns can aid clinical management and help improve outcomes; it also permits the identification of previously unknown factors that are essential for lymphocyte development in humans. Methods: SCID was detected in a newborn before the onset of infections by means of screening of T-cell-receptor excision circles, a biomarker for thymic output. On confirmation of the condition, the affected infant was treated with allogeneic hematopoietic stem-cell transplantation. Exome sequencing in the patient and parents was followed by functional analysis of a prioritized candidate gene with the use of human hematopoietic stem cells and zebrafish embryos. Results: The infant had "leaky" SCID (i.e., a form of SCID in which a minimal degree of immune function is preserved), as well as craniofacial and dermal abnormalities and the absence of a corpus callosum; his immune deficit was fully corrected by hematopoietic stem-cell transplantation. Exome sequencing revealed a heterozygous de novo missense mutation, p.N441K, in BCL11B. The resulting BCL11B protein had dominant negative activity, which abrogated the ability of wild-type BCL11B to bind DNA, thereby arresting development of the T-cell lineage and disrupting hematopoietic stem-cell migration; this revealed a previously unknown function of BCL11B. The patient's abnormalities, when recapitulated in bcl11ba-deficient zebrafish, were reversed by ectopic expression of functionally intact human BCL11B but not mutant human BCL11B. Conclusions: Newborn screening facilitated the identification and treatment of a previously unknown cause of human SCID. Coupling exome sequencing with an evaluation of candidate genes in human hematopoietic stem cells and in zebrafish revealed that a constitutional BCL11B mutation caused human multisystem anomalies with SCID and also revealed a prethymic role for BCL11B in hematopoietic progenitors. (Funded by the National Institutes of Health and others.)
- Published
- 2016
30. Substrate specificity characterization for eight putative nudix hydrolases. Evaluation of criteria for substrate identification within the Nudix family
- Author
-
Nguyen, Vi N, Park, Annsea, Xu, Anting, Srouji, John R, Brenner, Steven E, and Kirsch, Jack F
- Subjects
Biochemistry and Cell Biology, Biological Sciences, Adenosine Diphosphate Ribose, Bacillus, Bacterial Proteins, Cloning, Molecular, Clostridium perfringens, Deoxyadenine Nucleotides, Deoxyguanine Nucleotides, Dinucleoside Phosphates, Escherichia coli, Gene Expression, Kinetics, Listeria, Multigene Family, Pyrophosphatases, Recombinant Proteins, Substrate Specificity, kinetics, physiological substrate, Nudix, substrate screening, Mathematical Sciences, Information and Computing Sciences, Bioinformatics, Biological sciences, Mathematical sciences
- Abstract
The nearly 50,000 known Nudix proteins have a diverse array of functions, of which the most extensively studied is the catalyzed hydrolysis of aberrant nucleotide triphosphates. The functions of 171 Nudix proteins have been characterized to some degree, although physiological relevance of the assayed activities has not always been conclusively demonstrated. We investigated substrate specificity for eight structurally characterized Nudix proteins, whose functions were unknown. These proteins were screened for hydrolase activity against a 74-compound library of known Nudix enzyme substrates. We found substrates for four enzymes with kcat/Km values >10,000 M^-1 s^-1: Q92EH0_LISIN of Listeria innocua serovar 6a against ADP-ribose, Q5LBB1_BACFN of Bacillus fragilis against 5-Me-CTP, and Q0TTC5_CLOP1 and Q0TS82_CLOP1 of Clostridium perfringens against 8-oxo-dATP and 3'-dGTP, respectively. To ascertain whether these identified substrates were physiologically relevant, we surveyed all reported Nudix hydrolytic activities against NTPs. Twenty-two Nudix enzymes are reported to have activity against canonical NTPs. With a single exception, we find that the reported kcat/Km values exhibited against these canonical substrates are well under 10^5 M^-1 s^-1. By contrast, several Nudix enzymes show much larger kcat/Km values (in the range of 10^5 to >10^7 M^-1 s^-1) against noncanonical NTPs. We therefore conclude that hydrolytic activities exhibited by these enzymes against canonical NTPs are not likely their physiological function, but rather the result of unavoidable collateral damage occasioned by the enzymes' inability to distinguish completely between similar substrate structures. Proteins 2016; 84:1810-1822. © 2016 Wiley Periodicals, Inc.
- Published
- 2016
31. Quantitative Tagless Copurification: A Method to Validate and Identify Protein-Protein Interactions
- Author
-
Shatsky, Maxim, Dong, Ming, Liu, Haichuan, Yang, Lee Lisheng, Choi, Megan, Singer, Mary E, Geller, Jil T, Fisher, Susan J, Hall, Steven C, Hazen, Terry C, Brenner, Steven E, Butland, Gareth, Jin, Jian, Witkowska, H Ewa, Chandonia, John-Marc, and Biggin, Mark D
- Subjects
Analytical Chemistry, Biological Sciences, Bioinformatics and Computational Biology, Chemical Sciences, Bacterial Proteins, Chromatography, Affinity, Desulfovibrio vulgaris, Escherichia coli, Mass Spectrometry, Protein Interaction Mapping, Protein Interaction Maps, Proteomics, Biochemistry & Molecular Biology
- Abstract
Identifying protein-protein interactions (PPIs) at an acceptable false discovery rate (FDR) is challenging. Previously we identified several hundred PPIs from affinity purification - mass spectrometry (AP-MS) data for the bacteria Escherichia coli and Desulfovibrio vulgaris. These two interactomes have lower FDRs than any of the nine interactomes proposed previously for bacteria and are more enriched in PPIs validated by other data than the nine earlier interactomes. To more thoroughly determine the accuracy of ours or other interactomes and to discover further PPIs de novo, here we present a quantitative tagless method that employs iTRAQ MS to measure the copurification of endogenous proteins through orthogonal chromatography steps. 5273 fractions from a four-step fractionation of a D. vulgaris protein extract were assayed, resulting in the detection of 1242 proteins. Protein partners from our D. vulgaris and E. coli AP-MS interactomes copurify as frequently as pairs belonging to three benchmark data sets of well-characterized PPIs. In contrast, the protein pairs from the nine other bacterial interactomes copurify two- to 20-fold less often. We also identify 200 high confidence D. vulgaris PPIs based on tagless copurification and colocalization in the genome. These PPIs are as strongly validated by other data as our AP-MS interactomes and overlap with our AP-MS interactome for D. vulgaris within 3% of expectation, once FDRs and false negative rates are taken into account. Finally, we reanalyzed data from two quantitative tagless screens of human cell extracts. We estimate that the novel PPIs reported in these studies have an FDR of at least 85% and find that less than 7% of the novel PPIs identified in each screen overlap. Our results establish that a quantitative tagless method can be used to validate and identify PPIs, but that such data must be analyzed carefully to minimize the FDR.
- Published
- 2016
32. Bacterial Interactomes: Interacting Protein Partners Share Similar Function and Are Validated in Independent Assays More Frequently Than Previously Reported
- Author
-
Shatsky, Maxim, Allen, Simon, Gold, Barbara L, Liu, Nancy L, Juba, Thomas R, Reveco, Sonia A, Elias, Dwayne A, Prathapam, Ramadevi, He, Jennifer, Yang, Wenhong, Szakal, Evelin D, Liu, Haichuan, Singer, Mary E, Geller, Jil T, Lam, Bonita R, Saini, Avneesh, Trotter, Valentine V, Hall, Steven C, Fisher, Susan J, Brenner, Steven E, Chhabra, Swapnil R, Hazen, Terry C, Wall, Judy D, Witkowska, H Ewa, Biggin, Mark D, Chandonia, John-Marc, and Butland, Gareth
- Subjects
Biochemistry and Cell Biology, Biological Sciences, Genetics, Bacterial Proteins, Chromatography, Affinity, Computational Biology, Databases, Protein, Desulfovibrio vulgaris, Escherichia coli, Mass Spectrometry, Protein Interaction Mapping, Protein Interaction Maps, Proteomics, Two-Hybrid System Techniques, Biochemistry & Molecular Biology
- Abstract
Numerous affinity purification-mass spectrometry (AP-MS) and yeast two-hybrid screens have each defined thousands of pairwise protein-protein interactions (PPIs), most of which are between functionally unrelated proteins. The accuracy of these networks, however, is under debate. Here, we present an AP-MS survey of the bacterium Desulfovibrio vulgaris together with a critical reanalysis of nine published bacterial yeast two-hybrid and AP-MS screens. We have identified 459 high confidence PPIs from D. vulgaris and 391 from Escherichia coli. Compared with the nine published interactomes, our two networks are smaller, are much less highly connected, and have significantly lower false discovery rates. In addition, our interactomes are much more enriched in protein pairs that are encoded in the same operon, have similar functions, and are reproducibly detected in other physical interaction assays than the pairs reported in prior studies. Our work establishes more stringent benchmarks for the properties of protein interactomes and suggests that bona fide PPIs much more frequently involve protein partners that are annotated with similar functions or that can be validated in independent assays than earlier studies suggested.
- Published
- 2016
33. A novel human autoimmune syndrome caused by combined hypomorphic and activating mutations in ZAP-70
- Author
-
Chan, Alice Y, Punwani, Divya, Kadlecek, Theresa A, Cowan, Morton J, Olson, Jean L, Mathes, Erin F, Sunderam, Uma, Fu, Shu Man, Srinivasan, Rajgopal, Kuriyan, John, Brenner, Steven E, Weiss, Arthur, and Puck, Jennifer M
- Subjects
Biomedical and Clinical Sciences, Pediatric, Genetics, Clinical Research, Rare Diseases, Autoimmune Disease, 2.1 Biological and endogenous factors, Inflammatory and immune system, Amino Acid Sequence, Animals, Autoimmune Diseases, Base Sequence, Cell Line, Child, Preschool, Female, Hematopoietic Stem Cell Transplantation, Hemophilia A, Heterozygote, Humans, Infant, Male, Mice, Models, Molecular, Molecular Sequence Data, Mutant Proteins, Mutation, Missense, Pedigree, Pemphigoid, Bullous, Phenotype, Protein Conformation, Receptors, Antigen, T-Cell, Severe Combined Immunodeficiency, Siblings, Syndrome, T-Lymphocytes, Transplantation, Homologous, ZAP-70 Protein-Tyrosine Kinase, Medical and Health Sciences, Immunology, Biomedical and clinical sciences, Health sciences
- Abstract
A brother and sister developed a previously undescribed constellation of autoimmune manifestations within their first year of life, with uncontrollable bullous pemphigoid, colitis, and proteinuria. The boy had hemophilia due to a factor VIII autoantibody and nephrotic syndrome. Both children required allogeneic hematopoietic cell transplantation (HCT), which resolved their autoimmunity. The early onset, severity, and distinctive findings suggested a single gene disorder underlying the phenotype. Whole-exome sequencing performed on five family members revealed the affected siblings to be compound heterozygous for two unique missense mutations in the 70-kD T cell receptor ζ-chain associated protein (ZAP-70). Healthy relatives were heterozygous mutation carriers. Although pre-HCT patient T cells were not available, mutation effects were determined using transfected cell lines and peripheral blood from carriers and controls. Mutation R192W in the C-SH2 domain exhibited reduced binding to phosphorylated ζ-chain, whereas mutation R360P in the N lobe of the catalytic domain disrupted an autoinhibitory mechanism, producing a weakly hyperactive ZAP-70 protein. Although human ZAP-70 deficiency can have dysregulated T cells, and autoreactive mouse thymocytes with weak Zap-70 signaling can escape tolerance, our patients' combination of hypomorphic and activating mutations suggested a new disease mechanism and produced previously undescribed human ZAP-70-associated autoimmune disease.
- Published
- 2016
34. The DOE Systems Biology Knowledgebase (KBase)
- Author
-
Arkin, Adam P, Stevens, Rick L, Cottingham, Robert W, Maslov, Sergei, Henry, Christopher S, Dehal, Paramvir, Ware, Doreen, Perez, Fernando, Harris, Nomi L, Canon, Shane, Sneddon, Michael W, Henderson, Matthew L, Riehl, William J, Gunter, Dan, Murphy-Olson, Dan, Chan, Stephen, Kamimura, Roy T, Brettin, Thomas S, Meyer, Folker, Chivian, Dylan, Weston, David J, Glass, Elizabeth M, Davison, Brian H, Kumari, Sunita, Allen, Benjamin H, Baumohl, Jason, Best, Aaron A, Bowen, Ben, Brenner, Steven E, Bun, Christopher C, Chandonia, John-Marc, Chia, Jer-Ming, Colasanti, Ric, Conrad, Neal, Davis, James J, DeJongh, Matthew, Devoid, Scott, Dietrich, Emily, Drake, Meghan M, Dubchak, Inna, Edirisinghe, Janaka N, Fang, Gang, Faria, José P, Frybarger, Paul M, Gerlach, Wolfgang, Gerstein, Mark, Gurtowski, James, Haun, Holly L, He, Fei, Jain, Rashmi, Joachimiak, Marcin P, Keegan, Kevin P, Kondo, Shinnosuke, Kumar, Vivek, Land, Miriam L, Mills, Marissa, Novichkov, Pavel, Oh, Taeyun, Olsen, Gary J, Olson, Bob, Parrello, Bruce, Pasternak, Shiran, Pearson, Erik, Poon, Sarah S, Price, Gavin A, Ramakrishnan, Srividya, Ranjan, Priya, Ronald, Pamela C, Schatz, Michael C, Seaver, Samuel MD, Shukla, Maulik, Sutormin, Roman A, Syed, Mustafa H, Thomason, James, Tintle, Nathan L, Wang, Daifeng, Xia, Fangfang, Yoo, Hyunseung, and Yoo, Shinjae
- Subjects
Bioengineering ,Genetics ,Human Genome ,Networking and Information Technology R&D (NITRD) ,Generic health relevance
- Abstract
The U.S. Department of Energy Systems Biology Knowledgebase (KBase) is an open-source software and data platform designed to meet the grand challenge of systems biology — predicting and designing biological function from the biomolecular (small scale) to the ecological (large scale). KBase is available for anyone to use, and enables researchers to collaboratively generate, test, compare, and share hypotheses about biological functions; perform large-scale analyses on scalable computing infrastructure; and combine experimental evidence and conclusions that lead to accurate models of plant and microbial physiology and community dynamics. The KBase platform has (1) extensible analytical capabilities that currently include genome assembly, annotation, ontology assignment, comparative genomics, transcriptomics, and metabolic modeling; (2) a web-browser-based user interface that supports building, sharing, and publishing reproducible and well-annotated analyses with integrated data; (3) access to extensive computational resources; and (4) a software development kit allowing the community to add functionality to the system.
- Published
- 2016
35. Multiple breast cancer risk variants are associated with differential transcript isoform expression in tumors
- Author
-
Caswell, Jennifer L, Camarda, Roman, Zhou, Alicia Y, Huntsman, Scott, Hu, Donglei, Brenner, Steven E, Zaitlen, Noah, Goga, Andrei, and Ziv, Elad
- Subjects
Biological Sciences ,Bioinformatics and Computational Biology ,Genetics ,Clinical Research ,Breast Cancer ,Women's Health ,Health Disparities ,Prevention ,Cancer ,Cancer Genomics ,Human Genome ,Minority Health ,2.1 Biological and endogenous factors ,Alternative Splicing ,Bayes Theorem ,Breast Neoplasms ,Cell Line ,Genetic Predisposition to Disease ,Humans ,Polymorphism ,Single Nucleotide ,Protein Isoforms ,Quantitative Trait Loci ,Medical and Health Sciences ,Genetics & Heredity
- Abstract
Genome-wide association studies have identified over 70 single-nucleotide polymorphisms (SNPs) associated with breast cancer. A subset of these SNPs are associated with quantitative expression of nearby genes, but the functional effects of the majority remain unknown. We hypothesized that some risk SNPs may regulate alternative splicing. Using RNA-sequencing data from breast tumors and germline genotypes from The Cancer Genome Atlas, we tested the association between each risk SNP genotype and exon-, exon-exon junction- or transcript-specific expression of nearby genes. Six SNPs were associated with differential transcript expression of seven nearby genes at FDR < 0.05 (BABAM1, DCLRE1B/PHTF1, PEX14, RAD51L1, SRGAP2D and STXBP4). We next developed a Bayesian approach to evaluate, for each SNP, the overlap between the signal of association with breast cancer and the signal of association with alternative splicing. At one locus (SRGAP2D), this method eliminated the possibility that the breast cancer risk and the alternate splicing event were due to the same causal SNP. Lastly, at two loci, we identified the likely causal SNP for the alternative splicing event, and at one, functionally validated the effect of that SNP on alternative splicing using a minigene reporter assay. Our results suggest that the regulation of differential transcript isoform expression is the functional mechanism of some breast cancer risk SNPs and that we can use these associations to identify causal SNPs, target genes and the specific transcripts that may mediate breast cancer risk.
- Published
- 2015
36. Regulation of alternative splicing in Drosophila by 56 RNA binding proteins
- Author
-
Brooks, Angela N, Duff, Michael O, May, Gemma, Yang, Li, Bolisetty, Mohan, Landolin, Jane, Wan, Ken, Sandler, Jeremy, Booth, Benjamin W, Celniker, Susan E, Graveley, Brenton R, and Brenner, Steven E
- Subjects
Biological Sciences ,Bioinformatics and Computational Biology ,Genetics ,1.1 Normal biological development and functioning ,Generic health relevance ,Alternative Splicing ,Animals ,Drosophila ,Drosophila Proteins ,Exons ,Heterogeneous-Nuclear Ribonucleoproteins ,RNA Interference ,RNA Precursors ,RNA-Binding Proteins ,Sequence Analysis ,RNA ,TATA-Binding Protein Associated Factors ,Medical and Health Sciences ,Bioinformatics
- Abstract
Alternative splicing is regulated by RNA binding proteins (RBPs) that recognize pre-mRNA sequence elements and activate or repress adjacent exons. Here, we used RNA interference and RNA-seq to identify splicing events regulated by 56 Drosophila proteins, some previously unknown to regulate splicing. Nearly all proteins affected alternative first exons, suggesting that RBPs play important roles in first exon choice. Half of the splicing events were regulated by multiple proteins, demonstrating extensive combinatorial regulation. We observed that SR and hnRNP proteins tend to act coordinately with each other, not antagonistically. We also identified a cross-regulatory network where splicing regulators affected the splicing of pre-mRNAs encoding other splicing regulators. This large-scale study substantially enhances our understanding of recent models of splicing regulation and provides a resource of thousands of exons that are regulated by 56 diverse RBPs.
- Published
- 2015
37. The value of protein structure classification information—Surveying the scientific literature
- Author
-
Fox, Naomi K, Brenner, Steven E, and Chandonia, John‐Marc
- Subjects
Biological Sciences ,Bioinformatics and Computational Biology ,1.5 Resources and infrastructure (underpinning) ,Algorithms ,Computational Biology ,Databases ,Protein ,Protein Conformation ,Proteins ,SCOP ,CATH ,database ,curation ,resources ,Mathematical Sciences ,Information and Computing Sciences ,Bioinformatics ,Biological sciences ,Mathematical sciences
- Abstract
The Structural Classification of Proteins (SCOP) and Class, Architecture, Topology, Homology (CATH) databases have been valuable resources for protein structure classification for over 20 years. Development of SCOP (version 1) concluded in June 2009 with SCOP 1.75. The SCOPe (SCOP-extended) database offers continued development of the classic SCOP hierarchy, adding over 33,000 structures. We have attempted to assess the impact of these two decade old resources and guide future development. To this end, we surveyed recent articles to learn how structure classification data are used. Of 571 articles published in 2012-2013 that cite SCOP, 439 actually use data from the resource. We found that the type of use was fairly evenly distributed among four top categories: A) study protein structure or evolution (27% of articles), B) train and/or benchmark algorithms (28% of articles), C) augment non-SCOP datasets with SCOP classification (21% of articles), and D) examine the classification of one protein/a small set of proteins (22% of articles). Most articles described computational research, although 11% described purely experimental research, and a further 9% included both. We examined how CATH and SCOP were used in 158 articles that cited both databases: while some studies used only one dataset, the majority used data from both resources. Protein structure classification remains highly relevant for a diverse range of problems and settings.
- Published
- 2015
38. SIFTER search: a web server for accurate phylogeny-based protein function prediction
- Author
-
Sahraeian, Sayed M, Luo, Kevin R, and Brenner, Steven E
- Subjects
Bioengineering ,Networking and Information Technology R&D (NITRD) ,Generic health relevance ,Algorithms ,Internet ,Models ,Statistical ,Molecular Sequence Annotation ,Phylogeny ,Protein Structure ,Tertiary ,Proteins ,Sequence Homology ,Amino Acid ,Software ,Environmental Sciences ,Biological Sciences ,Information and Computing Sciences ,Developmental Biology
- Abstract
We are awash in proteins discovered through high-throughput sequencing projects. As only a minuscule fraction of these have been experimentally characterized, computational methods are widely used for automated annotation. Here, we introduce a user-friendly web interface for accurate protein function prediction using the SIFTER algorithm. SIFTER is a state-of-the-art sequence-based gene molecular function prediction algorithm that uses a statistical model of function evolution to incorporate annotations throughout the phylogenetic tree. Due to the resources needed by the SIFTER algorithm, running SIFTER locally is not trivial for most users, especially for large-scale problems. The SIFTER web server thus provides access to precomputed predictions on 16 863 537 proteins from 232 403 species. Users can explore SIFTER predictions with queries for proteins, species, functions, and homologs of sequences not in the precomputed prediction set. The SIFTER web server is accessible at http://sifter.berkeley.edu/ and the source code can be downloaded.
- Published
- 2015
39. Ten Years of PLoS Computational Biology: A Decade of Appreciation and Innovation.
- Author
-
Bourne, Philip E, Brenner, Steven E, and Eisen, Michael B
- Subjects
Humans ,Computational Biology ,Periodicals as Topic ,Mathematical Sciences ,Biological Sciences ,Information and Computing Sciences ,Bioinformatics
- Published
- 2015
40. Regulation of Splicing Factors by Alternative Splicing and NMD Is Conserved between Kingdoms Yet Evolutionarily Flexible
- Author
-
Lareau, Liana F and Brenner, Steven E
- Subjects
Biotechnology ,Genetics ,Generic health relevance ,Alternative Splicing ,Animals ,Base Sequence ,Evolution ,Molecular ,Fungi ,Humans ,Molecular Sequence Data ,Nonsense Mediated mRNA Decay ,RNA ,Messenger ,Serine-Arginine Splicing Factors ,alternative splicing ,nonsense mediated decay ,ultraconserved elements ,Biochemistry and Cell Biology ,Evolutionary Biology
- Abstract
Ultraconserved elements, unusually long regions of perfect sequence identity, are found in genes encoding numerous RNA-binding proteins including arginine-serine rich (SR) splicing factors. Expression of these genes is regulated via alternative splicing of the ultraconserved regions to yield mRNAs that are degraded by nonsense-mediated mRNA decay (NMD), a process termed unproductive splicing (Lareau et al. 2007; Ni et al. 2007). As all human SR genes are affected by alternative splicing and NMD, one might expect this regulation to have originated in an early SR gene and persisted as duplications expanded the SR family. But in fact, unproductive splicing of most human SR genes arose independently (Lareau et al. 2007). This paradox led us to investigate the origin and proliferation of unproductive splicing in SR genes. We demonstrate that unproductive splicing of the splicing factor SRSF5 (SRp40) is conserved among all animals and even observed in fungi; this is a rare example of alternative splicing conserved between kingdoms, yet its effect is to trigger mRNA degradation. As the gene duplicated, the ancient unproductive splicing was lost in paralogs, and distinct unproductive splicing evolved rapidly and repeatedly to take its place. SR genes have consistently employed unproductive splicing, and while it is exceptionally conserved in some of these genes, turnover in specific events among paralogs shows flexible means to the same regulatory end.
- Published
- 2015
41. Nijmegen breakage syndrome detected by newborn screening for T cell receptor excision circles (TRECs).
- Author
-
Patel, Jay P, Puck, Jennifer M, Srinivasan, Rajgopal, Brown, Christina, Sunderam, Uma, Kundu, Kunal, Brenner, Steven E, Gatti, Richard A, and Church, Joseph A
- Subjects
T-Lymphocytes ,Humans ,Cell Cycle Proteins ,Receptors ,Antigen ,T-Cell ,Nuclear Proteins ,DNA ,Circular ,Neonatal Screening ,Gene Rearrangement ,T-Lymphocyte ,Infant ,Infant ,Newborn ,Male ,Nijmegen Breakage Syndrome ,High-Throughput Nucleotide Sequencing ,Exome ,exome sequencing ,nibrin ,Nijmegen breakage syndrome ,SCID ,T lymphopenia ,TREC ,Immunology
- Abstract
Purpose: Severe combined immunodeficiency (SCID) encompasses a group of disorders characterized by reduced or absent T-cell number and function and identified by newborn screening utilizing T-cell receptor excision circles (TRECs). This screening has also identified infants with T lymphopenia who lack mutations in typical SCID genes. We report an infant with low TRECs and non-SCID T lymphopenia, who proved upon whole exome sequencing to have Nijmegen breakage syndrome (NBS). Methods: Exome sequencing of DNA from the infant and his parents was performed. Genomic analysis revealed deleterious variants in the NBN gene. Confirmatory testing included Sanger sequencing and immunoblotting and radiosensitivity testing of patient lymphocytes. Results: Two novel nonsense mutations in NBN were identified in genomic DNA from the family. Immunoblotting showed absence of nibrin protein. A colony survival assay demonstrated radiosensitivity comparable to patients with ataxia telangiectasia. Conclusions: Although TREC screening was developed to identify newborns with SCID, it has also identified T lymphopenic disorders that may not otherwise be diagnosed until later in life. Timely identification of an infant with T lymphopenia allowed for prompt pursuit of underlying etiology, making possible a diagnosis of NBS, genetic counseling, and early intervention to minimize complications.
- Published
- 2015
42. Combined Immunodeficiency Due to MALT1 Mutations, Treated by Hematopoietic Cell Transplantation
- Author
-
Punwani, Divya, Wang, Haopeng, Chan, Alice Y, Cowan, Morton J, Mallott, Jacob, Sunderam, Uma, Mollenauer, Marianne, Srinivasan, Rajgopal, Brenner, Steven E, Mulder, Arend, Claas, Frans HJ, Weiss, Arthur, and Puck, Jennifer M
- Subjects
Biomedical and Clinical Sciences ,Immunology ,Pediatric ,Genetics ,Biotechnology ,Regenerative Medicine ,Transplantation ,Clinical Research ,Rare Diseases ,2.1 Biological and endogenous factors ,5.2 Cellular and gene therapies ,Inflammatory and immune system ,Adult ,Amino Acid Sequence ,B-Lymphocytes ,Base Sequence ,Caspases ,Cell Line ,Transformed ,Child ,Child ,Preschool ,DNA Mutational Analysis ,Female ,Gene Expression ,Hematopoietic Stem Cell Transplantation ,Humans ,Immunophenotyping ,Infant ,Infant ,Newborn ,Leukocytes ,Mononuclear ,Male ,Mucosa-Associated Lymphoid Tissue Lymphoma Translocation 1 Protein ,Mutation ,NF-kappa B ,Neoplasm Proteins ,RNA ,Messenger ,Severe Combined Immunodeficiency ,Signal Transduction ,Skin ,Transplantation Chimera ,Transplantation ,Homologous ,BCL10 ,bone marrow transplant/hematopoietic cell transplant ,CARD11 ,CARMA1 ,combined immunodeficiency ,erythroderma ,immune dysregulation
- Abstract
Purpose: A male infant developed generalized rash, intestinal inflammation and severe infections including persistent cytomegalovirus. Family history was negative, T cell receptor excision circles were normal, and engraftment of maternal cells was absent. No defects were found in multiple genes associated with severe combined immunodeficiency. A 9/10 HLA matched unrelated hematopoietic cell transplant (HCT) led to mixed chimerism with clinical resolution. We sought an underlying cause for this patient's immune deficiency and dysregulation. Methods: Clinical and laboratory features were reviewed. Whole exome sequencing and analysis of genomic DNA from the patient, parents and 2 unaffected siblings was performed, revealing 2 MALT1 variants. With a host-specific HLA-C antibody, we assessed MALT1 expression and function in the patient's post-HCT autologous and donor lymphocytes. Wild type MALT1 cDNA was added to transformed autologous patient B cells to assess functional correction. Results: The patient had compound heterozygous DNA variants affecting exon 10 of MALT1 (isoform a, NM_006785.3), a maternally inherited splice acceptor c.1019-2A > G, and a de novo deletion of c.1059C leading to a frameshift and premature termination. Autologous lymphocytes failed to express MALT1 and lacked NF-κB signaling dependent upon the CARMA1, BCL-10 and MALT1 signalosome. Transduction with wild type MALT1 cDNA corrected the observed defects. Conclusions: Our nonconsanguineous patient with early onset profound combined immunodeficiency and immune dysregulation due to compound heterozygous MALT1 mutations extends the clinical and immunologic phenotype reported in 2 prior families. Clinical cure was achieved with mixed chimerism after nonmyeloablative conditioning and HCT.
- Published
- 2015
43. Publisher Correction: Application of full-genome analysis to diagnose rare monogenic disorders
- Author
-
Shieh, Joseph T., Penon-Portmann, Monica, Wong, Karen H. Y., Levy-Sakin, Michal, Verghese, Michelle, Slavotinek, Anne, Gallagher, Renata C., Mendelsohn, Bryce A., Tenney, Jessica, Beleford, Daniah, Perry, Hazel, Chow, Stephen K., Sharo, Andrew G., Brenner, Steven E., Qi, Zhongxia, Yu, Jingwei, Klein, Ophir D., Martin, David, Kwok, Pui-Yan, and Boffelli, Dario
- Published
- 2021
44. Assessment of the evidence yield for the calibrated PP3/BP4 computational recommendations
- Author
-
Biesecker, Leslie G., Harrison, Steven M., Tayoun, Ahmad A., Berg, Jonathan S., Brenner, Steven E., Cutting, Garry R., Ellard, Sian, Greenblatt, Marc S., Kang, Peter, Karbassi, Izabela, Karchin, Rachel, Mester, Jessica, O’Donnell-Luria, Anne, Pesaran, Tina, Plon, Sharon E., Rehm, Heidi L., Strande, Natasha T., Tavtigian, Sean V., Topper, Scott, Stenton, Sarah L., Pejaver, Vikas, Bergquist, Timothy, Byrne, Alicia B., Nadeau, Emily A.W., and Radivojac, Predrag
- Published
- 2024
45. PERSONALIZED MEDICINE: FROM GENOTYPES, MOLECULAR PHENOTYPES AND THE QUANTIFIED SELF, TOWARDS IMPROVED MEDICINE
- Author
-
Altman, Russ B, Dunker, A Keith, Hunter, Lawrence, Ritchie, Marylyn D, Murray, Tiffany A, Klein, Teri E, Dudley, Joel T, Listgarten, Jennifer, Stegle, Oliver, Brenner, Steven E, and Parts, Leopold
- Subjects
Clinical Research ,Bioengineering ,Genetics ,Networking and Information Technology R&D (NITRD) ,Generic health relevance ,Good Health and Well Being ,Computational Biology ,Genotype ,Humans ,Patient-Specific Modeling ,Phenotype ,Precision Medicine ,Systems Biology
- Abstract
Advances in molecular profiling and sensor technologies are expanding the scope of personalized medicine beyond genotypes, providing new opportunities for developing richer and more dynamic multi-scale models of individual health. Recent studies demonstrate the value of scoring high-dimensional microbiome, immune, and metabolic traits from individuals to inform personalized medicine. Efforts to integrate multiple dimensions of clinical and molecular data towards predictive multi-scale models of individual health and wellness are already underway. Improved methods for mining and discovery of clinical phenotypes from electronic medical records and technological developments in wearable sensor technologies present new opportunities for mapping and exploring the critical yet poorly characterized "phenome" and "envirome" dimensions of personalized medicine. There are ambitious new projects underway to collect multi-scale molecular, sensor, clinical, behavioral, and environmental data streams from large population cohorts longitudinally to enable more comprehensive and dynamic models of individual biology and personalized health. Personalized medicine stands to benefit from inclusion of rich new sources and dimensions of data. However, realizing these improvements in care relies upon novel informatics methodologies, tools, and systems to make full use of these data to advance both the science and translational applications of personalized medicine.
- Published
- 2014
46. Comparative analysis of regulatory information and circuits across distant species
- Author
-
Boyle, Alan P, Araya, Carlos L, Brdlik, Cathleen, Cayting, Philip, Cheng, Chao, Cheng, Yong, Gardner, Kathryn, Hillier, LaDeana W, Janette, Judith, Jiang, Lixia, Kasper, Dionna, Kawli, Trupti, Kheradpour, Pouya, Kundaje, Anshul, Li, Jingyi Jessica, Ma, Lijia, Niu, Wei, Rehm, E Jay, Rozowsky, Joel, Slattery, Matthew, Spokony, Rebecca, Terrell, Robert, Vafeados, Dionne, Wang, Daifeng, Weisdepp, Peter, Wu, Yi-Chieh, Xie, Dan, Yan, Koon-Kiu, Feingold, Elise A, Good, Peter J, Pazin, Michael J, Huang, Haiyan, Bickel, Peter J, Brenner, Steven E, Reinke, Valerie, Waterston, Robert H, Gerstein, Mark, White, Kevin P, Kellis, Manolis, and Snyder, Michael
- Subjects
Human Genome ,Genetics ,1.1 Normal biological development and functioning ,Underpinning research ,Generic health relevance ,Animals ,Binding Sites ,Caenorhabditis elegans ,Chromatin Immunoprecipitation ,Conserved Sequence ,Drosophila melanogaster ,Evolution ,Molecular ,Gene Expression Regulation ,Gene Expression Regulation ,Developmental ,Gene Regulatory Networks ,Genome ,Humans ,Molecular Sequence Annotation ,Nucleotide Motifs ,Organ Specificity ,Transcription Factors ,General Science & Technology
- Abstract
Despite the large evolutionary distances between metazoan species, they can show remarkable commonalities in their biology, and this has helped to establish fly and worm as model organisms for human biology. Although studies of individual elements and factors have explored similarities in gene regulation, a large-scale comparative analysis of basic principles of transcriptional regulatory features is lacking. Here we map the genome-wide binding locations of 165 human, 93 worm and 52 fly transcription regulatory factors, generating a total of 1,019 data sets from diverse cell types, developmental stages, or conditions in the three species, of which 498 (48.9%) are presented here for the first time. We find that structural properties of regulatory networks are remarkably conserved and that orthologous regulatory factor families recognize similar binding motifs in vivo and show some similar co-associations. Our results suggest that gene-regulatory properties previously observed for individual factors are general principles of metazoan regulation that are remarkably well-preserved despite extensive functional divergence of individual network connections. The comparative maps of regulatory circuitry provided here will drive an improved understanding of the regulatory underpinnings of model organism biology and how these relate to human biology, development and disease.
- Published
- 2014
47. Comparative analysis of the transcriptome across distant species.
- Author
-
Gerstein, Mark B, Rozowsky, Joel, Yan, Koon-Kiu, Wang, Daifeng, Cheng, Chao, Brown, James B, Davis, Carrie A, Hillier, LaDeana, Sisu, Cristina, Li, Jingyi Jessica, Pei, Baikang, Harmanci, Arif O, Duff, Michael O, Djebali, Sarah, Alexander, Roger P, Alver, Burak H, Auerbach, Raymond, Bell, Kimberly, Bickel, Peter J, Boeck, Max E, Boley, Nathan P, Booth, Benjamin W, Cherbas, Lucy, Cherbas, Peter, Di, Chao, Dobin, Alex, Drenkow, Jorg, Ewing, Brent, Fang, Gang, Fastuca, Megan, Feingold, Elise A, Frankish, Adam, Gao, Guanjun, Good, Peter J, Guigó, Roderic, Hammonds, Ann, Harrow, Jen, Hoskins, Roger A, Howald, Cédric, Hu, Long, Huang, Haiyan, Hubbard, Tim JP, Huynh, Chau, Jha, Sonali, Kasper, Dionna, Kato, Masaomi, Kaufman, Thomas C, Kitchen, Robert R, Ladewig, Erik, Lagarde, Julien, Lai, Eric, Leng, Jing, Lu, Zhi, MacCoss, Michael, May, Gemma, McWhirter, Rebecca, Merrihew, Gennifer, Miller, David M, Mortazavi, Ali, Murad, Rabi, Oliver, Brian, Olson, Sara, Park, Peter J, Pazin, Michael J, Perrimon, Norbert, Pervouchine, Dmitri, Reinke, Valerie, Reymond, Alexandre, Robinson, Garrett, Samsonova, Anastasia, Saunders, Gary I, Schlesinger, Felix, Sethi, Anurag, Slack, Frank J, Spencer, William C, Stoiber, Marcus H, Strasbourger, Pnina, Tanzer, Andrea, Thompson, Owen A, Wan, Kenneth H, Wang, Guilin, Wang, Huaien, Watkins, Kathie L, Wen, Jiayu, Wen, Kejia, Xue, Chenghai, Yang, Li, Yip, Kevin, Zaleski, Chris, Zhang, Yan, Zheng, Henry, Brenner, Steven E, Graveley, Brenton R, Celniker, Susan E, Gingeras, Thomas R, and Waterston, Robert
- Subjects
Chromatin ,Animals ,Humans ,Drosophila melanogaster ,Caenorhabditis elegans ,Histones ,RNA ,Untranslated ,Cluster Analysis ,Gene Expression Profiling ,Sequence Analysis ,RNA ,Gene Expression Regulation ,Developmental ,Larva ,Pupa ,Models ,Genetic ,Promoter Regions ,Genetic ,Molecular Sequence Annotation ,Transcriptome ,Genetics ,Human Genome ,Generic health relevance ,General Science & Technology
- Abstract
The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.
- Published
- 2014
48. Automated particle correspondence and accurate tilt-axis detection in tilted-image pairs
- Author
-
Shatsky, Maxim, Arbelaez, Pablo, Han, Bong-Gyoon, Typke, Dieter, Brenner, Steven E, Malik, Jitendra, and Glaeser, Robert M
- Subjects
Biochemistry and Cell Biology ,Biological Sciences ,Cryoelectron Microscopy ,Desulfovibrio vulgaris ,Escherichia coli ,IMP Dehydrogenase ,Image Processing ,Computer-Assisted ,Imaging ,Three-Dimensional ,Ribosomes ,Particle correspondence ,Tilted pairs ,Tilt-axis detection ,Zoology ,Biophysics ,Biochemistry and cell biology
- Abstract
Tilted electron microscope images are routinely collected for an ab initio structure reconstruction as a part of the Random Conical Tilt (RCT) or Orthogonal Tilt Reconstruction (OTR) methods, as well as for various applications using the "free-hand" procedure. These procedures all require identification of particle pairs in two corresponding images as well as accurate estimation of the tilt-axis used to rotate the electron microscope (EM) grid. Here we present a computational approach, PCT (particle correspondence from tilted pairs), based on tilt-invariant context and projection matching that addresses both problems. The method benefits from treating the two problems as a single optimization task. It automatically finds corresponding particle pairs and accurately computes tilt-axis direction even in the cases when EM grid is not perfectly planar.
- Published
- 2014
49. Comparison of D. melanogaster and C. elegans developmental stages, tissues, and cells by modENCODE RNA-seq data
- Author
-
Li, Jingyi Jessica, Huang, Haiyan, Bickel, Peter J, and Brenner, Steven E
- Subjects
Genetics ,Generic health relevance ,Animals ,Caenorhabditis elegans ,Cluster Analysis ,Computational Biology ,Drosophila melanogaster ,Embryonic Development ,Female ,Gene Expression Profiling ,Gene Expression Regulation ,Developmental ,High-Throughput Nucleotide Sequencing ,Life Cycle Stages ,Male ,Molecular Sequence Annotation ,Organ Specificity ,Transcriptome ,Biological Sciences ,Medical and Health Sciences ,Bioinformatics
- Abstract
We report a statistical study to discover transcriptome similarity of developmental stages from D. melanogaster and C. elegans using modENCODE RNA-seq data. We focus on "stage-associated genes" that capture specific transcriptional activities in each stage and use them to map pairwise stages within and between the two species by a hypergeometric test. Within each species, temporally adjacent stages exhibit high transcriptome similarity, as expected. Additionally, fly female adults and worm adults are mapped with fly and worm embryos, respectively, due to maternal gene expression. Between fly and worm, an unexpected strong collinearity is observed in the time course from early embryos to late larvae. Moreover, a second parallel pattern is found between fly prepupae through adults and worm late embryos through adults, consistent with the second large wave of cell proliferation and differentiation in the fly life cycle. The results indicate a partially duplicated developmental program in fly. Our results constitute the first comprehensive comparison between D. melanogaster and C. elegans developmental time courses and provide new insights into similarities in their development. We use an analogous approach to compare tissues and cells from fly and worm. Findings include strong transcriptome similarity of fly cell lines, clustering of fly adult tissues by origin regardless of sex and age, and clustering of worm tissues and dissected cells by developmental stage. Gene ontology analysis supports our results and gives a detailed functional annotation of different stages, tissues and cells. Finally, we show that standard correlation analyses could not effectively detect the mappings found by our method.
- Published
- 2014
50. Pairwise alignment incorporating dipeptide covariation
- Author
-
Crooks, Gavin E., Green, Richard E., and Brenner, Steven E.
- Subjects
Quantitative Biology - Biomolecules ,Quantitative Biology - Populations and Evolution
- Abstract
Motivation: Standard algorithms for pairwise protein sequence alignment make the simplifying assumption that amino acid substitutions at neighboring sites are uncorrelated. This assumption allows implementation of fast algorithms for pairwise sequence alignment, but it ignores information that could conceivably increase the power of remote homolog detection. We examine the validity of this assumption by constructing extended substitution matrixes that encapsulate the observed correlations between neighboring sites, by developing an efficient and rigorous algorithm for pairwise protein sequence alignment that incorporates these local substitution correlations, and by assessing the ability of this algorithm to detect remote homologies. Results: Our analysis indicates that local correlations between substitutions are not strong on the average. Furthermore, incorporating local substitution correlations into pairwise alignment did not lead to a statistically significant improvement in remote homology detection. Therefore, the standard assumption that individual residues within protein sequences evolve independently of neighboring positions appears to be an efficient and appropriate approximation.
- Published
- 2005