48 results on '"Bakolitsa C"'
Search Results
2. Correlated firing among major ganglion cell types in primate retina
- Author
-
Greschner, M., primary, Shlens, J., additional, Bakolitsa, C., additional, Field, G. D., additional, Gauthier, J. L., additional, Jepson, L. H., additional, Sher, A., additional, Litke, A. M., additional, and Chichilnisky, E. J., additional
- Published
- 2010
3. Crystal structure of a cytoskeletal protein
- Author
-
Bakolitsa, C., primary and Liddington, R.C., additional
- Published
- 2004
4. Crystal structure of the vinculin tail suggests a pathway for activation
- Author
-
Bakolitsa, C., Jose M de Pereda, Bagshaw, C. R., Critchley, D. R., and Liddington, R. C.
5. TOPSAN: a collaborative annotation environment for structural genomics
- Author
-
Weekes Dana, Krishna S, Bakolitsa Constantina, Wilson Ian A, Godzik Adam, and Wooley John
- Subjects
Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5
- Abstract
Background: Many protein structures determined in high-throughput structural genomics centers, despite their significant novelty and importance, are available only as PDB depositions and are not accompanied by a peer-reviewed manuscript. Because of this they are not accessible by the standard tools of literature searches, remaining underutilized by the broad biological community. Results: To address this issue we have developed TOPSAN, The Open Protein Structure Annotation Network, a web-based platform that combines the openness of the wiki model with the quality control of scientific communication. TOPSAN enables research collaborations and scientific dialogue among globally distributed participants, the results of which are reviewed by experts and eventually validated by peer review. The immediate goal of TOPSAN is to harness the combined experience, knowledge, and data from such collaborations in order to enhance the impact of the astonishing number and diversity of structures being determined by structural genomics centers and high-throughput structural biology. Conclusions: TOPSAN combines features of automated annotation databases and formal, peer-reviewed scientific research literature, providing an ideal vehicle to bridge a gap between rapidly accumulating data from high-throughput technologies and a much slower pace for its analysis and integration with other, relevant research.
- Published
- 2010
6. CAGI SickKids challenges: Assessment of phenotype and variant predictions derived from clinical and genomic data of children with undiagnosed diseases
- Author
-
Zhiqiang Hu, Jesse M. Hunter, Olivier Lichtarge, Sean D. Mooney, Aashish N. Adhikari, Steven E. Brenner, Rita Casadio, Yizhou Yin, Lipika R. Pal, Uma Sunderam, Panagiotis Katsonis, Predrag Radivojac, Thomas Joseph, Giulia Babbi, Naveen Sivadasan, Constantina Bakolitsa, Vangala G. Saipradeep, Laura Kasak, John Moult, Julian Gough, M. Stephen Meyn, Pier Luigi Martelli, Jennifer Poitras, Rupa A Udani, Jan Zaucha, Rafael F. Guerrero, Yuxiang Jiang, Aditya Rao, Sujatha Kotte, Kunal Kundu, Kasak L., Hunter J.M., Udani R., Bakolitsa C., Hu Z., Adhikari A.N., Babbi G., Casadio R., Gough J., Guerrero R.F., Jiang Y., Joseph T., Katsonis P., Kotte S., Kundu K., Lichtarge O., Martelli P.L., Mooney S.D., Moult J., Pal L.R., Poitras J., Radivojac P., Rao A., Sivadasan N., Sunderam U., Saipradeep V.G., Yin Y., Zaucha J., Brenner S.E., and Meyn M.S.
- Subjects
Male ,Adolescent ,In silico ,Genomic data ,Computational biology ,Biology ,Undiagnosed Diseases ,Genome ,Article ,03 medical and health sciences ,Databases, Genetic ,SickKid ,pediatric rare disease ,Genetics ,Humans ,Computer Simulation ,Genetic Predisposition to Disease ,Child ,Gene ,Genetics (clinical) ,030304 developmental biology ,Disease gene ,0303 health sciences ,Whole Genome Sequencing ,variant interpretation ,030305 genetics & heredity ,Computational Biology ,Genetic Variation ,Pathogenicity ,Phenotype ,ddc ,phenotype prediction ,Child, Preschool ,New disease ,CAGI ,Female ,whole-genome sequencing data
- Abstract
Whole-genome sequencing (WGS) holds great potential as a diagnostic test. However, the majority of patients currently undergoing WGS lack a molecular diagnosis, largely due to the vast number of undiscovered disease genes and our inability to assess the pathogenicity of most genomic variants. The CAGI SickKids challenges attempted to address this knowledge gap by assessing state-of-the-art methods for clinical phenotype prediction from genomes. CAGI4 and CAGI5 participants were provided with WGS data and clinical descriptions of 25 and 24 undiagnosed patients from the SickKids Genome Clinic Project, respectively. Predictors were asked to identify primary and secondary causal variants. In addition, for CAGI5, groups had to match each genome to one of three disorder categories (neurologic, ophthalmologic, and connective), and separately to each patient. The performance of matching genomes to categories was no better than random but two groups performed significantly better than chance in matching genomes to patients. Two of the ten variants proposed by two groups in CAGI4 were deemed to be diagnostic, and several proposed pathogenic variants in CAGI5 are good candidates for phenotype expansion. We discuss implications for improving in silico assessment of genomic variants and identifying new disease genes.
- Published
- 2019
7. Structural basis for vinculin activation at sites of cell adhesion
- Author
-
Bakolitsa, C
- Published
- 2004
8. Assessing computational predictions of the phenotypic effect of cystathionine-beta-synthase variants
- Author
-
Ayodeji Olatubosun, Dago F Dimster-Denk, Zhiqiang Hu, Pier Luigi Martelli, Mauno Vihinen, Olivier Lichtarge, Frederic Rousseau, Iddo Friedberg, Castrense Savojardo, Sean D. Mooney, Emanuela Leonardi, Greet De Baets, Manuel Giollo, Jouni Väliaho, Yana Bromberg, Rachel Karchin, Chen Cao, Janita Thusberg, Changhua Yu, Susanna Repo, Rita Casadio, David L. Masica, Laura Kasak, Emidio Capriotti, Jasper Rine, Gaurav Pandey, Silvio C. E. Tosatto, John Moult, Lipika R. Pal, Steven E. Brenner, Predrag Radivojac, Panagiotis Katsonis, Joost Schymkowitz, Joost Van Durme, Constantina Bakolitsa, Kasak L., Bakolitsa C., Hu Z., Yu C., Rine J., Dimster-Denk D.F., Pandey G., De Baets G., Bromberg Y., Cao C., Capriotti E., Casadio R., Van Durme J., Giollo M., Karchin R., Katsonis P., Leonardi E., Lichtarge O., Martelli P.L., Masica D., Mooney S.D., Olatubosun A., Radivojac P., Rousseau F., Pal L.R., Savojardo C., Schymkowitz J., Thusberg J., Tosatto S.C.E., Vihinen M., Valiaho J., Repo S., Moult J., Brenner S.E., and Friedberg I.
- Subjects
Homocysteine ,IMPACT ,ved/biology.organism_classification_rank.species ,Transsulfuration pathway ,chemistry.chemical_compound ,2.1 Biological and endogenous factors ,Single amino acid ,Aetiology ,Precision Medicine ,Genetics (clinical) ,Genetics & Heredity ,PROTEIN FUNCTION ,0303 health sciences ,biology ,030305 genetics & heredity ,CAGI challenge ,SNAP ,Phenotype ,machine learning ,Networking and Information Technology R&D (NITRD) ,phenotype prediction ,critical assessment ,Life Sciences & Biomedicine ,cystathionine-beta-synthase ,ENZYME ,Clinical Sciences ,Cystathionine beta-Synthase ,Homocystinuria ,Computational biology ,single amino acid substitution ,CLASSIFICATION ,Article ,03 medical and health sciences ,Cystathionine ,Genetics ,medicine ,Humans ,Model organism ,030304 developmental biology ,SERVER ,TOOLS ,Science & Technology ,MUTATIONS ,business.industry ,ved/biology ,Computational Biology ,medicine.disease ,Cystathionine beta synthase ,Good Health and Well Being ,chemistry ,Amino Acid Substitution ,biology.protein ,Generic health relevance ,Personalized medicine ,business ,PATHOGENICITY
- Abstract
Accurate prediction of the impact of genomic variation on phenotype is a major goal of computational biology and an important contributor to personalized medicine. Computational predictions can lead to a better understanding of the mechanisms underlying genetic diseases, including cancer, but their adoption requires thorough and unbiased assessment. Cystathionine-beta-synthase (CBS) is an enzyme that catalyzes the first step of the transsulfuration pathway, from homocysteine to cystathionine, and in which variations are associated with human hyperhomocysteinemia and homocystinuria. We have created a computational challenge under the CAGI framework to evaluate how well different methods can predict the phenotypic effect(s) of CBS single amino acid substitutions using a blinded experimental data set. CAGI participants were asked to predict yeast growth based on the identity of the mutations. The performance of the methods was evaluated using several metrics. The CBS challenge highlighted the difficulty of predicting the phenotype of an ex vivo system in a model organism when classification models were trained on human disease data. We also discuss the variations in difficulty of prediction for known benign and deleterious variants, as well as identify methodological and experimental constraints with lessons to be learned for future challenges.
Published in: Human Mutation, vol. 40, issue 9, pages 1530-1545 (location: United States; status: published)
- Published
- 2019
9. Evaluating predictors of kinase activity of STK11 variants identified in primary human non-small cell lung cancers.
- Author
-
Chen Y, Lee K, Woo J, Kim DW, Keum C, Babbi G, Casadio R, Martelli PL, Savojardo C, Manfredi M, Shen Y, Sun Y, Katsonis P, Lichtarge O, Pejaver V, Seward DJ, Kamandula A, Bakolitsa C, Brenner SE, Radivojac P, O'Donnell-Luria A, Mooney SD, and Jain S
- Abstract
Critical evaluation of computational tools for predicting variant effects is important considering their increased use in disease diagnosis and driving molecular discoveries. In the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, a dataset of 28 STK11 rare variants (27 missense, 1 single amino acid deletion), identified in primary non-small cell lung cancer biopsies, was experimentally assayed to characterize computational methods from four participating teams and five publicly available tools. Predictors demonstrated a high level of performance on key evaluation metrics, measuring correlation with the assay outputs and separating loss-of-function (LoF) variants from wildtype-like (WT-like) variants. The best participant model, 3Cnet, performed competitively with well-known tools. Unique to this challenge was that the functional data was generated with both biological and technical replicates, thus allowing the assessors to realistically establish maximum predictive performance based on experimental variability. Three out of the five publicly available tools and 3Cnet approached the performance of the assay replicates in separating LoF variants from WT-like variants. Surprisingly, REVEL, an often-used model, achieved a comparable correlation with the real-valued assay output as that seen for the experimental replicates. Performing variant interpretation by combining the new functional evidence with computational and population data evidence led to 16 new variants receiving a clinically actionable classification of likely pathogenic (LP) or likely benign (LB). Overall, the STK11 challenge highlights the utility of variant effect predictors in biomedical sciences and provides encouraging results for driving research in the field of computational genome interpretation. Competing Interests: None.
- Published
- 2024
10. Evaluation of enzyme activity predictions for variants of unknown significance in Arylsulfatase A.
- Author
-
Jain S, Trinidad M, Nguyen TB, Jones K, Neto SD, Ge F, Glagovsky A, Jones C, Moran G, Wang B, Rahimi K, Çalıcı SZ, Cedillo LR, Berardelli S, Özden B, Chen K, Katsonis P, Williams A, Lichtarge O, Rana S, Pradhan S, Srinivasan R, Sajeed R, Joshi D, Faraggi E, Jernigan R, Kloczkowski A, Xu J, Song Z, Özkan S, Padilla N, de la Cruz X, Acuna-Hidalgo R, Grafmüller A, Jiménez Barrón LT, Manfredi M, Savojardo C, Babbi G, Martelli PL, Casadio R, Sun Y, Zhu S, Shen Y, Pucci F, Rooman M, Cia G, Raimondi D, Hermans P, Kwee S, Chen E, Astore C, Kamandula A, Pejaver V, Ramola R, Velyunskiy M, Zeiberg D, Mishra R, Sterling T, Goldstein JL, Lugo-Martinez J, Kazi S, Li S, Long K, Brenner SE, Bakolitsa C, Radivojac P, Suhr D, Suhr T, and Clark WT
- Abstract
Continued advances in variant effect prediction are necessary to demonstrate the ability of machine learning methods to accurately determine the clinical impact of variants of unknown significance (VUS). Towards this goal, the ARSA Critical Assessment of Genome Interpretation (CAGI) challenge was designed to characterize progress by utilizing 219 experimentally assayed missense VUS in the Arylsulfatase A (ARSA) gene to assess the performance of community-submitted predictions of variant functional effects. The challenge involved 15 teams, and evaluated additional predictions from established and recently released models. Notably, a model developed by participants of a genetics and coding bootcamp, trained with standard machine-learning tools in Python, demonstrated superior performance among submissions. Furthermore, the study observed that state-of-the-art deep learning methods provided small but statistically significant improvement in predictive performance compared to less elaborate techniques. These findings underscore the utility of variant effect prediction, and the potential for models trained with modest resources to accurately classify VUS in genetic and clinical research. Competing Interests: Wyatt T. Clark, Marena Trinidad, Courtney Astore, Teague Sterling, and Sufyan Kazi are former employees and potential shareholders of BioMarin Pharmaceutical. Rocio Acuna-Hidalgo is a current employee and shareholder of Nostos Genomics GmbH. Andrea Grafmüller and Laura T. Jiménez Barrón are former employees of Nostos Genomics GmbH.
- Published
- 2024
11. Critical assessment of missense variant effect predictors on disease-relevant variant data.
- Author
-
Rastogi R, Chung R, Li S, Li C, Lee K, Woo J, Kim DW, Keum C, Babbi G, Martelli PL, Savojardo C, Casadio R, Chennen K, Weber T, Poch O, Ancien F, Cia G, Pucci F, Raimondi D, Vranken W, Rooman M, Marquet C, Olenyi T, Rost B, Andreoletti G, Kamandula A, Peng Y, Bakolitsa C, Mort M, Cooper DN, Bergquist T, Pejaver V, Liu X, Radivojac P, Brenner SE, and Ioannidis NM
- Abstract
Regular, systematic, and independent assessment of computational tools used to predict the pathogenicity of missense variants is necessary to evaluate their clinical and research utility and suggest directions for future improvement. Here, as part of the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, we assess missense variant effect predictors (or variant impact predictors) on an evaluation dataset of rare missense variants from disease-relevant databases. Our assessment evaluates predictors submitted to the CAGI6 Annotate-All-Missense challenge, predictors commonly used by the clinical genetics community, and recently developed deep learning methods for variant effect prediction. To explore a variety of settings that are relevant for different clinical and research applications, we assess performance within different subsets of the evaluation data and within high-specificity and high-sensitivity regimes. We find strong performance of many predictors across multiple settings. Meta-predictors tend to outperform their constituent individual predictors; however, several individual predictors have performance similar to that of commonly used meta-predictors. The relative performance of predictors differs in high-specificity and high-sensitivity regimes, suggesting that different methods may be best suited to different use cases. We also characterize two potential sources of bias. Predictors that incorporate allele frequency as a predictive feature tend to have reduced performance when distinguishing pathogenic variants from very rare benign variants, and predictors supervised on pathogenicity labels from curated variant databases often learn label imbalances within genes. 
Overall, we find notable advances over the oldest and most cited missense variant effect predictors and continued improvements among the most recently developed tools, and the CAGI Annotate-All-Missense challenge (also termed the Missense Marathon) will continue to assess state-of-the-art methods as the field progresses. Together, our results help illuminate the current clinical and research utility of missense variant effect predictors and identify potential areas for future development.
- Published
- 2024
12. Critical assessment of variant prioritization methods for rare disease diagnosis within the rare genomes project.
- Author
-
Stenton SL, O'Leary MC, Lemire G, VanNoy GE, DiTroia S, Ganesh VS, Groopman E, O'Heir E, Mangilog B, Osei-Owusu I, Pais LS, Serrano J, Singer-Berk M, Weisburd B, Wilson MW, Austin-Tse C, Abdelhakim M, Althagafi A, Babbi G, Bellazzi R, Bovo S, Carta MG, Casadio R, Coenen PJ, De Paoli F, Floris M, Gajapathy M, Hoehndorf R, Jacobsen JOB, Joseph T, Kamandula A, Katsonis P, Kint C, Lichtarge O, Limongelli I, Lu Y, Magni P, Mamidi TKK, Martelli PL, Mulargia M, Nicora G, Nykamp K, Pejaver V, Peng Y, Pham THC, Podda MS, Rao A, Rizzo E, Saipradeep VG, Savojardo C, Schols P, Shen Y, Sivadasan N, Smedley D, Soru D, Srinivasan R, Sun Y, Sunderam U, Tan W, Tiwari N, Wang X, Wang Y, Williams A, Worthey EA, Yin R, You Y, Zeiberg D, Zucca S, Bakolitsa C, Brenner SE, Fullerton SM, Radivojac P, Rehm HL, and O'Donnell-Luria A
- Subjects
- Humans, Genome, Human genetics, Genetic Variation genetics, Computational Biology methods, Phenotype, Rare Diseases genetics, Rare Diseases diagnosis
- Abstract
Background: A major obstacle faced by families with rare diseases is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years and causal variants are identified in under 50%, even when capturing variants genome-wide. To aid in the interpretation and prioritization of the vast number of variants detected, computational methods are proliferating. Knowing which tools are most effective remains unclear. To evaluate the performance of computational methods, and to encourage innovation in method development, we designed a Critical Assessment of Genome Interpretation (CAGI) community challenge to place variant prioritization models head-to-head in a real-life clinical diagnostic setting. Methods: We utilized genome sequencing (GS) data from families sequenced in the Rare Genomes Project (RGP), a direct-to-participant research study on the utility of GS for rare disease diagnosis and gene discovery. Challenge predictors were provided with a dataset of variant calls and phenotype terms from 175 RGP individuals (65 families), including 35 solved training set families with causal variants specified, and 30 unlabeled test set families (14 solved, 16 unsolved). We tasked teams to identify causal variants in as many families as possible. Predictors submitted variant predictions with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on the rank position of causal variants, and the maximum F-measure, based on precision and recall of causal variants across all EPCR values. Results: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performers recalled causal variants in up to 13 of 14 solved families within the top 5 ranked variants.
Newly discovered diagnostic variants were returned to two previously unsolved families following confirmatory RNA sequencing, and two novel disease gene candidates were entered into Matchmaker Exchange. In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant in an unsolved proband with phenotypes consistent with asparagine synthetase deficiency. Conclusions: Model methodology and performance was highly variable. Models weighing call quality, allele frequency, predicted deleteriousness, segregation, and phenotype were effective in identifying causal variants, and models open to phenotype expansion and non-coding variants were able to capture more difficult diagnoses and discover new diagnoses. Overall, computational models can significantly aid variant prioritization. For use in diagnostics, detailed review and conservative assessment of prioritized variants against established criteria is needed. (© 2024. The Author(s).)
- Published
- 2024
13. Editorial: Computational and experimental protein variant interpretation in the era of precision medicine.
- Author
-
Sanavia T, Turina P, Morante S, Consalvi V, Lesk AM, Bakolitsa C, and Dell'Orco D
- Abstract
Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.
- Published
- 2024
14. Predicting the impact of rare variants on RNA splicing in CAGI6.
- Author
-
Lord J, Oquendo CJ, Wai HA, Douglas AGL, Bunyan DJ, Wang Y, Hu Z, Zeng Z, Danis D, Katsonis P, Williams A, Lichtarge O, Chang Y, Bagnall RD, Mount SM, Matthiasardottir B, Lin C, Hansen TVO, Leman R, Martins A, Houdayer C, Krieger S, Bakolitsa C, Peng Y, Kamandula A, Radivojac P, and Baralle D
- Abstract
Variants which disrupt splicing are a frequent cause of rare disease that have been under-ascertained clinically. Accurate and efficient methods to predict a variant's impact on splicing are needed to interpret the growing number of variants of unknown significance (VUS) identified by exome and genome sequencing. Here, we present the results of the CAGI6 Splicing VUS challenge, which invited predictions of the splicing impact of 56 variants ascertained clinically and functionally validated to determine splicing impact. The performance of 12 prediction methods, along with SpliceAI and CADD, was compared on the 56 functionally validated variants. The maximum accuracy achieved was 82% from two different approaches, one weighting SpliceAI scores by minor allele frequency, and one applying the recently published Splicing Prediction Pipeline (SPiP). SPiP performed optimally in terms of sensitivity, while an ensemble method combining multiple prediction tools and information from databases exceeded all others for specificity. Several challenge methods equalled or exceeded the performance of SpliceAI, with ultimate choice of prediction method likely to depend on experimental or clinical aims. One quarter of the variants were incorrectly predicted by at least 50% of the methods, highlighting the need for further improvements to splicing prediction methods for successful clinical application. (© 2024. The Author(s).)
- Published
- 2024
15. Critical assessment of variant prioritization methods for rare disease diagnosis within the Rare Genomes Project.
- Author
-
Stenton SL, O'Leary M, Lemire G, VanNoy GE, DiTroia S, Ganesh VS, Groopman E, O'Heir E, Mangilog B, Osei-Owusu I, Pais LS, Serrano J, Singer-Berk M, Weisburd B, Wilson M, Austin-Tse C, Abdelhakim M, Althagafi A, Babbi G, Bellazzi R, Bovo S, Carta MG, Casadio R, Coenen PJ, De Paoli F, Floris M, Gajapathy M, Hoehndorf R, Jacobsen JOB, Joseph T, Kamandula A, Katsonis P, Kint C, Lichtarge O, Limongelli I, Lu Y, Magni P, Mamidi TKK, Martelli PL, Mulargia M, Nicora G, Nykamp K, Pejaver V, Peng Y, Pham THC, Podda MS, Rao A, Rizzo E, Saipradeep VG, Savojardo C, Schols P, Shen Y, Sivadasan N, Smedley D, Soru D, Srinivasan R, Sun Y, Sunderam U, Tan W, Tiwari N, Wang X, Wang Y, Williams A, Worthey EA, Yin R, You Y, Zeiberg D, Zucca S, Bakolitsa C, Brenner SE, Fullerton SM, Radivojac P, Rehm HL, and O'Donnell-Luria A
- Abstract
Background: A major obstacle faced by rare disease families is obtaining a genetic diagnosis. The average "diagnostic odyssey" lasts over five years, and causal variants are identified in under 50%. The Rare Genomes Project (RGP) is a direct-to-participant research study on the utility of genome sequencing (GS) for diagnosis and gene discovery. Families are consented for sharing of sequence and phenotype data with researchers, allowing development of a Critical Assessment of Genome Interpretation (CAGI) community challenge, placing variant prioritization models head-to-head in a real-life clinical diagnostic setting. Methods: Predictors were provided a dataset of phenotype terms and variant calls from GS of 175 RGP individuals (65 families), including 35 solved training set families, with causal variants specified, and 30 test set families (14 solved, 16 unsolved). The challenge tasked teams with identifying the causal variants in as many test set families as possible. Ranked variant predictions were submitted with estimated probability of causal relationship (EPCR) values. Model performance was determined by two metrics, a weighted score based on rank position of true positive causal variants and maximum F-measure, based on precision and recall of causal variants across EPCR thresholds. Results: Sixteen teams submitted predictions from 52 models, some with manual review incorporated. Top performing teams recalled the causal variants in up to 13 of 14 solved families by prioritizing high quality variant calls that were rare, predicted deleterious, segregating correctly, and consistent with reported phenotype. In unsolved families, newly discovered diagnostic variants were returned to two families following confirmatory RNA sequencing, and two prioritized novel disease gene candidates were entered into Matchmaker Exchange.
In one example, RNA sequencing demonstrated aberrant splicing due to a deep intronic indel in ASNS, identified in trans with a frameshift variant, in an unsolved proband with phenotype overlap with asparagine synthetase deficiency. Conclusions: By objective assessment of variant predictions, we provide insights into current state-of-the-art algorithms and platforms for genome sequencing analysis for rare disease diagnosis and explore areas for future optimization. Identification of diagnostic variants in unsolved families promotes synergy between researchers with clinical and computational expertise as a means of advancing the field of clinical genome interpretation. Competing Interests: Authors S.Z., I.L., E.R., P.M., and R.B. own shares of enGenome srl. Authors F.D.P. and G.N. are employees of enGenome srl. Authors T.J., R.S., S.G.V., N.S., A.R., U.S., and N.T. are employees of TCS Ltd. Authors P.J.C., C.K., K.N., and P.S. are employees of Invitae Ltd. H.L.R. receives support from Illumina and Microsoft for rare disease gene discovery and diagnosis. A.O’D-L. is a member of the scientific advisory board for Congenica Inc and the Simons Foundation SPARK for Autism study and co-chairs the clinical advisory board for CAGI. S.E.B. receives support at UC Berkeley from a research agreement from TCS. All other authors report no competing interests.
- Published
- 2023
16. Assessment of predicted enzymatic activity of α-N-acetylglucosaminidase variants of unknown significance for CAGI 2016.
- Author
-
Clark WT, Kasak L, Bakolitsa C, Hu Z, Andreoletti G, Babbi G, Bromberg Y, Casadio R, Dunbrack R, Folkman L, Ford CT, Jones D, Katsonis P, Kundu K, Lichtarge O, Martelli PL, Mooney SD, Nodzak C, Pal LR, Radivojac P, Savojardo C, Shi X, Zhou Y, Uppal A, Xu Q, Yin Y, Pejaver V, Wang M, Wei L, Moult J, Yu GK, Brenner SE, and LeBowitz JH
- Subjects
- Acetylglucosaminidase genetics, Humans, Models, Genetic, Regression Analysis, Acetylglucosaminidase metabolism, Computational Biology methods, Mutation, Missense
- Abstract
The NAGLU challenge of the fourth edition of the Critical Assessment of Genome Interpretation experiment (CAGI4) in 2016 invited participants to predict the impact of variants of unknown significance (VUS) on the enzymatic activity of the lysosomal hydrolase α-N-acetylglucosaminidase (NAGLU). Deficiencies in NAGLU activity lead to a rare, monogenic, recessive lysosomal storage disorder, Sanfilippo syndrome type B (MPS type IIIB). This challenge attracted 17 submissions from 10 groups. We observed that top models were able to predict the impact of missense mutations on enzymatic activity with Pearson's correlation coefficients of up to 0.61. We also observed that top methods were significantly more correlated with each other than they were with observed enzymatic activity values, which we believe speaks to the importance of sequence conservation across the different methods. Improved functional predictions on the VUS will help population-scale analysis of disease epidemiology and rare variant association analysis. (© 2019 Wiley Periodicals, Inc.)
- Published
- 2019
17. CAGI SickKids challenges: Assessment of phenotype and variant predictions derived from clinical and genomic data of children with undiagnosed diseases.
- Author
-
Kasak L, Hunter JM, Udani R, Bakolitsa C, Hu Z, Adhikari AN, Babbi G, Casadio R, Gough J, Guerrero RF, Jiang Y, Joseph T, Katsonis P, Kotte S, Kundu K, Lichtarge O, Martelli PL, Mooney SD, Moult J, Pal LR, Poitras J, Radivojac P, Rao A, Sivadasan N, Sunderam U, Saipradeep VG, Yin Y, Zaucha J, Brenner SE, and Meyn MS
- Subjects
- Adolescent, Child, Child, Preschool, Computer Simulation, Databases, Genetic, Female, Genetic Predisposition to Disease, Humans, Male, Phenotype, Undiagnosed Diseases genetics, Whole Genome Sequencing, Computational Biology methods, Genetic Variation, Undiagnosed Diseases diagnosis
- Abstract
Whole-genome sequencing (WGS) holds great potential as a diagnostic test. However, the majority of patients currently undergoing WGS lack a molecular diagnosis, largely due to the vast number of undiscovered disease genes and our inability to assess the pathogenicity of most genomic variants. The CAGI SickKids challenges attempted to address this knowledge gap by assessing state-of-the-art methods for clinical phenotype prediction from genomes. CAGI4 and CAGI5 participants were provided with WGS data and clinical descriptions of 25 and 24 undiagnosed patients from the SickKids Genome Clinic Project, respectively. Predictors were asked to identify primary and secondary causal variants. In addition, for CAGI5, groups had to match each genome to one of three disorder categories (neurologic, ophthalmologic, and connective), and separately to each patient. The performance of matching genomes to categories was no better than random but two groups performed significantly better than chance in matching genomes to patients. Two of the ten variants proposed by two groups in CAGI4 were deemed to be diagnostic, and several proposed pathogenic variants in CAGI5 are good candidates for phenotype expansion. We discuss implications for improving in silico assessment of genomic variants and identifying new disease genes. (© 2019 The Authors. Human Mutation published by Wiley Periodicals, Inc.)
- Published
- 2019
18. Assessing computational predictions of the phenotypic effect of cystathionine-beta-synthase variants.
- Author
-
Kasak L, Bakolitsa C, Hu Z, Yu C, Rine J, Dimster-Denk DF, Pandey G, De Baets G, Bromberg Y, Cao C, Capriotti E, Casadio R, Van Durme J, Giollo M, Karchin R, Katsonis P, Leonardi E, Lichtarge O, Martelli PL, Masica D, Mooney SD, Olatubosun A, Radivojac P, Rousseau F, Pal LR, Savojardo C, Schymkowitz J, Thusberg J, Tosatto SCE, Vihinen M, Väliaho J, Repo S, Moult J, Brenner SE, and Friedberg I
- Subjects
- Cystathionine metabolism, Cystathionine beta-Synthase metabolism, Homocysteine metabolism, Humans, Phenotype, Precision Medicine, Amino Acid Substitution, Computational Biology methods, Cystathionine beta-Synthase genetics
- Abstract
Accurate prediction of the impact of genomic variation on phenotype is a major goal of computational biology and an important contributor to personalized medicine. Computational predictions can lead to a better understanding of the mechanisms underlying genetic diseases, including cancer, but their adoption requires thorough and unbiased assessment. Cystathionine-beta-synthase (CBS) is an enzyme that catalyzes the first step of the transsulfuration pathway, from homocysteine to cystathionine, and in which variations are associated with human hyperhomocysteinemia and homocystinuria. We have created a computational challenge under the CAGI framework to evaluate how well different methods can predict the phenotypic effect(s) of CBS single amino acid substitutions using a blinded experimental data set. CAGI participants were asked to predict yeast growth based on the identity of the mutations. The performance of the methods was evaluated using several metrics. The CBS challenge highlighted the difficulty of predicting the phenotype of an ex vivo system in a model organism when classification models were trained on human disease data. We also discuss the variations in difficulty of prediction for known benign and deleterious variants, as well as identify methodological and experimental constraints with lessons to be learned for future challenges. (© 2019 Wiley Periodicals, Inc.)
- Published
- 2019
19. LUD, a new protein domain associated with lactate utilization.
- Author
-
Hwang WC, Bakolitsa C, Punta M, Coggill PC, Bateman A, Axelrod HL, Rawlings ND, Sedova M, Peterson SN, Eberhardt RY, Aravind L, Pascual J, and Godzik A
- Subjects
- Amino Acid Sequence, Bacterial Proteins chemistry, Bacterial Proteins genetics, Crystallography, X-Ray, Deinococcus genetics, Humans, Microbiota radiation effects, Molecular Sequence Data, Protein Structure, Tertiary, Bacterial Proteins metabolism, Deinococcus chemistry, Deinococcus metabolism, Lactic Acid metabolism
- Abstract
Background: A novel highly conserved protein domain, DUF162 [Pfam: PF02589], can be mapped to two proteins: LutB and LutC. Both proteins are encoded by a highly conserved LutABC operon, which has been implicated in lactate utilization in bacteria. Based on our analysis of its sequence, structure, and recent experimental evidence reported by other groups, we hereby redefine DUF162 as the LUD domain family. Results: JCSG solved the first crystal structure [PDB:2G40] from the LUD domain family: LutC protein, encoded by ORF DR_1909, of Deinococcus radiodurans. LutC shares features with domains in the functionally diverse ISOCOT superfamily. We have observed that the LUD domain has an increased abundance in the human gut microbiome. Conclusions: We propose a model for the substrate and cofactor binding and regulation in LUD domain. The significance of LUD-containing proteins in the human gut microbiome, and the implication of lactate metabolism in the radiation-resistance of Deinococcus radiodurans are discussed.
- Published
- 2013
20. TOPSAN: a dynamic web database for structural genomics.
- Author
-
Ellrott K, Zmasek CM, Weekes D, Sri Krishna S, Bakolitsa C, Godzik A, and Wooley J
- Subjects
- Genomics, Proteins chemistry, Proteins genetics, User-Computer Interface, Databases, Protein, Protein Conformation
- Abstract
The Open Protein Structure Annotation Network (TOPSAN) is a web-based collaboration platform for exploring and annotating structures determined by structural genomics efforts. Characterization of those structures presents a challenge since the majority of the proteins themselves have not yet been characterized. Responding to this challenge, the TOPSAN platform facilitates collaborative annotation and investigation via a user-friendly web-based interface pre-populated with automatically generated information. Semantic web technologies expand and enrich TOPSAN's content through links to larger sets of related databases, and thus, enable data integration from disparate sources and data mining via conventional query languages. TOPSAN can be found at http://www.topsan.org.
- Published
- 2011
21. Correlated firing among major ganglion cell types in primate retina.
- Author
-
Greschner M, Shlens J, Bakolitsa C, Field GD, Gauthier JL, Jepson LH, Sher A, Litke AM, and Chichilnisky EJ
- Subjects
- Animals, Evoked Potentials, Macaca fascicularis, Macaca mulatta, Photic Stimulation, Synaptic Transmission, Time Factors, Retinal Cone Photoreceptor Cells physiology, Retinal Ganglion Cells physiology, Vision, Ocular, Visual Pathways physiology
- Abstract
Retinal ganglion cells exhibit substantial correlated firing: a tendency to fire nearly synchronously at rates different from those expected by chance. These correlations suggest that network interactions significantly shape the visual signal transmitted from the eye to the brain. This study describes the degree and structure of correlated firing among the major ganglion cell types in primate retina. Correlated firing among ON and OFF parasol, ON and OFF midget, and small bistratified cells, which together constitute roughly 75% of the input to higher visual areas, was studied using large-scale multi-electrode recordings. Correlated firing in the presence of constant, spatially uniform illumination exhibited characteristic strength, time course and polarity within and across cell types. Pairs of nearby cells with the same light response polarity were positively correlated; cells with the opposite polarity were negatively correlated. The strength of correlated firing declined systematically with distance for each cell type, in proportion to the degree of receptive field overlap. The pattern of correlated firing across cell types was similar at photopic and scotopic light levels, although additional slow correlations were present at scotopic light levels. Similar results were also observed in two other retinal ganglion cell types. Most of these observations are consistent with the hypothesis that shared noise from photoreceptors is the dominant cause of correlated firing. Surprisingly, small bistratified cells, which receive ON input from S cones, fired synchronously with ON parasol and midget cells, which receive ON input primarily from L and M cones. Collectively, these results provide an overview of correlated firing across cell types in the primate retina, and constraints on the underlying mechanisms.
- Published
- 2011
22. The crystal structure of a bacterial Sufu-like protein defines a novel group of bacterial proteins that are similar to the N-terminal domain of human Sufu.
- Author
-
Das D, Finn RD, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Cai X, Carlton D, Chen C, Chiu HJ, Chiu M, Clayton T, Deller MC, Duan L, Ellrott K, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Lam WW, Marciano D, Miller MD, Morse AT, Nigoghossian E, Nopakun A, Okach L, Puckett C, Reyes R, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Xu Q, Yeh A, Zhou J, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Crystallography, Humans, Models, Molecular, Molecular Sequence Annotation, Molecular Sequence Data, Protein Structure, Tertiary, Reproducibility of Results, Sequence Alignment, Sequence Analysis, Protein, Sequence Homology, Amino Acid, Static Electricity, Structural Homology, Protein, Bacterial Proteins chemistry, Neisseria gonorrhoeae chemistry, Repressor Proteins chemistry
- Abstract
Sufu (Suppressor of Fused), a two-domain protein, plays a critical role in regulating Hedgehog signaling and is conserved from flies to humans. A few bacterial Sufu-like proteins have previously been identified based on sequence similarity to the N-terminal domain of eukaryotic Sufu proteins, but none have been structurally or biochemically characterized and their function in bacteria is unknown. We have determined the crystal structure of a more distantly related Sufu-like homolog, NGO1391 from Neisseria gonorrhoeae, at 1.4 Å resolution, which provides the first biophysical characterization of a bacterial Sufu-like protein. The structure revealed a striking similarity to the N-terminal domain of human Sufu (r.m.s.d. of 2.6 Å over 93% of the NGO1391 protein), despite an extremely low sequence identity of ∼15%. Subsequent sequence analysis revealed that NGO1391 defines a new subset of smaller, Sufu-like proteins that are present in ∼200 bacterial species and has resulted in expansion of the SUFU (PF05076) family in Pfam.
- Published
- 2010
23. Structure of LP2179, the first representative of Pfam family PF08866, suggests a new fold with a role in amino-acid metabolism.
- Author
-
Bakolitsa C, Kumar A, Carlton D, Miller MD, Krishna SS, Abdubek P, Astakhova T, Axelrod HL, Chiu HJ, Clayton T, Deller MC, Duan L, Elsliger MA, Feuerhelm J, Grzechnik SK, Grant JC, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Marciano D, McMullan D, Morse AT, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Reyes R, Rife CL, Tien HJ, Trout CV, van den Bedem H, Weekes D, Xu Q, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Bacterial Proteins metabolism, Crystallography, X-Ray, Lactobacillus plantarum metabolism, Models, Molecular, Molecular Sequence Data, Protein Structure, Tertiary, Sequence Alignment, Structural Homology, Protein, Amino Acids metabolism, Bacterial Proteins chemistry, Lactobacillus plantarum chemistry, Protein Folding
- Abstract
The structure of LP2179, a member of the PF08866 (DUF1831) family, suggests a novel α+β fold comprising two β-sheets packed against a single helix. A remote structural similarity to two other uncharacterized protein families specific to the Bacillus genus (PF08868 and PF08968), as well as to prokaryotic S-adenosylmethionine decarboxylases, is consistent with a role in amino-acid metabolism. Genomic neighborhood analysis of LP2179 supports this functional assignment, which might also then be extended to PF08868 and PF08968.
- Published
- 2010
24. The structure of the first representative of Pfam family PF09836 reveals a two-domain organization and suggests involvement in transcriptional regulation.
- Author
-
Das D, Grishin NV, Kumar A, Carlton D, Bakolitsa C, Miller MD, Abdubek P, Astakhova T, Axelrod HL, Burra P, Chen C, Chiu HJ, Chiu M, Clayton T, Deller MC, Duan L, Ellrott K, Ernst D, Farr CL, Feuerhelm J, Grzechnik A, Grzechnik SK, Grant JC, Han GW, Jaroszewski L, Jin KK, Johnson HA, Klock HE, Knuth MW, Kozbial P, Krishna SS, Marciano D, McMullan D, Morse AT, Nigoghossian E, Nopakun A, Okach L, Oommachen S, Paulsen J, Puckett C, Reyes R, Rife CL, Sefcovic N, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Xu Q, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Bacterial Proteins genetics, Crystallography, X-Ray, Genome, Bacterial, Models, Molecular, Molecular Sequence Data, Neisseria gonorrhoeae genetics, Protein Structure, Quaternary, Protein Structure, Tertiary, Structural Homology, Protein, Bacterial Proteins chemistry, Gene Expression Regulation, Neisseria gonorrhoeae chemistry, Transcription, Genetic
- Abstract
Proteins with the DUF2063 domain constitute a new Pfam family, PF09836. The crystal structure of a member of this family, NGO1945 from Neisseria gonorrhoeae, has been determined and reveals that the N-terminal DUF2063 domain is likely to be a DNA-binding domain. In conjunction with the rest of the protein, NGO1945 is likely to be involved in transcriptional regulation, which is consistent with genomic neighborhood analysis. Of the 216 currently known proteins that contain a DUF2063 domain, the most significant sequence homologs of NGO1945 (∼40-99% sequence identity) are from various Neisseria and Haemophilus species. As these are important human pathogens, NGO1945 represents an interesting candidate for further exploration via biochemical studies and possible therapeutic intervention.
- Published
- 2010
25. Structure of the first representative of Pfam family PF04016 (DUF364) reveals enolase and Rossmann-like folds that combine to form a unique active site with a possible role in heavy-metal chelation.
- Author
-
Miller MD, Aravind L, Bakolitsa C, Rife CL, Carlton D, Abdubek P, Astakhova T, Axelrod HL, Chiu HJ, Clayton T, Deller MC, Duan L, Feuerhelm J, Grant JC, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Marciano D, McMullan D, Morse AT, Nigoghossian E, Okach L, Reyes R, van den Bedem H, Weekes D, Xu Q, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Bacterial Proteins metabolism, Catalytic Domain, Crystallography, X-Ray, Desulfitobacterium metabolism, Metals, Heavy metabolism, Models, Molecular, Molecular Sequence Data, Protein Binding, Protein Structure, Tertiary, Bacterial Proteins chemistry, Desulfitobacterium chemistry, Metals, Heavy chemistry, Phosphopyruvate Hydratase chemistry, Protein Folding
- Abstract
The crystal structure of Dhaf4260 from Desulfitobacterium hafniense DCB-2 was determined by single-wavelength anomalous diffraction (SAD) to a resolution of 2.01 Å using the semi-automated high-throughput pipeline of the Joint Center for Structural Genomics (JCSG) as part of the NIGMS Protein Structure Initiative (PSI). This protein structure is the first representative of the PF04016 (DUF364) Pfam family and reveals a novel combination of two well known domains (an enolase N-terminal-like fold followed by a Rossmann-like domain). Structural and bioinformatic analyses reveal partial similarities to Rossmann-like methyltransferases, with residues from the enolase-like fold combining to form a unique active site that is likely to be involved in the condensation or hydrolysis of molecules implicated in the synthesis of flavins, pterins or other siderophores. The genome context of Dhaf4260 and homologs additionally supports a role in heavy-metal chelation.
- Published
- 2010
26. The structure of KPN03535 (gi|152972051), a novel putative lipoprotein from Klebsiella pneumoniae, reveals an OB-fold.
- Author
-
Das D, Kozbial P, Han GW, Carlton D, Jaroszewski L, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Chen C, Chiu HJ, Chiu M, Clayton T, Deller MC, Duan L, Ellrott K, Elsliger MA, Ernst D, Farr CL, Feuerhelm J, Grzechnik A, Grant JC, Jin KK, Johnson HA, Klock HE, Knuth MW, Krishna SS, Kumar A, Marciano D, McMullan D, Miller MD, Morse AT, Nigoghossian E, Nopakun A, Okach L, Oommachen S, Paulsen J, Puckett C, Reyes R, Rife CL, Sefcovic N, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Xu Q, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Crystallography, X-Ray, Models, Molecular, Molecular Sequence Data, Protein Folding, Protein Structure, Tertiary, Bacterial Proteins chemistry, Klebsiella pneumoniae chemistry, Lipoproteins chemistry
- Abstract
KPN03535 (gi|152972051) is a putative lipoprotein of unknown function that is secreted by Klebsiella pneumoniae MGH 78578. The crystal structure reveals that despite a lack of any detectable sequence similarity to known structures, it is a novel variant of the OB-fold and structurally similar to the bacterial Cpx-pathway protein NlpE, single-stranded DNA-binding (SSB) proteins and toxins. K. pneumoniae MGH 78578 forms part of the normal human skin, mouth and gut flora and is an opportunistic pathogen that is linked to about 8% of all hospital-acquired infections in the USA. This structure provides the foundation for further investigations into this divergent member of the OB-fold family.
- Published
- 2010
27. Structure of the first representative of Pfam family PF09410 (DUF2006) reveals a structural signature of the calycin superfamily that suggests a role in lipid metabolism.
- Author
-
Chiu HJ, Bakolitsa C, Skerra A, Lomize A, Carlton D, Miller MD, Krishna SS, Abdubek P, Astakhova T, Axelrod HL, Clayton T, Deller MC, Duan L, Feuerhelm J, Grant JC, Grzechnik SK, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Kumar A, Marciano D, McMullan D, Morse AT, Nigoghossian E, Okach L, Paulsen J, Reyes R, Rife CL, van den Bedem H, Weekes D, Xu Q, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Crystallography, X-Ray, Models, Molecular, Molecular Sequence Data, Nitrosomonas europaea metabolism, Oxidative Stress, Protein Structure, Tertiary, Sequence Alignment, Sequence Homology, Amino Acid, Bacterial Proteins chemistry, Databases, Genetic, Lipid Metabolism, Nitrosomonas europaea chemistry
- Abstract
The first structural representative of the domain of unknown function DUF2006 family, also known as Pfam family PF09410, comprises a lipocalin-like fold with domain duplication. The finding of the calycin signature in the N-terminal domain, combined with remote sequence similarity to two other protein families (PF07143 and PF08622) implicated in isoprenoid metabolism and the oxidative stress response, support an involvement in lipid metabolism. Clusters of conserved residues that interact with ligand mimetics suggest that the binding and regulation sites map to the N-terminal domain and to the interdomain interface, respectively.
- Published
- 2010
28. The structure of the first representative of Pfam family PF06475 reveals a new fold with possible involvement in glycolipid metabolism.
- Author
-
Bakolitsa C, Kumar A, McMullan D, Krishna SS, Miller MD, Carlton D, Najmanovich R, Abdubek P, Astakhova T, Chiu HJ, Clayton T, Deller MC, Duan L, Elias Y, Feuerhelm J, Grant JC, Grzechnik SK, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Marciano D, Morse AT, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Reyes R, Rife CL, Trout CV, van den Bedem H, Weekes D, White A, Xu Q, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Bacterial Proteins genetics, Bacterial Proteins metabolism, Crystallography, X-Ray, Genome, Bacterial, Models, Molecular, Molecular Sequence Data, Protein Structure, Quaternary, Protein Structure, Tertiary, Pseudomonas aeruginosa genetics, Pseudomonas aeruginosa metabolism, Bacterial Proteins chemistry, Glycolipids metabolism, Protein Folding, Pseudomonas aeruginosa chemistry
- Abstract
The crystal structure of PA1994 from Pseudomonas aeruginosa, a member of the Pfam PF06475 family classified as a domain of unknown function (DUF1089), reveals a novel fold comprising a 15-stranded β-sheet wrapped around a single α-helix that assembles into a tight dimeric arrangement. The remote structural similarity to lipoprotein localization factors, in addition to the presence of an acidic pocket that is conserved in DUF1089 homologs, phospholipid-binding and sugar-binding proteins, indicate a role for PA1994 and the DUF1089 family in glycolipid metabolism. Genome-context analysis lends further support to the involvement of this family of proteins in glycolipid metabolism and indicates possible activation of DUF1089 homologs under conditions of bacterial cell-wall stress or host-pathogen interactions.
- Published
- 2010
29. Structure of BT_3984, a member of the SusD/RagB family of nutrient-binding molecules.
- Author
-
Bakolitsa C, Xu Q, Rife CL, Abdubek P, Astakhova T, Axelrod HL, Carlton D, Chen C, Chiu HJ, Clayton T, Das D, Deller MC, Duan L, Ellrott K, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Lam WW, Marciano D, McMullan D, Miller MD, Morse AT, Nigoghossian E, Nopakun A, Okach L, Puckett C, Reyes R, Tien HJ, Trame CB, van den Bedem H, Weekes D, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Crystallography, X-Ray, Models, Molecular, Molecular Sequence Data, Protein Structure, Tertiary, Structural Homology, Protein, Bacterial Proteins chemistry, Bacteroides chemistry
- Abstract
The crystal structure of the Bacteroides thetaiotaomicron protein BT_3984 was determined to a resolution of 1.7 Å and was the first structure to be determined from the extensive SusD family of polysaccharide-binding proteins. SusD is an essential component of the sus operon that defines the paradigm for glycan utilization in dominant members of the human gut microbiota. Structural analysis of BT_3984 revealed an N-terminal region containing several tetratricopeptide repeats (TPRs), while the signature C-terminal region is less structured and contains extensive loop regions. Sequence and structure analysis of BT_3984 suggests the presence of binding interfaces for other proteins from the polysaccharide-utilization complex.
- Published
- 2010
30. TOPSAN: use of a collaborative environment for annotating, analyzing and disseminating data on JCSG and PSI structures.
- Author
-
Krishna SS, Weekes D, Bakolitsa C, Elsliger MA, Wilson IA, Godzik A, and Wooley J
- Subjects
- Genomics, Humans, Internet, Protein Conformation, Databases, Genetic
- Abstract
The NIH Protein Structure Initiative centers, such as the Joint Center for Structural Genomics (JCSG), have developed highly efficient technological platforms that are capable of experimentally determining the three-dimensional structures of hundreds of proteins per year. However, the overwhelming majority of the almost 5000 protein structures determined by these centers have yet to be described in the peer-reviewed literature. In a high-throughput structural genomics environment, the process of structure determination occurs independently of any associated experimental characterization of function, which creates a challenge for the annotation and analysis of structures and the publication of these results. This challenge has been addressed by developing TOPSAN ('The Open Protein Structure Annotation Network'), which enables the generation of knowledge via collaborations among globally distributed contributors supported by automated amalgamation of available information. TOPSAN currently provides annotations for all protein structures determined by the JCSG in addition to preliminary annotations on a large number of structures from the other PSI production centers. TOPSAN-enabled collaborations have resulted in insightful structure-function analysis for many proteins and have led to numerous peer-reviewed publications, as exemplified by the articles included in this issue of Acta Crystallographica Section F.
- Published
- 2010
31. Structure of Bacteroides thetaiotaomicron BT2081 at 2.05 Å resolution: the first structural representative of a new protein family that may play a role in carbohydrate metabolism.
- Author
-
Yeh AP, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Cai X, Carlton D, Chen C, Chiu HJ, Chiu M, Clayton T, Das D, Deller MC, Duan L, Ellrott K, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Lam WW, Marciano D, McMullan D, Miller MD, Morse AT, Nigoghossian E, Nopakun A, Okach L, Puckett C, Reyes R, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Xu Q, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Bacterial Proteins metabolism, Bacteroides metabolism, Binding Sites, Carbohydrates chemistry, Crystallography, X-Ray, Models, Molecular, Molecular Sequence Data, Protein Structure, Tertiary, Sequence Alignment, Structural Homology, Protein, Bacterial Proteins chemistry, Bacteroides chemistry, Carbohydrate Metabolism
- Abstract
BT2081 from Bacteroides thetaiotaomicron (GenBank accession code NP_810994.1) is a member of a novel protein family consisting of over 160 members, most of which are found in the different classes of Bacteroidetes. Genome-context analysis lends support to the involvement of this family in carbohydrate metabolism, which plays a key role in B. thetaiotaomicron as a predominant bacterial symbiont in the human distal gut microbiome. The crystal structure of BT2081 at 2.05 Å resolution represents the first structure from this new protein family. BT2081 consists of an N-terminal domain, which adopts a β-sandwich immunoglobulin-like fold, and a larger C-terminal domain with a β-sandwich jelly-roll fold. Structural analyses reveal that both domains are similar to those found in various carbohydrate-active enzymes. The C-terminal β-jelly-roll domain contains a potential carbohydrate-binding site that is highly conserved among BT2081 homologs and is situated in the same location as the carbohydrate-binding sites that are found in structurally similar glycoside hydrolases (GHs). However, in BT2081 this site is partially occluded by surrounding loops, which results in a deep solvent-accessible pocket rather than a shallower solvent-exposed cleft.
- Published
- 2010
32. Structures of the first representatives of Pfam family PF06684 (DUF1185) reveal a novel variant of the Bacillus chorismate mutase fold and suggest a role in amino-acid metabolism.
- Author
-
Bakolitsa C, Kumar A, Jin KK, McMullan D, Krishna SS, Miller MD, Abdubek P, Acosta C, Astakhova T, Axelrod HL, Burra P, Carlton D, Chen C, Chiu HJ, Clayton T, Das D, Deller MC, Duan L, Elias Y, Ellrott K, Ernst D, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Grzechnik SK, Han GW, Jaroszewski L, Johnson HA, Klock HE, Knuth MW, Kozbial P, Marciano D, Morse AT, Murphy KD, Nigoghossian E, Nopakun A, Okach L, Paulsen J, Puckett C, Reyes R, Rife CL, Sefcovic N, Tien HJ, Trame CB, Trout CV, van den Bedem H, Weekes D, White A, Xu Q, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Bacillus enzymology, Chorismate Mutase metabolism, Crystallography, X-Ray, Models, Molecular, Molecular Sequence Data, Protein Structure, Quaternary, Protein Structure, Tertiary, Structural Homology, Protein, Amino Acids metabolism, Bordetella bronchiseptica enzymology, Chorismate Mutase chemistry, Protein Folding, Rhodobacteraceae enzymology
- Abstract
The crystal structures of BB2672 and SPO0826 were determined to resolutions of 1.7 and 2.1 Å by single-wavelength anomalous dispersion and multiple-wavelength anomalous dispersion, respectively, using the semi-automated high-throughput pipeline of the Joint Center for Structural Genomics (JCSG) as part of the NIGMS Protein Structure Initiative (PSI). These proteins are the first structural representatives of the PF06684 (DUF1185) Pfam family. Structural analysis revealed that both structures adopt a variant of the Bacillus chorismate mutase fold (BCM). The biological unit of both proteins is a hexamer and analysis of homologs indicates that the oligomer interface residues are highly conserved. The conformation of the critical regions for oligomerization appears to be dependent on pH or salt concentration, suggesting that this protein might be subject to environmental regulation. Structural similarities to BCM and genome-context analysis suggest a function in amino-acid synthesis.
- Published
- 2010
33. The structure of BVU2987 from Bacteroides vulgatus reveals a superfamily of bacterial periplasmic proteins with possible inhibitory function.
- Author
-
Das D, Finn RD, Carlton D, Miller MD, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Chen C, Chiu HJ, Chiu M, Clayton T, Deller MC, Duan L, Ellrott K, Ernst D, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Marciano D, McMullan D, Morse AT, Nigoghossian E, Nopakun A, Okach L, Puckett C, Reyes R, Rife CL, Sefcovic N, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Xu Q, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Bacteroides metabolism, Conserved Sequence, Crystallography, X-Ray, Models, Molecular, Molecular Sequence Data, Periplasmic Proteins metabolism, Protein Binding, Protein Structure, Tertiary, Sequence Alignment, Structural Homology, Protein, Bacteroides chemistry, Periplasmic Proteins chemistry
- Abstract
Proteins that contain the DUF2874 domain constitute a new Pfam family PF11396. Members of this family have predominantly been identified in microbes found in the human gut and oral cavity. The crystal structure of one member of this family, BVU2987 from Bacteroides vulgatus, has been determined, revealing a β-lactamase inhibitor protein-like structure with a tandem repeat of domains. Sequence analysis and structural comparisons reveal that BVU2987 and other DUF2874 proteins are related to β-lactamase inhibitor protein, PepSY and SmpA_OmlA proteins and hence are likely to function as inhibitory proteins.
- Published
- 2010
34. Structures of three members of Pfam PF02663 (FmdE) implicated in microbial methanogenesis reveal a conserved α+β core domain and an auxiliary C-terminal treble-clef zinc finger.
- Author
-
Axelrod HL, Das D, Abdubek P, Astakhova T, Bakolitsa C, Carlton D, Chen C, Chiu HJ, Clayton T, Deller MC, Duan L, Ellrott K, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Lam WW, Marciano D, McMullan D, Miller MD, Morse AT, Nigoghossian E, Nopakun A, Okach L, Puckett C, Reyes R, Sefcovic N, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Xu Q, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Crystallography, X-Ray, Models, Molecular, Molecular Sequence Data, Protein Structure, Secondary, Protein Structure, Tertiary, Structural Homology, Protein, Aldehyde Oxidoreductases chemistry, Desulfitobacterium enzymology, Methane biosynthesis, Zinc Fingers
- Abstract
Examination of the genomic context for members of the FmdE Pfam family (PF02663), such as the protein encoded by the fmdE gene from the methanogenic archaeon Methanobacterium thermoautotrophicum, indicates that 13 of them are co-transcribed with genes encoding subunits of molybdenum formylmethanofuran dehydrogenase (EC 1.2.99.5), an enzyme that is involved in microbial methane production. Here, the first crystal structures from PF02663 are described, representing two bacterial and one archaeal species: B8FYU2_DESHY from the anaerobic dehalogenating bacterium Desulfitobacterium hafniense DCB-2, Q2LQ23_SYNAS from the syntrophic bacterium Syntrophus aciditrophicus SB and Q9HJ63_THEAC from the thermoacidophilic archaeon Thermoplasma acidophilum. Two of these proteins, Q9HJ63_THEAC and Q2LQ23_SYNAS, contain two domains: an N-terminal thioredoxin-like α+β core domain (NTD) consisting of a five-stranded, mixed β-sheet flanked by several α-helices and a C-terminal zinc-finger domain (CTD). B8FYU2_DESHY, on the other hand, is composed solely of the NTD. The CTD of Q9HJ63_THEAC and Q2LQ23_SYNAS is best characterized as a treble-clef zinc finger. Two significant structural differences between Q9HJ63_THEAC and Q2LQ23_SYNAS involve their metal binding. First, zinc is bound to the putative active site on the NTD of Q9HJ63_THEAC, but is absent from the NTD of Q2LQ23_SYNAS. Second, whereas the structure of the CTD of Q2LQ23_SYNAS shows four Cys side chains within coordination distance of the Zn atom, the structure of Q9HJ63_THEAC is atypical for a treble-clef zinc finger in that three Cys side chains and an Asp side chain are within coordination distance of the zinc.
- Published
- 2010
- Full Text
- View/download PDF
35. Structure of a membrane-attack complex/perforin (MACPF) family protein from the human gut symbiont Bacteroides thetaiotaomicron.
- Author
-
Xu Q, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Cai X, Carlton D, Chen C, Chiu HJ, Clayton T, Das D, Deller MC, Duan L, Ellrott K, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Lam WW, Marciano D, Miller MD, Morse AT, Nigoghossian E, Nopakun A, Okach L, Puckett C, Reyes R, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Yeh A, Zhou J, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Crystallography, X-Ray, Models, Molecular, Molecular Sequence Data, Protein Structure, Tertiary, Sequence Alignment, Structural Homology, Protein, Bacterial Proteins chemistry, Bacteroides chemistry, Perforin chemistry
- Abstract
Membrane-attack complex/perforin (MACPF) proteins are transmembrane pore-forming proteins that are important in both human immunity and the virulence of pathogens. Bacterial MACPFs are found in diverse bacterial species, including most human gut-associated Bacteroides species. The crystal structure of a bacterial MACPF-domain-containing protein BT_3439 (Bth-MACPF) from B. thetaiotaomicron, a predominant member of the mammalian intestinal microbiota, has been determined. Bth-MACPF contains a membrane-attack complex/perforin (MACPF) domain and two novel C-terminal domains that resemble ribonuclease H and interleukin 8, respectively. The entire protein adopts a flat crescent shape, characteristic of other MACPF proteins, that may be important for oligomerization. This Bth-MACPF structure provides new features and insights not observed in two previous MACPF structures. Genomic context analysis indicates that Bth-MACPF may be involved in a novel protein-transport or nutrient-uptake system, suggesting an important role for these MACPF proteins, which were likely to have been inherited from eukaryotes via horizontal gene transfer, in the adaptation of commensal bacteria to the host environment.
- Published
- 2010
- Full Text
- View/download PDF
36. Structures of the first representatives of Pfam family PF06938 (DUF1285) reveal a new fold with repeated structural motifs and possible involvement in signal transduction.
- Author
-
Han GW, Bakolitsa C, Miller MD, Kumar A, Carlton D, Najmanovich RJ, Abdubek P, Astakhova T, Axelrod HL, Chen C, Chiu HJ, Clayton T, Das D, Deller MC, Duan L, Ernst D, Feuerhelm J, Grant JC, Grzechnik A, Jaroszewski L, Jin KK, Johnson HA, Klock HE, Knuth MW, Kozbial P, Krishna SS, Marciano D, McMullan D, Morse AT, Nigoghossian E, Okach L, Reyes R, Rife CL, Sefcovic N, Tien HJ, Trame CB, van den Bedem H, Weekes D, Xu Q, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Bacterial Proteins genetics, Bacterial Proteins metabolism, Crystallography, X-Ray, Genome, Bacterial, Models, Molecular, Molecular Sequence Data, Protein Structure, Secondary, Protein Structure, Tertiary, Rhodobacteraceae genetics, Rhodobacteraceae metabolism, Shewanella genetics, Shewanella metabolism, Structural Homology, Protein, Bacterial Proteins chemistry, Protein Folding, Rhodobacteraceae chemistry, Shewanella chemistry, Signal Transduction
- Abstract
The crystal structures of SPO0140 and Sbal_2486 were determined using the semiautomated high-throughput pipeline of the Joint Center for Structural Genomics (JCSG) as part of the NIGMS Protein Structure Initiative (PSI). The structures revealed a conserved core with domain duplication and a superficial similarity of the C-terminal domain to pleckstrin homology-like folds. The conservation of the domain interface indicates a potential binding site that is likely to involve a nucleotide-based ligand, with genome-context and gene-fusion analyses additionally supporting a role for this family in signal transduction, possibly during oxidative stress.
- Published
- 2010
- Full Text
- View/download PDF
37. The structure of SSO2064, the first representative of Pfam family PF01796, reveals a novel two-domain zinc-ribbon OB-fold architecture with a potential acyl-CoA-binding role.
- Author
-
Krishna SS, Aravind L, Bakolitsa C, Caruthers J, Carlton D, Miller MD, Abdubek P, Astakhova T, Axelrod HL, Chiu HJ, Clayton T, Deller MC, Duan L, Feuerhelm J, Grant JC, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kumar A, Marciano D, McMullan D, Morse AT, Nigoghossian E, Okach L, Reyes R, Rife CL, van den Bedem H, Weekes D, Xu Q, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Archaeal Proteins genetics, Archaeal Proteins metabolism, Crystallography, X-Ray, Genome, Archaeal, Models, Molecular, Molecular Sequence Data, Protein Binding, Protein Structure, Tertiary, Sulfolobus solfataricus genetics, Sulfolobus solfataricus metabolism, Acyl Coenzyme A chemistry, Archaeal Proteins chemistry, Protein Folding, Sulfolobus solfataricus chemistry, Zinc chemistry
- Abstract
SSO2064 is the first structural representative of PF01796 (DUF35), a large prokaryotic family with a wide phylogenetic distribution. The structure reveals a novel two-domain architecture comprising an N-terminal, rubredoxin-like, zinc ribbon and a C-terminal, oligonucleotide/oligosaccharide-binding (OB) fold domain. Additional N-terminal helical segments may be involved in protein-protein interactions. Domain architectures, genomic context analysis and functional evidence from certain bacterial representatives of this family suggest that these proteins form a novel fatty-acid-binding component that is involved in the biosynthesis of lipids and polyketide antibiotics and that they possibly function as acyl-CoA-binding proteins. This structure has led to a re-evaluation of the DUF35 family, which has now been split into two entries in the latest Pfam release (v.24.0).
- Published
- 2010
- Full Text
- View/download PDF
38. The structure of Jann_2411 (DUF1470) from Jannaschia sp. at 1.45 Å resolution reveals a new fold (the ABATE domain) and suggests its possible role as a transcription regulator.
- Author
-
Bakolitsa C, Bateman A, Jin KK, McMullan D, Krishna SS, Miller MD, Abdubek P, Acosta C, Astakhova T, Axelrod HL, Burra P, Carlton D, Chiu HJ, Clayton T, Das D, Deller MC, Duan L, Elias Y, Feuerhelm J, Grant JC, Grzechnik A, Grzechnik SK, Han GW, Jaroszewski L, Klock HE, Knuth MW, Kozbial P, Kumar A, Marciano D, Morse AT, Murphy KD, Nigoghossian E, Okach L, Oommachen S, Paulsen J, Reyes R, Rife CL, Sefcovic N, Tien H, Trame CB, Trout CV, van den Bedem H, Weekes D, White A, Xu Q, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley S, and Wilson IA
- Subjects
- Amino Acid Sequence, Crystallography, X-Ray, Models, Molecular, Molecular Sequence Data, Protein Structure, Quaternary, Protein Structure, Tertiary, Sequence Alignment, Zinc Fingers, Bacterial Proteins chemistry, Rhodobacteraceae chemistry
- Abstract
The crystal structure of Jann_2411 from Jannaschia sp. strain CCS1, a member of the Pfam PF07336 family classified as a domain of unknown function (DUF1470), was solved to a resolution of 1.45 Å by multiple-wavelength anomalous dispersion (MAD). This protein is the first structural representative of the DUF1470 Pfam family. Structural analysis revealed a two-domain organization, with the N-terminal domain presenting a new fold called the ABATE domain that may bind an as yet unknown ligand. The C-terminal domain forms a treble-clef zinc finger that is likely to be involved in DNA binding. Analysis of the Jann_2411 protein and the broader ABATE-domain family suggests a role as stress-induced transcriptional regulators.
- Published
- 2010
- Full Text
- View/download PDF
39. A conserved fold for fimbrial components revealed by the crystal structure of a putative fimbrial assembly protein (BT1062) from Bacteroides thetaiotaomicron at 2.2 Å resolution.
- Author
-
Xu Q, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Cai X, Carlton D, Chen C, Chiu HJ, Chiu M, Clayton T, Das D, Deller MC, Duan L, Ellrott K, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Marciano D, McMullan D, Miller MD, Morse AT, Nigoghossian E, Nopakun A, Okach L, Puckett C, Reyes R, Sefcovic N, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Yeh A, Zhou J, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Bacteroides genetics, Crystallography, X-Ray, Fimbriae Proteins genetics, Models, Molecular, Molecular Sequence Data, Protein Structure, Tertiary, Sequence Alignment, Structural Homology, Protein, Bacteroides chemistry, Fimbriae Proteins chemistry, Fimbriae, Bacterial chemistry, Protein Folding
- Abstract
BT1062 from Bacteroides thetaiotaomicron is a homolog of Mfa2 (PGN0288 or PG0179), which is a component of the minor fimbriae in Porphyromonas gingivalis. The crystal structure of BT1062 revealed a conserved fold that is widely adopted by fimbrial components.
- Published
- 2010
- Full Text
- View/download PDF
40. Structure of the γ-D-glutamyl-L-diamino acid endopeptidase YkfC from Bacillus cereus in complex with L-Ala-γ-D-Glu: insights into substrate recognition by NlpC/P60 cysteine peptidases.
- Author
-
Xu Q, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Cai X, Carlton D, Chen C, Chiu HJ, Chiu M, Clayton T, Das D, Deller MC, Duan L, Ellrott K, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Lam WW, Marciano D, Miller MD, Morse AT, Nigoghossian E, Nopakun A, Okach L, Puckett C, Reyes R, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Yeh A, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Crystallography, X-Ray, Cysteine Proteases genetics, Cysteine Proteases metabolism, Endopeptidases genetics, Endopeptidases metabolism, Genome, Bacterial, Models, Molecular, Molecular Sequence Data, Protein Binding, Protein Structure, Tertiary, Sequence Alignment, Structural Homology, Protein, Substrate Specificity, Bacillus cereus enzymology, Cysteine Proteases chemistry, Endopeptidases chemistry
- Abstract
Dipeptidyl-peptidase VI from Bacillus sphaericus and YkfC from Bacillus subtilis have both previously been characterized as highly specific γ-D-glutamyl-L-diamino acid endopeptidases. The crystal structure of a YkfC ortholog from Bacillus cereus (BcYkfC) at 1.8 Å resolution revealed that it contains two N-terminal bacterial SH3 (SH3b) domains in addition to the C-terminal catalytic NlpC/P60 domain that is ubiquitous in the very large family of cell-wall-related cysteine peptidases. A bound reaction product (L-Ala-γ-D-Glu) enabled the identification of conserved sequence and structural signatures for recognition of L-Ala and γ-D-Glu and, therefore, provides a clear framework for understanding the substrate specificity observed in dipeptidyl-peptidase VI, YkfC and other NlpC/P60 domains in general. The first SH3b domain plays an important role in defining substrate specificity by contributing to the formation of the active site, such that only murein peptides with a free N-terminal alanine are allowed. A conserved tyrosine in the SH3b domain of the YkfC subfamily is correlated with the presence of a conserved acidic residue in the NlpC/P60 domain and both residues interact with the free amine group of the alanine. This structural feature allows the definition of a subfamily of NlpC/P60 enzymes with the same N-terminal substrate requirements, including a previously characterized cyanobacterial L-alanine-γ-D-glutamate endopeptidase that contains the two key components (an NlpC/P60 domain attached to an SH3b domain) for assembly of a YkfC-like active site.
- Published
- 2010
- Full Text
- View/download PDF
41. Bacterial pleckstrin homology domains: a prokaryotic origin for the PH domain.
- Author
-
Xu Q, Bateman A, Finn RD, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Carlton D, Chen C, Chiu HJ, Chiu M, Clayton T, Das D, Deller MC, Duan L, Ellrott K, Ernst D, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Marciano D, McMullan D, Miller MD, Morse AT, Nigoghossian E, Nopakun A, Okach L, Puckett C, Reyes R, Rife CL, Sefcovic N, Tien HJ, Trame CB, van den Bedem H, Weekes D, Wooten T, Hodgson KO, Wooley J, Elsliger MA, Deacon AM, Godzik A, Lesley SA, and Wilson IA
- Subjects
- Amino Acid Sequence, Binding Sites, Conserved Sequence, Crystallography, X-Ray, Eukaryotic Cells, Models, Molecular, Molecular Sequence Data, Protein Binding, Protein Structure, Quaternary, Protein Structure, Secondary, Protein Structure, Tertiary, Sequence Alignment, Surface Properties, Bacteria metabolism, Bacterial Proteins chemistry, Evolution, Molecular, Prokaryotic Cells metabolism, Sequence Homology, Amino Acid
- Abstract
Pleckstrin homology (PH) domains have been identified only in eukaryotic proteins to date. We have determined crystal structures for three members of an uncharacterized protein family (Pfam PF08000), which provide compelling evidence for the existence of PH-like domains in bacteria (PHb). The first two structures contain a single PHb domain that forms a dome-shaped, oligomeric ring with C(5) symmetry. The third structure has an additional helical hairpin attached at the C-terminus and forms a similar but much larger ring with C(12) symmetry. Thus, both molecular assemblies exhibit rare, higher-order, cyclic symmetry but preserve a similar arrangement of their PHb domains, which gives rise to a conserved hydrophilic surface at the intersection of the beta-strands of adjacent protomers that likely mediates protein-protein interactions. As a result of these structures, additional families of PHb domains were identified, suggesting that PH domains are much more widespread than originally anticipated. Thus, rather than being a eukaryotic innovation, the PH domain superfamily appears to have existed before prokaryotes and eukaryotes diverged.
- Published
- 2010
- Full Text
- View/download PDF
42. Structural and functional characterizations of SsgB, a conserved activator of developmental cell division in morphologically complex actinomycetes.
- Author
-
Xu Q, Traag BA, Willemse J, McMullan D, Miller MD, Elsliger MA, Abdubek P, Astakhova T, Axelrod HL, Bakolitsa C, Carlton D, Chen C, Chiu HJ, Chruszcz M, Clayton T, Das D, Deller MC, Duan L, Ellrott K, Ernst D, Farr CL, Feuerhelm J, Grant JC, Grzechnik A, Grzechnik SK, Han GW, Jaroszewski L, Jin KK, Klock HE, Knuth MW, Kozbial P, Krishna SS, Kumar A, Marciano D, Minor W, Mommaas AM, Morse AT, Nigoghossian E, Nopakun A, Okach L, Oommachen S, Paulsen J, Puckett C, Reyes R, Rife CL, Sefcovic N, Tien HJ, Trame CB, van den Bedem H, Wang S, Weekes D, Hodgson KO, Wooley J, Deacon AM, Godzik A, Lesley SA, Wilson IA, and van Wezel GP
- Subjects
- Amino Acid Sequence, Binding Sites, Cell Division, Cryoelectron Microscopy, Crystallography, X-Ray methods, Escherichia coli metabolism, Genetic Complementation Test, Microscopy, Fluorescence methods, Microscopy, Phase-Contrast methods, Molecular Sequence Data, Mutation, Sequence Homology, Amino Acid, Spores, Bacterial, Actinobacteria metabolism, Bacterial Proteins chemistry, Bacterial Proteins physiology
- Abstract
SsgA-like proteins (SALPs) are a family of homologous cell division-related proteins that occur exclusively in morphologically complex actinomycetes. We show that SsgB, a subfamily of SALPs, is the archetypal SALP that is functionally conserved in all sporulating actinomycetes. Sporulation-specific cell division of Streptomyces coelicolor ssgB mutants is restored by introduction of distant ssgB orthologues from other actinomycetes. Interestingly, the number of septa (and spores) of the complemented null mutants is dictated by the specific ssgB orthologue that is expressed. The crystal structure of the SsgB from Thermobifida fusca was determined at 2.6 Å resolution and represents the first structure for this family. The structure revealed similarities to a class of eukaryotic "whirly" single-stranded DNA/RNA-binding proteins. However, the electro-negative surface of the SALPs suggests that neither SsgB nor any of the other SALPs are likely to interact with nucleotide substrates. Instead, we show that a conserved hydrophobic surface is likely to be important for SALP function and suggest that proteins are the likely binding partners.
- Published
- 2009
- Full Text
- View/download PDF
43. Exploration of uncharted regions of the protein universe.
- Author
-
Jaroszewski L, Li Z, Krishna SS, Bakolitsa C, Wooley J, Deacon AM, Wilson IA, and Godzik A
- Subjects
- Animals, Databases, Protein, Humans, Models, Molecular, Multigene Family, Protein Structure, Secondary, Protein Structure, Tertiary, Structural Homology, Protein, Time Factors, Proteins chemistry
- Abstract
The genome projects have unearthed an enormous diversity of genes of unknown function that are still awaiting biological and biochemical characterization. These genes, as most others, can be grouped into families based on sequence similarity. The PFAM database currently contains over 2,200 such families, referred to as domains of unknown function (DUF). In a coordinated effort, the four large-scale centers of the NIH Protein Structure Initiative have determined the first three-dimensional structures for more than 250 of these DUF families. Analysis of the first 248 reveals that about two thirds of the DUF families likely represent very divergent branches of already known and well-characterized families, which allows hypotheses to be formulated about their biological function. The remainder can be formally categorized as new folds, although about one third of these show significant substructure similarity to previously characterized folds. These results suggest that, despite the enormous increase in the number and the diversity of new genes being uncovered, the fold space of the proteins they encode is gradually becoming saturated. The previously unexplored sectors of the protein universe appear to be primarily shaped by extreme diversification of known protein families, which then enables organisms to evolve new functions and adapt to particular niches and habitats. Notwithstanding, these DUF families still constitute the richest source for discovery of the remaining protein folds and topologies.
- Published
- 2009
- Full Text
- View/download PDF
44. Structure of the alpha-actinin-vinculin head domain complex determined by cryo-electron microscopy.
- Author
-
Kelly DF, Taylor DW, Bakolitsa C, Bobkov AA, Bankston L, Liddington RC, and Taylor KA
- Subjects
- Actinin genetics, Actinin metabolism, Actins chemistry, Actins metabolism, Animals, Binding Sites, Calorimetry, Differential Scanning, Chickens, Cryoelectron Microscopy, Humans, Integrin beta1 chemistry, Integrin beta1 genetics, Integrin beta1 metabolism, Models, Molecular, Molecular Sequence Data, Multiprotein Complexes, Protein Binding, Talin chemistry, Talin genetics, Talin metabolism, Vinculin genetics, Vinculin metabolism, Actinin chemistry, Protein Conformation, Vinculin chemistry
- Abstract
The vinculin binding site on alpha-actinin was determined by cryo-electron microscopy of 2D arrays formed on phospholipid monolayers doped with a nickel chelating lipid. Chicken smooth muscle alpha-actinin was cocrystallized with the beta1-integrin cytoplasmic domain and a vinculin fragment containing residues 1-258 (vinculin(D1)). Vinculin(D1) was located at a single site on alpha-actinin with 60-70% occupancy. In these arrays, alpha-actinin lacks molecular 2-fold symmetry and the two ends of the molecule, which contain the calmodulin-like and actin binding domains, are held in distinctly different environments. The vinculin(D1) difference density has a shape very suggestive of the atomic structure. The atomic model of the complex juxtaposes the alpha-actinin binding site on vinculin(D1) with the N-terminal lobe of the calmodulin-like domain on alpha-actinin. The results show that the interaction between two species with weak affinity can be visualized in a membrane-like environment.
- Published
- 2006
- Full Text
- View/download PDF
45. Two-wavelength MAD phasing and radiation damage: a case study.
- Author
-
González A, von Delft F, Liddington RC, and Bakolitsa C
- Subjects
- Crystallization, Feasibility Studies, Protein Conformation radiation effects, Software, Crystallography, X-Ray methods, Vinculin chemistry, Vinculin radiation effects
- Abstract
Radiation damage affects MAD experiments in two ways: (i) increased absorption by the crystal at the wavelengths of interest for the experiment results in faster crystal deterioration; (ii) lack of isomorphism induced by radiation damage causes problems when scaling and merging data at different wavelengths and can prevent accurate measurement of anomalous and dispersive differences. In an attempt to overcome these problems in the case of radiation-sensitive crystals of vinculin, two-wavelength MAD data were collected at the Se absorption-edge inflection and at high-energy remote wavelengths. Although this strategy resulted in a lower total absorbed dose compared with a standard three-wavelength experiment using the peak wavelength, an increase in the unit-cell volume and other effects attributable to radiation damage were still observed. In an effort to extract the maximum information available from the data, different data-processing and scaling procedures were compared. Scaling approaches involving local scaling of unmerged reflections were consistently successful and most ordered Se sites could be located. Subsequent use of these sites for phasing resulted in an interpretable electron density map. This case demonstrates the feasibility of two-wavelength MAD in the presence of moderate radiation damage using conventional data collection strategies and widely available standard software.
- Published
- 2005
- Full Text
- View/download PDF
46. Crystal structure of an orphan protein (TM0875) from Thermotoga maritima at 2.00-Å resolution reveals a new fold.
- Author
-
Bakolitsa C, Schwarzenbacher R, McMullan D, Brinen LS, Canaves JM, Dai X, Deacon AM, Elsliger MA, Eshagi S, Floyd R, Godzik A, Grittini C, Grzechnik SK, Jaroszewski L, Karlak C, Klock HE, Koesema E, Kovarik JS, Kreusch A, Kuhn P, Lesley SA, McPhillips TM, Miller MD, Morse A, Moy K, Ouyang J, Page R, Quijano K, Robb A, Spraggon G, Stevens RC, van den Bedem H, Velasquez J, Vincent J, von Delft F, Wang X, West B, Wolf G, Hodgson KO, Wooley J, and Wilson IA
- Subjects
- Amino Acid Sequence, Crystallography, X-Ray, Models, Molecular, Molecular Sequence Data, Multiprotein Complexes, Protein Structure, Secondary, Sequence Homology, Structural Homology, Protein, Bacterial Proteins chemistry, Thermotoga maritima chemistry
- Published
- 2004
- Full Text
- View/download PDF
47. Structural basis for vinculin activation at sites of cell adhesion.
- Author
-
Bakolitsa C, Cohen DM, Bankston LA, Bobkov AA, Cadwell GW, Jennings L, Critchley DR, Craig SW, and Liddington RC
- Subjects
- Allosteric Regulation, Animals, Binding Sites, Calorimetry, Differential Scanning, Cell Adhesion, Chickens, Crystallography, X-Ray, Ligands, Models, Molecular, Protein Binding, Protein Structure, Tertiary, Structure-Activity Relationship, Vinculin chemistry, Vinculin metabolism
- Abstract
Vinculin is a highly conserved intracellular protein with a crucial role in the maintenance and regulation of cell adhesion and migration. In the cytosol, vinculin adopts a default autoinhibited conformation. On recruitment to cell-cell and cell-matrix adherens-type junctions, vinculin becomes activated and mediates various protein-protein interactions that regulate the links between F-actin and the cadherin and integrin families of cell-adhesion molecules. Here we describe the crystal structure of the full-length vinculin molecule (1,066 amino acids), which shows a five-domain autoinhibited conformation in which the carboxy-terminal tail domain is held pincer-like by the vinculin head, and ligand binding is regulated both sterically and allosterically. We show that conformational changes in the head, tail and proline-rich domains are linked structurally and thermodynamically, and propose a combinatorial pathway to activation that ensures that vinculin is activated only at sites of cell adhesion when two or more of its binding partners are brought into apposition.
- Published
- 2004
- Full Text
- View/download PDF
48. Crystal structure of the vinculin tail suggests a pathway for activation.
- Author
-
Bakolitsa C, de Pereda JM, Bagshaw CR, Critchley DR, and Liddington RC
- Subjects
- Amino Acid Sequence, Animals, Apolipoproteins metabolism, Chickens, Crystallography, Cytoskeleton chemistry, Models, Biological, Models, Molecular, Molecular Sequence Data, Protein Binding, Protein Structure, Quaternary, Protein Structure, Secondary, Sequence Homology, Amino Acid, Vinculin metabolism, Actins chemistry, Phospholipids metabolism, Vinculin chemistry
- Abstract
Vinculin plays a dynamic role in the assembly of the actin cytoskeleton. A strong interaction between its head and tail domains that regulates binding to other cytoskeletal components is disrupted by acidic phospholipids. Here, we present the crystal structure of the vinculin tail, residues 879-1066. Five amphipathic helices form an antiparallel bundle that resembles exchangeable apolipoproteins. A C-terminal arm wraps across the base of the bundle and emerges as a hydrophobic hairpin surrounded by a collar of basic residues, adjacent to the N terminus. We show that the C-terminal arm is required for binding to acidic phospholipids but not to actin, and that binding either ligand induces conformational changes that may represent the first step in activation.
- Published
- 1999
- Full Text
- View/download PDF