45 results on '"Mark B. Swindells"'
Search Results
2. Inferring function using patterns of native disorder in proteins.
- Author
-
Anna Lobley, Mark B Swindells, Christine A Orengo, and David T Jones
- Subjects
Biology (General) ,QH301-705.5 - Abstract
Natively unstructured regions are a common feature of eukaryotic proteomes. Between 30% and 60% of proteins are predicted to contain long stretches of disordered residues, and not only have many of these regions been confirmed experimentally, but they have also been found to be essential for protein function. In this study, we directly address the potential contribution of protein disorder in predicting protein function using standard Gene Ontology (GO) categories. Initially we analyse the occurrence of protein disorder in the human proteome and report ontology categories that are enriched in disordered proteins. Pattern analysis of the distributions of disordered regions in human sequences demonstrated that the functions of intrinsically disordered proteins are both length- and position-dependent. These dependencies were then encoded in feature vectors to quantify the contribution of disorder in human protein function prediction using Support Vector Machine classifiers. The prediction accuracies of 26 GO categories relating to signalling and molecular recognition are improved using the disorder features. The most significant improvements were observed for kinase, phosphorylation, growth factor, and helicase categories. Furthermore, we provide predicted GO term assignments using these classifiers for a set of unannotated and orphan human proteins. In this study, the importance of capturing protein disorder information and its value in function prediction is demonstrated. The GO category classifiers generated can be used to provide more reliable predictions and further insights into the behaviour of orphan and unannotated proteins.
- Published
- 2007
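The abstract above says that length- and position-dependent disorder patterns were encoded in feature vectors for the classifiers. A minimal sketch of one such encoding follows. It assumes per-residue disorder flags are already available. The bin edges and the names `LENGTH_BINS`, `POSITION_BINS` and `disorder_features` are illustrative assumptions, not the scheme used in the study.

```python
from typing import List, Optional, Tuple

# Hypothetical bins; the abstract does not state the ones used in the study.
LENGTH_BINS: List[Tuple[int, Optional[int]]] = [(1, 30), (31, 100), (101, None)]
POSITION_BINS = 3  # N-terminal, middle and C-terminal thirds of the sequence


def disordered_regions(flags: List[bool]) -> List[Tuple[int, int]]:
    """Return (start, end) pairs, end exclusive, for each run of True flags."""
    regions = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(flags)))
    return regions


def disorder_features(flags: List[bool]) -> List[int]:
    """Count disordered regions in each (length bin, position bin) cell."""
    n = len(flags)
    counts = [0] * (len(LENGTH_BINS) * POSITION_BINS)
    for start, end in disordered_regions(flags):
        length = end - start
        midpoint = (start + end - 1) / 2
        pos_bin = min(int(midpoint * POSITION_BINS / n), POSITION_BINS - 1)
        for len_bin, (lo, hi) in enumerate(LENGTH_BINS):
            if length >= lo and (hi is None or length <= hi):
                counts[len_bin * POSITION_BINS + pos_bin] += 1
                break
    return counts
```

A vector built this way could be passed to any standard classifier; the choice of a flat count vector here is only for readability.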
3. abYsis: Integrated Antibody Sequence and Structure-Management, Analysis, and Prediction
- Author
-
Gary Macindoe, Jens H. Nielsen, Matthew Couch, Mark B. Swindells, James Hetherington, Andrew J. Martin, Jacob Hurst, Craig T. Porter, and K.R. Abhinandan
- Subjects
0301 basic medicine ,European Nucleotide Archive ,Sequence analysis ,Interface (Java) ,Biology ,Bioinformatics ,Antibodies ,Set (abstract data type) ,03 medical and health sciences ,Structural Biology ,Animals ,Humans ,Amino Acid Sequence ,Databases, Protein ,Molecular Biology ,Sequence (medicine) ,Structure (mathematical logic) ,Internet ,Information retrieval ,Computational Biology ,computer.file_format ,Protein Data Bank ,Complementarity Determining Regions ,030104 developmental biology ,Key (cryptography) ,computer ,Protein Processing, Post-Translational - Abstract
abYsis is a web-based antibody research system that includes an integrated database of antibody sequence and structure data. The system can be interrogated in numerous ways, from simple text and sequence searches to sophisticated queries that apply 3D structural constraints. The publicly available version includes pre-analyzed sequence data from the European Molecular Biology Laboratory European Nucleotide Archive (EMBL-ENA) and Kabat as well as structure data from the Protein Data Bank. A researcher's own sequences can also be analyzed through the web interface. A defining characteristic of abYsis is that the sequences are automatically numbered with a series of popular schemes such as Kabat and Chothia and then annotated with key information such as complementarity-determining regions and potential post-translational modifications. A unique aspect of abYsis is a set of residue frequency tables for each position in an antibody, allowing "unusual residues" (those rarely seen at a particular position) to be highlighted and decisions to be made on which mutations may be acceptable. This is especially useful when comparing antibodies from different species. abYsis is useful for any researcher specializing in antibody engineering, especially those developing antibodies as drugs. abYsis is available at www.abysis.org.
- Published
- 2016
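The residue frequency tables described in the abstract above can be illustrated with a small sketch. It assumes the sequences have already been numbered so that equal indices refer to the same position. The function names and the `threshold` default are hypothetical and are not part of abYsis.

```python
from collections import Counter
from typing import Dict, List, Tuple


def position_frequencies(aligned: List[str]) -> List[Dict[str, float]]:
    """Build one residue-frequency table per position from equal-length sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all sequences must share the same numbering length")
    tables = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned)
        total = sum(counts.values())
        tables.append({res: c / total for res, c in counts.items()})
    return tables


def unusual_residues(
    query: str, tables: List[Dict[str, float]], threshold: float = 0.01
) -> List[Tuple[int, str]]:
    """Return (position, residue) pairs whose frequency falls below the threshold."""
    return [
        (i, res)
        for i, (res, table) in enumerate(zip(query, tables))
        if table.get(res, 0.0) < threshold
    ]
```

A residue never observed at a position has frequency 0.0 and is therefore always flagged, which matches the intent of highlighting residues rarely seen at that position.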
4. The Complement of Enzymatic Sets in Different Species
- Author
-
Janet M. Thornton, Richard A. George, Shiri Freilich, Ruth V. Spriggs, Mark B. Swindells, and Bissan Al-Lazikani
- Subjects
Proteomics ,Proteome ,Lineage (evolution) ,Computational biology ,Biology ,Genome ,Species Specificity ,Structural Biology ,Animals ,Humans ,KEGG ,Databases, Protein ,Functional group (ecology) ,Molecular Biology ,Phylogeny ,Mammals ,Genetics ,Computational Biology ,Prokaryote ,biology.organism_classification ,Enzymes ,Eukaryotic Cells ,Prokaryotic Cells ,Eukaryote ,Functional divergence ,Function (biology) - Abstract
We present here a comprehensive analysis of the complement of enzymes in a large variety of species. As enzymes are a relatively conserved group there are several classification systems available that are common to all species and link a protein sequence to an enzymatic function. Enzymes are therefore an ideal functional group to study the relationship between sequence expansion, functional divergence and phenotypic changes. By using information retrieved from the well annotated SWISS-PROT database together with sequence information from a variety of fully sequenced genomes and information from the EC functional scheme we have aimed here to estimate the fraction of enzymes in genomes, to determine the extent of their functional redundancy in different domains of life and to identify functional innovations and lineage specific expansions in the metazoa lineage. We found that prokaryote and eukaryote species differ both in the fraction of enzymes in their genomes and in the pattern of expansion of their enzymatic sets. We observe an increase in functional redundancy accompanying an increase in species complexity. A quantitative assessment was performed in order to determine the degree of functional redundancy in different species. Finally, we report a massive expansion in the number of mammalian enzymes involved in signalling and degradation.
- Published
- 2005
5. SCOPEC: a database of protein catalytic domains
- Author
-
Ruth V. Spriggs, Mark B. Swindells, Janet M. Thornton, Richard A. George, and Bissan Al-Lazikani
- Subjects
Statistics and Probability ,Protein structure database ,Simple Modular Architecture Research Tool ,Conserved Domain Database ,Protein Data Bank (RCSB PDB) ,Information Storage and Retrieval ,Biology ,computer.software_genre ,Biochemistry ,Catalysis ,Protein structure ,Sequence Analysis, Protein ,Computer Simulation ,Protein function prediction ,Databases, Protein ,Molecular Biology ,Database ,Proteins ,computer.file_format ,Structural Classification of Proteins database ,Protein Data Bank ,Protein Structure, Tertiary ,Computer Science Applications ,Computational Mathematics ,Models, Chemical ,Computational Theory and Mathematics ,Database Management Systems ,Sequence Alignment ,computer - Abstract
Motivation: Domains are the units of protein structure, function and evolution. It is therefore essential to utilize knowledge of domains when studying the evolution of function, or when assigning function to genome sequence data. For this purpose, we have developed a database of catalytic domains, SCOPEC, by combining structural domain information from SCOP, full-length sequence information from Swiss-Prot, and verified functional information from the Enzyme Classification (EC) database. Two major problems need to be overcome to create a database of domain–function relationships: (1) for sequences, EC numbers are typically assigned to whole sequences rather than the functional unit, and (2) the Protein Data Bank (PDB) structures elucidated from a larger multi-domain protein will often have EC annotation although the relevant catalytic domain may lie elsewhere. Results: SCOPEC entries have high quality enzyme assignments, having passed both computational and manual checks. SCOPEC currently contains entries for 75% of all EC annotations in the PDB. Overall, EC number is fairly well conserved within a superfamily, even when the proteins are distantly related. Initial analysis is encouraging, suggesting that there is a 50:50 chance of conserved function in distant homologues first detected by a third iteration PSI-BLAST search. Therefore, we envisage that a knowledge-based approach to function assignment using the domain–EC relationships in SCOPEC will gain a marked improvement over this base line. Availability: The SCOPEC database is a valuable resource in the analysis and prediction of protein structure and function. It can be obtained or queried at our website http://www.enzome.com
- Published
- 2004
6. Prioritizing the proteome: identifying pharmaceutically relevant targets
- Author
-
John P. Overington and Mark B. Swindells
- Subjects
Pharmacology ,Prioritization ,Genome ,Proteome ,Drug discovery ,business.industry ,Computational biology ,Biology ,Bioinformatics ,Polymorphism, Single Nucleotide ,ComputingMethodologies_PATTERNRECOGNITION ,Drug Design ,Informatics ,Drug Discovery ,Genomic information ,Pharmaceutical sciences ,business ,Oligonucleotide Array Sequence Analysis ,Pharmaceutical industry - Abstract
Considerable attention is now being placed on prioritizing the proteome as the point of delivery for genomic information. Some of the challenges faced in prioritizing efforts from a pharmaceutical perspective, when presented with an incomplete proteome picture, are described. Examples of pharmaceutically relevant proteins are used to illustrate an informatics-based analysis of the proteome using knowledge of known drug targets. We show how results can be maximized by linking informatics approaches to experimental techniques and describe methods that can be used for prioritization within unprecedented protein families using, for example, single nucleotide polymorphism data and knowledge of disease pathways.
- Published
- 2002
7. Getting the most from PSI–BLAST
- Author
-
David T. Jones and Mark B. Swindells
- Subjects
Databases, Factual ,Sequence Homology, Amino Acid ,Molecular Sequence Data ,Proteins ,Protein database ,DNA ,Biology ,Bioinformatics ,Biochemistry ,Data science ,Local sequence ,Sequence homology ,Animals ,Humans ,Amino Acid Sequence ,Sequence Alignment ,Molecular Biology ,Algorithms ,Software - Abstract
Most biologists now conduct sequence searches as a matter of course. But how do we know that a relationship predicted by a homology search is a true, rather than false, hit with the same score? Many biologists design their own experiments with exquisite care yet still assume that results from programs with more than 20 adjustable parameters are 100% reliable. This article explains some of the key steps in getting the most from PSI-Blast, one of the most popular and powerful homology search programs currently available.
- Published
- 2002
8. Target-induced conformational adaptation of calmodulin revealed by the crystal structure of a complex with nematode Ca2+/calmodulin-dependent kinase kinase peptide (Edited by K. Morikawa)
- Author
-
Masatsune Kainosho, Hirofumi Kurokawa, Masanori Osawa, Hiroyuki Kurihara, Mitsuhiko Ikura, Mark B. Swindells, Hiroshi Tokumitsu, and Naoko Katayama
- Subjects
chemistry.chemical_classification ,animal structures ,Myosin light-chain kinase ,Calmodulin ,biology ,EF hand ,Peptide ,Protein structure ,Biochemistry ,chemistry ,Structural Biology ,Ca2+/calmodulin-dependent protein kinase ,Calcium-binding protein ,Biophysics ,biology.protein ,Molecular Biology ,Peptide sequence - Abstract
Calmodulin (CaM) is a ubiquitous calcium (Ca(2+)) sensor which binds and regulates protein serine/threonine kinases along with many other proteins in a Ca(2+)-dependent manner. For this multi-functionality, conformational plasticity is essential; however, the nature and magnitude of CaM's plasticity still remains largely undetermined. Here, we present the 1.8 Å resolution crystal structure of Ca(2+)/CaM, complexed with the 27-residue synthetic peptide corresponding to the CaM-binding domain of the nematode Caenorhabditis elegans Ca(2+)/CaM-dependent kinase kinase (CaMKK). The peptide bound in this crystal structure is a homologue of the previously NMR-derived complex with rat CaMKK, but benefits from improved structural resolution. Careful comparison of the present structure with previous crystal structures of CaM complexed with unrelated peptides derived from myosin light chain kinase and CaM kinase II allows a quantitative analysis of the differences in the relative orientation of the N and C-terminal domains of CaM, defined as a screw axis rotation angle ranging from 156 degrees to 196 degrees. The principal differences in CaM interaction with various peptides are associated with the N-terminal domain of CaM. Unlike the C-terminal domain, which remains unchanged internally, the N-terminal domain of CaM displays significant differences in the EF-hand helix orientation between this and other CaM structures. Three hydrogen bonds between CaM and the peptide (E87-R336, E87-T339 and K75-T339) along with two salt bridges (E11-R349 and E114-K334) are the most probable determinants for the binding direction of the CaMKK peptide to CaM.
- Published
- 2001
9. Using the CATH domain database to assign structures and functions to the genome sequences
- Author
-
Mark B. Swindells, Annabel E. Todd, A. A. Salamov, Andrew J. Martin, Frances M. G. Pearl, Janet M. Thornton, CA Orengo, James E. Bray, and M. Suwa
- Subjects
Smith–Waterman algorithm ,Genome ,Databases, Factual ,Microbial Genomes ,Database ,Protein Conformation ,Structural alignment ,Structural Classification of Proteins database ,Biology ,computer.software_genre ,Biochemistry ,Homology (biology) ,Protein Structure, Tertiary ,Structure-Activity Relationship ,Protein structure ,GenBank ,computer ,Algorithms - Abstract
The CATH database of protein structures contains ∼18000 domains organized according to their (C)lass, (A)rchitecture, (T)opology and (H)omologous superfamily [1]. Relationships between evolutionarily related structures (homologues) within the database have been used to test the sensitivity of various sequence search methods in order to identify relatives in GenBank and other sequence databases [2]. Subsequent application of the most sensitive and efficient algorithms, gapped BLAST and the profile-based method, Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST) [3], could be used to assign structural data to between 22 and 36% of microbial genomes in order to improve functional annotation and enhance understanding of biological mechanism. However, on a cautionary note, an analysis of functional conservation within fold groups and homologous superfamilies in the CATH database revealed that whilst function was conserved in nearly 55% of enzyme families, function had diverged considerably in some highly populated families. In these families, functional properties should be inherited far more cautiously and the probable effects of substitutions in key functional residues carefully assessed.
- Published
- 2000
10. Diversity of conformational states and changes within the EF-hand protein superfamily
- Author
-
James B. Ames, Mitsuhiko Ikura, Mark B. Swindells, and Kyoko L. Yap
- Subjects
Conformational change ,Calmodulin ,EF hand ,Protein superfamily ,Biology ,Biochemistry ,DNA-binding protein ,Troponin C ,Structural Biology ,Recoverin ,Biophysics ,biology.protein ,Molecular Biology ,Binding domain - Abstract
The EF-hand motif, which assumes a helix-loop-helix structure normally responsible for Ca2+ binding, is found in a large number of functionally diverse Ca2+ binding proteins collectively known as the EF-hand protein superfamily. In many superfamily members, Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. In calmodulin and troponin C, this is described as a change from the closed conformational state in the absence of Ca2+ to the open conformational state in its presence. It is now clear from structures of other EF-hand proteins that this "closed-to-open" conformational transition is not the sole model for EF-hand protein structural response to Ca2+. More complex modes of conformational change are observed in EF-hand proteins that interact with a covalently attached acyl group (e.g., recoverin) and in those that dimerize (e.g., S100B, calpain). In fact, EF-hand proteins display a multitude of unique conformational states, together constituting a conformational continuum. Using a quantitative 3D approach termed vector geometry mapping (VGM), we discuss this tertiary structural diversity of EF-hand proteins and its correlation with target recognition.
- Published
- 1999
11. Combining sensitive database searches with multiple intermediates to detect distant homologues
- Author
-
Christine A. Orengo, Makiko Suwa, Mark B. Swindells, and Asaf Salamov
- Subjects
Normalization (statistics) ,Models, Statistical ,Databases, Factual ,Sequence Homology, Amino Acid ,Database ,Protein Conformation ,Sequence analysis ,Bioengineering ,Biology ,computer.software_genre ,Sensitivity and Specificity ,Biochemistry ,Sequence search ,Computer Simulation ,Threading (protein sequence) ,Sequence Alignment ,Molecular Biology ,computer ,Biotechnology - Abstract
Using data from the CATH structure classification, we have assessed the blastp, fasta, smith-waterman and gapped-blast algorithms, developed a portable normalization scheme and identified safe thresholds for database searching. Of the four methods assessed, fasta, smith-waterman and gapped-blast perform similarly, whereas the sensitivity of blastp was much lower. Introduction of an intermediate sequence search substantially improved the results. When tested on a set of relationships that could not be identified by blastp, intermediate sequences were able to find double the number of relationships identified by the smith-waterman algorithm alone. However, we found that the benefit of using intermediates varied considerably between each family and depended not only on the number of available sequences, but also their diversity. In an attempt to increase sensitivity further, a multiple intermediate sequence search (MISS) procedure was developed. When assessed on 1906 cases from a wide range of homologous families that could not be detected by the previous approaches, MISS was able to identify 241 additional relationships. MISS uses the full extent of sequence diversity to detect additional relationships, but does not consider any structure-specific information. For this reason, it is more generally applicable than fold recognition and threading methods, which require a library of known structures.
- Published
- 1999
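The intermediate sequence search idea in the abstract above can be sketched as a two-step lookup: search with the query, then search again with each hit. This is a minimal single-intermediate sketch, not the MISS procedure itself, whose details the abstract does not give. The `search` callable is a hypothetical stand-in for any pairwise method (for example a Smith-Waterman search filtered by a safe threshold).

```python
from typing import Callable, Set


def intermediate_search(query: str, search: Callable[[str], Set[str]]) -> Set[str]:
    """Return sequences found directly or through one intermediate hit."""
    direct = search(query)          # relatives detectable from the query itself
    found = set(direct)
    for intermediate in direct:     # re-search using each direct hit as a probe
        found |= search(intermediate)
    found.discard(query)            # the query is not its own relative
    return found
```

Because each intermediate hit is trusted as a true relative, a sketch like this is only as reliable as the threshold applied inside `search`; a false direct hit would propagate further false relationships.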
12. Contemporary approaches to protein structure classification
- Author
-
CA Orengo, Mark B. Swindells, Janet M. Thornton, David T. Jones, and E G Hutchinson
- Subjects
Structure (mathematical logic) ,Protein structure database ,Identification (information) ,Protein structure ,Sequence database ,Sequence analysis ,Structural alignment ,Computational biology ,Biology ,Bioinformatics ,Structure comparison ,General Biochemistry, Genetics and Molecular Biology - Abstract
In a similar manner to sequence database searching, it is also possible to compare three-dimensional protein structure. Such methods can be extremely useful because a structural similarity may represent a distant evolutionary relationship that is undetectable by sequence analysis. In this review, we summarise the most popular structure comparison methods, show how they can be used for database searching, and then describe some of the most advanced attempts to develop comprehensive protein structure classifications. With such data, it is possible to identify distant evolutionary relationships, provide libraries of unique folds for structure prediction, estimate the total number of folds that exist, and investigate the preference for certain types of structures over others.
- Published
- 1998
13. NMR structure of the histidine kinase domain of the E. coli osmosensor EnvZ
- Author
-
Toshimasa Yamazaki, Masayori Inouye, Akira Ono, Ling Qin, Dingjiang Liu, Mitsuhiko Ikura, Masatsune Kainosho, Mark B. Swindells, Kit I. Tong, Chieri Tomomori, Heiyoung Park, Rinku Dutta, Soumitra K. Saha, Rieko Ishima, and Toshiyuki Tanaka
- Subjects
Models, Molecular ,Magnetic Resonance Spectroscopy ,Multidisciplinary ,Histidine Kinase ,Protein Conformation ,Chemistry ,Escherichia coli Proteins ,Molecular Sequence Data ,Histidine kinase ,Autophosphorylation ,Crystallography, X-Ray ,Recombinant Proteins ,Transmembrane protein ,Two-component regulatory system ,Response regulator ,Biochemistry ,Multienzyme Complexes ,Catalytic Domain ,Escherichia coli ,Phosphorylation ,Amino Acid Sequence ,Protein kinase A ,Protein Kinases ,Histidine ,Bacterial Outer Membrane Proteins - Abstract
Bacteria live in capricious environments, in which they must continuously sense external conditions in order to adjust their shape, motility and physiology [1]. The histidine–aspartate phosphorelay signal-transduction system (also known as the two-component system) is important in cellular adaptation to environmental changes in both prokaryotes and lower eukaryotes [2,3]. In this system, protein histidine kinases function as sensors and signal transducers. The Escherichia coli osmosensor, EnvZ, is a transmembrane protein with histidine kinase activity in its cytoplasmic region [2]. The cytoplasmic region contains two functional domains [4]: domain A (residues 223–289) contains the conserved histidine residue (H243), a site of autophosphorylation as well as transphosphorylation to the conserved D55 residue of response regulator OmpR, whereas domain B (residues 290–450) encloses several highly conserved regions (G1, G2, F and N boxes) and is able to phosphorylate H243. Here we present the solution structure of domain B, the catalytic core of EnvZ. This core has a novel protein kinase structure, distinct from the serine/threonine/tyrosine kinase fold, with unanticipated similarities to both heat-shock protein 90 and DNA gyrase B.
- Published
- 1998
14. NMR structure of the Streptomyces metalloproteinase inhibitor, SMPI, isolated from Streptomyces nigrescens TK-23: another example of an ancestral βγ-crystallin precursor structure (Edited by P. E. Wright)
- Author
-
Sailaja S. Seeram, Masatsune Kainosho, Ayako Ohno, Shin-ichi Tate, Kazumi Hiraga, Kohei Oda, and Mark B. Swindells
- Subjects
biology ,Structural similarity ,Chemistry ,Dihedral angle ,biology.organism_classification ,Antiparallel (biochemistry) ,Streptomyces ,Crystallography ,Protein structure ,Structural Biology ,Thermolysin ,Crystallin ,Molecular Biology ,Streptomyces nigrescens - Abstract
The Streptomyces metalloproteinase inhibitor, SMPI, isolated from Streptomyces nigrescens TK-23, is a proteinaceous metalloproteinase inhibitor, and consists of 102 amino acid residues with two disulfide bridges. SMPI specifically inhibits metalloproteinases such as thermolysin. In the present work, the solution structure of SMPI was determined on the basis of 1536 nuclear Overhauser enhancement derived distance restraints and 52 dihedral angle restraints obtained from three-bond spin coupling constants. The final ensemble of 20 NMR structures overlaid onto their mean coordinate with backbone (N, Cα, C′) r.m.s.d. values of 0.45(±0.11) Å and 0.57(±0.18) Å for residues 6 to 99 and the entire 102 residues, respectively. SMPI is essentially composed of two β-sheets, each consisting of four antiparallel β-strands. The structure can be considered as two Greek key motifs with 2-fold internal symmetry, a Greek key β-barrel. One unique structural feature found in SMPI is in its extension between the first and second strands of the second Greek key motif. Interestingly, this extended segment is known to be involved in the inhibitory activity of SMPI. In the absence of sequence similarity, the SMPI structure shows clear similarity to both domains of the eye lens crystallins, both domains of the calcium sensor protein-S, as well as the single-domain yeast killer toxin. The yeast killer toxin structure was thought to be a precursor of the two-domain βγ-crystallin proteins, because of its structural similarity to each domain of the βγ-crystallins. SMPI thus provides another example of a single-domain protein structure that corresponds to the ancestral fold from which the two-domain proteins in the βγ-crystallin superfamily are believed to have evolved.
- Published
- 1998
15. Assessing protein coding region integrity in cDNA sequencing projects
- Author
-
Asaf Salamov, Tetsuo Nishikawa, and Mark B. Swindells
- Subjects
Statistics and Probability ,DNA, Complementary ,Databases, Factual ,Codon, Initiator ,Biology ,Biochemistry ,DNA sequencing ,Set (abstract data type) ,Open Reading Frames ,Start codon ,Complementary DNA ,Humans ,RNA, Messenger ,Molecular Biology ,Sequence (medicine) ,Genetics ,Messenger RNA ,Base Sequence ,Computational Biology ,Proteins ,Sequence Analysis, DNA ,Linear discriminant analysis ,Base (topology) ,Computer Science Applications ,Computational Mathematics ,Computational Theory and Mathematics - Abstract
MOTIVATION: In cDNA sequencing projects, it is vital to know whether the protein coding region of a sequence is complete, or whether errors have occurred during library construction. Here we present a linear discriminant approach that predicts this completeness by estimating the probability of each ATG being the initiation codon. RESULTS: Because of the current shortage of full-length cDNA data on which to base this work, tests were performed on a non-redundant set of 660 initiation codon-containing DNA sequences that had been conceptually spliced into mRNA/cDNA. We also used an edited set of the same sequences that only contained the region following the initiation codon as a negative control. Using the criterion that only a single prediction is allowed for each sequence, a cut-off was selected at which discrimination of both positive and negative sets was equal. At this cut-off, 67% of each set could be correctly distinguished, with the correct ATG codon also being identified in the positive set. Reliability could be increased further by raising the cut-off or including homologues, the relative merits of which are discussed. AVAILABILITY: The prediction program, called ATGpr, and other data are available at http://www.hri.co.jp/atgpr CONTACT: swintech@hri.co.jp
- Published
- 1998
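The abstract above describes scoring every ATG with a linear discriminant and allowing a single prediction per sequence, subject to a cut-off. The sketch below illustrates that control flow only. The two features (downstream open reading frame length and a purine at position -3) and the weights are hypothetical choices for illustration and are not the features or parameters of ATGpr.

```python
from typing import Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_length(seq: str, start: int) -> int:
    """Count codons from the ATG at `start` up to the first in-frame stop."""
    codons = 0
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        codons += 1
    return codons


def score_atg(seq: str, start: int, weights: Tuple[float, float] = (0.1, 1.0)) -> float:
    """Linear score: weighted sum of two illustrative features (assumed, not ATGpr's)."""
    purine_minus3 = 1.0 if start >= 3 and seq[start - 3] in "AG" else 0.0
    return weights[0] * orf_length(seq, start) + weights[1] * purine_minus3


def predict_start(seq: str, cutoff: float) -> Optional[int]:
    """Return the index of the best-scoring ATG, or None if it is below the cut-off."""
    seq = seq.upper()
    candidates = [i for i in range(len(seq) - 2) if seq[i:i + 3] == "ATG"]
    if not candidates:
        return None
    best = max(candidates, key=lambda i: score_atg(seq, i))
    return best if score_atg(seq, best) >= cutoff else None
```

Returning `None` when the best candidate falls below the cut-off corresponds to predicting that the sequence lacks its initiation codon, which is the case the negative control set in the abstract is designed to test.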
16. Domain assignment for protein structures using a consensus approach: Characterization and analysis
- Author
-
Michael Stewart, Christine Orengo, Susan Jones, Mark B. Swindells, Janet M. Thornton, and A.D. Michie
- Subjects
Class (set theory) ,Computer science ,computer.file_format ,Characterization (mathematics) ,Bioinformatics ,Protein Data Bank ,Biochemistry ,Domain (software engineering) ,Protein chain ,Data set ,Protein structure ,Molecular Biology ,Protein secondary structure ,computer ,Algorithm - Abstract
A consensus approach for the assignment of structural domains in proteins is presented. The approach combines a number of previously published algorithms, and takes advantage of the elevated accuracy obtained when assignments from the individual algorithms are in agreement. The consensus approach is tested on a data set of 55 protein chains, for which domain assignments from four automated methods were known, and for which crystallographers' assignments had been reported in the literature. Accuracy was found to increase in this test from 72% using individual algorithms to 100% when all four methods were in agreement. However, a consensus prediction using all four methods was only possible for 52% of the dataset. The consensus approach [using three publicly available domain assignment algorithms (PUU, DETECTIVE, DOMAK)] was then used to make domain assignments for a data set of 787 protein chains from the Protein Data Bank. Analysis of the assignments showed 55.7% of assignments could be made automatically, and of these, 13.5% were multi-domain proteins. Of the remaining 44.3% that could not be assigned by the consensus procedure, 90.4% had their domain boundaries assigned correctly by at least one of the algorithms. Once identified, these domains were analyzed for trends in their size and secondary structure class. In addition, the discontinuity of each domain along the protein chain was considered.
- Published
- 1998
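The consensus idea in the abstract above, accepting an assignment only when the individual algorithms agree, can be sketched as follows. The sketch simplifies each assignment to a list of boundary residue numbers, which cannot represent discontinuous domains. The `tolerance` value and the averaging step are assumptions for illustration, not the published criteria.

```python
from typing import List, Optional, Sequence

Boundaries = List[int]  # residue numbers where one domain ends and the next begins


def consensus(assignments: Sequence[Boundaries], tolerance: int = 10) -> Optional[Boundaries]:
    """Return averaged boundaries if all methods agree, otherwise None."""
    if not assignments:
        return None
    ordered = [sorted(a) for a in assignments]
    reference = ordered[0]
    for other in ordered[1:]:
        # Methods must agree on the number of domains...
        if len(other) != len(reference):
            return None
        # ...and place every boundary within `tolerance` residues of the reference.
        if any(abs(a - b) > tolerance for a, b in zip(reference, other)):
            return None
    n = len(ordered)
    return [round(sum(a[i] for a in ordered) / n) for i in range(len(reference))]
```

An empty list means the methods agree that the chain is a single domain, while `None` means no consensus was reached and the chain would need manual assignment.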
17. Solution structure of Calmodulin-W-7 complex: the basis of diversity in molecular recognition
- Author
-
Jun Tanikawa, Mitsuhiko Ikura, Toshio Furuya, Masanori Osawa, Toshiyuki Tanaka, Toshiyasu Mase, and Mark B. Swindells
- Subjects
Models, Molecular ,Magnetic Resonance Spectroscopy ,Myosin light-chain kinase ,Calmodulin ,Macromolecular Substances ,Protein Conformation ,Stereochemistry ,Recombinant Fusion Proteins ,Ring (chemistry) ,Xenopus laevis ,Molecular recognition ,Structural Biology ,Animals ,Molecular Biology ,Indole test ,Sulfonamides ,Binding Sites ,biology ,Chemistry ,Nuclear magnetic resonance spectroscopy ,Solutions ,biology.protein ,Two-dimensional nuclear magnetic resonance spectroscopy ,Heteronuclear single quantum coherence spectroscopy ,Protein Binding - Abstract
The solution structure of calcium-bound calmodulin (CaM) complexed with an antagonist, N-(6-aminohexyl)-5-chloro-1-naphthalenesulfonamide (W-7), has been determined by multidimensional NMR spectroscopy. The structure consists of one molecule of W-7 binding to each of the two domains of CaM. In each domain, the W-7 chloronaphthalene ring interacts with four methionine methyl groups and other aliphatic or aromatic side-chains in a deep hydrophobic pocket, the site responsible for CaM binding to CaM-dependent enzymes such as myosin light chain kinases (MLCKs) and CaM kinase II. This competitive binding at the same site between W-7 and CaM-dependent enzymes suggests the mechanism by which W-7 inhibits CaM to activate the enzymes. The orientation of the W-7 naphthalene ring in the N-terminal pocket is rotated approximately 40 degrees with respect to that in the C-terminal pocket. The W-7 ring orientation differs significantly from the Trp800 indole ring of smooth muscle MLCK bound to the C-terminal pocket and the phenothiazine ring of trifluoperazine bound to the N or C-terminal pocket. These comparative structural analyses demonstrate that the two hydrophobic pockets of CaM can accommodate a variety of bulky aromatic rings, which provides a plausible structural basis for the diversity in CaM-mediated molecular recognition.
- Published
- 1998
18. Pre-formation of the semi-open conformation by the apo-calmodulin C-terminal domain and implications for binding IQ-motifs
- Author
-
Mark B. Swindells and Mitsuhiko Ikura
- Subjects
Models, Molecular ,Binding Sites ,Calmodulin ,biology ,Protein Conformation ,Stereochemistry ,Chemistry ,C-terminus ,Molecular Sequence Data ,Conserved sequence ,Protein structure ,Structural Biology ,Myosin ,biology.protein ,Calcium ,Amino Acid Sequence ,sense organs ,Binding site ,Molecular Biology ,Peptide sequence ,Conserved Sequence ,Binding domain - Abstract
Unanticipated similarities between apo-calmodulin and the myosin essential light chain suggest structural characteristics responsible for IQ-motif binding as well as calcium-induced conformational changes.
- Published
- 1996
- Full Text
- View/download PDF
19. A procedure for detecting structural domains in proteins
- Author
-
Mark B. Swindells
- Subjects
Sequence ,Protein structure ,Escherichia coli Proteins ,Pepsin A ,Biochemistry ,Interface (Java) ,Simple (abstract algebra) ,Biology ,Molecular Biology ,Algorithm ,Peptide sequence ,Domain (software engineering) - Abstract
A procedure is described for detecting domains in proteins of known structure. The method is based on the intuitively simple idea that each domain should contain an identifiable hydrophobic core. By applying the algorithm described in the companion paper (Swindells MB, 1995, Protein Sci 4:93-102) to identify distinct cores in multi-domain proteins, one can use this information to determine both the number and the location of the constituent domains. Tests have shown the procedure to be effective on a number of examples, even when the domains are discontinuous along the sequence. However, deficiencies also occur when hydrophobic cores from different domains continue through the interface region and join one another.
- Published
- 1995
- Full Text
- View/download PDF
20. LigPlot+: multiple ligand-protein interaction diagrams for drug discovery
- Author
-
Roman A. Laskowski and Mark B. Swindells
- Subjects
Models, Molecular ,Protein Conformation ,General Chemical Engineering ,Molecular Sequence Data ,Computational biology ,Library and Information Sciences ,Ligands ,Plot (graphics) ,Drug Discovery ,Humans ,Amino Acid Sequence ,Databases, Protein ,Binding Sites ,Sequence Homology, Amino Acid ,Chemistry ,Drug discovery ,Proteins ,General Chemistry ,Protein superfamily ,Ligand (biochemistry) ,Small molecule ,Computer Science Applications ,Crystallography ,Protein target ,3d coordinates ,Software ,Protein Binding - Abstract
We describe a graphical system for automatically generating multiple 2D diagrams of ligand–protein interactions from 3D coordinates. The diagrams portray the hydrogen-bond interaction patterns and hydrophobic contacts between the ligand(s) and the main-chain or side-chain elements of the protein. The system is able to plot, in the same orientation, related sets of ligand–protein interactions. This facilitates popular research tasks, such as analyzing a series of small molecules binding to the same protein target, a single ligand binding to homologous proteins, or the completely general case where both protein and ligand change.
- Published
- 2011
21. Modelling by homology
- Author
-
Mark B. Swindells and Janet M. Thornton
- Subjects
Structural Biology ,A protein ,Computational biology ,Biology ,Bioinformatics ,Molecular Biology ,Homology (biology) - Abstract
Modelling on the basis of a homologous structure is the only reliable method available to predict the three-dimensional structure of a protein from its sequence. The past year has seen considerable advances in both the development of automated procedures and their application to proteins of outstanding biological interest.
- Published
- 1991
- Full Text
- View/download PDF
22. Identification of a common fold in the replication terminator protein suggests a possible mode for DNA binding
- Author
-
Mark B. Swindells
- Subjects
DNA Replication ,Models, Molecular ,Genetics ,Sequence Homology, Amino Acid ,Molecular Sequence Data ,Serine Endopeptidases ,Nuclear Proteins ,Forkhead Transcription Factors ,Biology ,Biochemistry ,Protein Structure, Secondary ,Cell biology ,DNA-Binding Proteins ,Histones ,chemistry.chemical_compound ,Terminator (genetics) ,Bacterial Proteins ,chemistry ,Amino Acid Sequence ,Molecular Biology ,DNA ,Bacillus subtilis ,Protein Binding ,Transcription Factors - Published
- 1995
- Full Text
- View/download PDF
23. Vector Geometry Mapping: A Method to Characterize the Conformation of Helix-Loop-Helix Calcium-Binding Proteins
- Author
-
Kyoko L. Yap, Mitsuhiko Ikura, Mark B. Swindells, and James B. Ames
- Subjects
Basic helix-loop-helix ,Chemistry ,Calcium-binding protein ,Biophysics ,Molecular biology ,Euclidean vector - Published
- 2003
- Full Text
- View/download PDF
24. Finding your fold
- Author
-
Mark B. Swindells
- Subjects
Models, Molecular ,Binding Sites ,Fold (higher-order function) ,Chemistry ,Stereochemistry ,Mineralogy ,Bioengineering ,Biochemistry ,Protein Structure, Secondary ,Protein Structure, Tertiary ,Structure-Activity Relationship ,Sequence Alignment ,Molecular Biology ,Structural unit ,Protein secondary structure ,Biotechnology - Published
- 1994
- Full Text
- View/download PDF
25. Insights into protein function through large-scale computational analysis of sequence and structure
- Author
-
John P. Overington, Mark B. Swindells, and Malcolm Weir
- Subjects
Models, Molecular ,Sequence ,Biological data ,Drug discovery ,Protein Conformation ,Genetic Diseases, Inborn ,Computational Biology ,Proteins ,Genomics ,Bioengineering ,Computational biology ,Biology ,Bioinformatics ,Proteomics ,Biological Evolution ,ComputingMethodologies_PATTERNRECOGNITION ,Protein structure ,Protein sequencing ,Order (biology) ,Sequence Analysis, Protein ,Biotechnology - Abstract
Functional genomic and proteomic technologies are producing biological data relating to hundreds, or even thousands of proteins per experiment. Rapid and accurate computational analysis of the molecular function of these proteins is therefore crucial in order to interpret these data and prioritize further experiments.
- Published
- 2002
26. Chapter 21. The role of protein structure prediction in drug discovery
- Author
-
David T. Jones, Richard Joseph Fagan, and Mark B. Swindells
- Subjects
Protein structure ,biology ,Protein family ,Amyloid beta ,Drug discovery ,biology.protein ,Amyloid precursor protein ,Computational biology ,Target protein ,Protein structure prediction ,Threading (protein sequence) ,Bioinformatics - Abstract
Publisher Summary: This chapter discusses the role of protein structure prediction in drug discovery. The most accurate method for predicting protein structure is to make use of comparative modeling techniques to infer the structure of a target protein based on the structure of a related template protein. The reliability and simplicity of this class of method stem from the fact that it is limited to predicting the structure of proteins that are closely related to the template protein of known structure. Threading methods predict the fold of a protein in the absence of any sequence similarity using a large library of folds as its database. Once a genome-wide view of these protein families is known, prioritization of the sequences for movement into the drug discovery pipeline begins. In order for this to happen, the biological significance of the protein must be identified and disease relevance associated with it. A microarray approach has been used to study the role of cholesterol in the biosynthesis of beta-amyloid peptides. Antibodies to the target, here the beta-amyloid peptide, are covalently linked to a chip in order to make a microarray. The profile of amyloid beta peptide variants secreted into the media of human cultured cells that express the amyloid precursor protein was examined using surface-enhanced laser desorption/ionization (SELDI) ProteinChip technology.
- Published
- 2001
- Full Text
- View/download PDF
27. Intrinsic phi, psi propensities of amino acids, derived from the coil regions of known structures
- Author
-
Mark B. Swindells, Janet M. Thornton, and Malcolm W. MacArthur
- Subjects
Steric effects ,chemistry.chemical_classification ,Models, Molecular ,Chemistry ,Stereochemistry ,Hydrogen Bonding ,Biochemistry ,Protein Structure, Secondary ,Amino acid ,Crystallography ,Structure-Activity Relationship ,Structural Biology ,Electromagnetic coil ,Helix ,Genetics ,Amino Acids ,Protein secondary structure - Abstract
Many different factors contribute to secondary structure propensities, including phi, psi preferences, side-chain interactions, steric effects and hydrophobic tertiary contacts. To deconvolute these competing factors, we have adopted a novel approach which quantifies the intrinsic phi, psi propensities for residues in coil regions (that is, residues not in alpha-helix and not in beta-strand). Comparisons of intrinsic phi, psi propensities with their equivalent secondary structure propensities show that while correlations for helix are relatively weak, those for strand are much stronger. This paper describes our new phi, psi propensities and provides an explanation for the variations observed.
- Published
- 1995
28. Protein folds: towards understanding folding from inspection of native structures
- Author
-
Malcolm W. MacArthur, Mark B. Swindells, David T. Jones, Christine M. Orengo, and Janet M. Thornton
- Subjects
Steric effects ,Protein Folding ,Chemistry ,Protein domain ,Models, Theoretical ,Bioinformatics ,General Biochemistry, Genetics and Molecular Biology ,Protein Structure, Secondary ,Folding (chemistry) ,Protein structure ,Helix ,Biophysics ,Protein folding ,Computer Simulation ,Protein topology ,General Agricultural and Biological Sciences ,Protein secondary structure - Abstract
Following a short summary of some of the principal features of folded proteins, the results of two complementary studies of protein structure are presented, the first concerned with the factors which influence secondary structure propensity and the second an analysis of protein topology. In an attempt to deconvolute the physical contributions to secondary structure propensities, we have calculated intrinsic ɸ,ψ propensities, derived from the coil regions of proteins. Comparison of intrinsic ɸ,ψ propensities with their equivalent secondary structure values show correlations for both helix and strand. This suggests that the local dipeptide, steric and electrostatic interactions have a major influence on secondary structure propensity. We then proceed to inspect the distribution of protein domain folds observed to date. Several folds occur very commonly, so that 46% of the current non-homologous database comprises only nine folds. The implications of these results for protein folding are discussed.
- Published
- 1995
29. Classification of doubly wound nucleotide binding topologies using automated loop searches
- Author
-
Mark B. Swindells
- Subjects
Models, Molecular ,Flavodoxin ,Molecular Sequence Data ,Tryptophan synthase ,Computational biology ,Biology ,Biochemistry ,Malate Dehydrogenase ,Tryptophan Synthase ,Coenzyme binding ,Amino Acid Sequence ,Binding site ,Molecular Biology ,Peptide sequence ,Binding Sites ,L-Lactate Dehydrogenase ,Sequence Homology, Amino Acid ,Nucleotides ,Binding protein ,Proteins ,computer.file_format ,Protein Data Bank ,Protein Structure, Tertiary ,Loop (topology) ,Ferredoxin-NADP Reductase ,biology.protein ,computer ,Research Article - Abstract
A classification is presented of doubly wound alpha/beta nucleotide binding topologies, whose binding sites are located in the cleft formed by a topological switch point. In particular, the switch point loop nearest the N-terminus is used to identify specific structural classes of binding protein. This yields seven structurally distinct loop conformations, which are subsequently used as motifs for scanning the Protein Data Bank. The searches, which are effective at identifying functional relationships within a large database of structures, reveal a remarkable and previously unnoticed similarity between the coenzyme binding sites of flavodoxin and tryptophan synthetase, even though there is no sequence or topological similarity between them.
- Published
- 1993
30. A study of structural determinants in the interleukin-1 fold
- Author
-
Janet M. Thornton and Mark B. Swindells
- Subjects
Models, Molecular ,Protein Folding ,Chemical Phenomena ,Stereochemistry ,Protein Conformation ,Molecular Sequence Data ,Bioengineering ,Sequence alignment ,Biology ,Biochemistry ,Homology (biology) ,Erythrina trypsin inhibitor ,medicine ,Amino Acid Sequence ,Molecular Biology ,Conformational isomerism ,Conserved Sequence ,Sequence Homology, Amino Acid ,Chemistry, Physical ,Hydrogen Bonding ,Trypsin ,Interleukin 1β ,Fibroblast Growth Factor 2 ,Trypsin Inhibitors ,Biotechnology ,medicine.drug ,Interleukin-1 - Abstract
The structures of interleukin-1 beta, basic fibroblast growth factor and Erythrina trypsin inhibitor have been analysed in order to determine whether the hydrophobic core remains conserved, even when the structures have extremely low sequence similarities. We find that there are significant differences in the way each protein achieves a satisfactory arrangement of core residues and that positions which contribute to the core of one structure are not guaranteed to contribute to the integrity of another. Furthermore, the side-chain packing arrangements of these core residues vary significantly between the three structures. During this analysis the side-chain rotamers for three independently determined interleukin-1 beta structures were also compared. It was found that although buried residues are generally in agreement the remaining residues frequently occupy different rotamers in the three structures. This suggests that although meaningful studies are possible for buried side-chains the results obtained from equivalent analyses of accessible residues should be treated with caution. These results are discussed with specific reference to the optimization of side-chain packing in proteins of known structure.
- Published
- 1993
31. Prediction of a novel topology in the N-terminal, 14 kDa fragment of Ada protein
- Author
-
Mark B. Swindells
- Subjects
Models, Molecular ,Magnetic Resonance Spectroscopy ,Stereochemistry ,Molecular Sequence Data ,Biophysics ,Topology ,Biochemistry ,Protein Structure, Secondary ,O(6)-Methylguanine-DNA Methyltransferase ,Protein structure ,Chain (algebraic topology) ,Fragment (logic) ,Bacterial Proteins ,Structural Biology ,Genetics ,Ada protein ,Amino Acid Sequence ,Molecular Biology ,Protein secondary structure ,Topology (chemistry) ,Chemistry ,Escherichia coli Proteins ,Cell Biology ,Nuclear magnetic resonance spectroscopy ,Peptide Fragments ,Folding (chemistry) ,Molecular Weight ,Order (biology) ,Prediction ,Analysis ,Transcription Factors - Abstract
Previously determined protein structures have been analysed, in order to find folding motifs similar to that proposed by NMR spectroscopy, for the N-terminal, 14 kDa fragment of the Ada protein. The analyses reveal only limited similarities with the NMR-derived structural data and strongly suggest that this region of the Ada protein adopts a previously unobserved topology. Characteristic structural features, which arise from the inferred chain connectivity, are examined through comparisons with other structures. Using this information, the topology of the Ada protein 14 kDa fragment has been predicted in order to provide structural data not yet attainable from NMR experiments.
- Published
- 1993
32. Nicastrin, a presenilin-interacting protein, contains an aminopeptidase/transferrin receptor superfamily domain
- Author
-
Malcolm Weir, John P. Overington, Richard Joseph Fagan, and Mark B. Swindells
- Subjects
Molecular Sequence Data ,Nicastrin ,Transferrin receptor ,Aminopeptidases ,Biochemistry ,Aminopeptidase ,Presenilin ,Presenilin-2 ,Receptors, Transferrin ,Presenilin-1 ,Humans ,Amino Acid Sequence ,APH-1 ,Molecular Biology ,Membrane Glycoproteins ,Sequence Homology, Amino Acid ,biology ,Chemistry ,Membrane Proteins ,humanities ,Structural biology ,Domain (ring theory) ,biology.protein ,Amyloid Precursor Protein Secretases ,Protein Binding ,Binding domain - Abstract
Nicastrin, a protein implicated in Alzheimer's disease, has a domain that is found in the aminopeptidase/transferrin receptor superfamily. In nicastrin, this domain might possess catalytic activity (as observed with aminopeptidases) or it could serve merely as a binding domain (with analogy to the transferrin receptors) for the β-amyloid precursor protein.
- Published
- 2001
- Full Text
- View/download PDF
33. Structural similarity between transforming growth factor-beta 2 and nerve growth factor
- Author
-
Mark B. Swindells
- Subjects
Multidisciplinary ,Molecular Structure ,Sequence Homology, Amino Acid ,Chemistry ,Structural similarity ,Macromolecular Substances ,Molecular Sequence Data ,Mineralogy ,Protein Structure, Secondary ,Cell biology ,Nerve growth factor ,Transforming Growth Factor beta ,Amino Acid Sequence ,Nerve Growth Factors ,Transforming growth factor - Published
- 1992
34. Structure prediction and modelling
- Author
-
Mark B. Swindells
- Subjects
Models, Molecular ,Protein Conformation ,Biomedical Engineering ,Proteins ,Bioengineering ,General Agricultural and Biological Sciences ,Sequence Alignment ,General Biochemistry, Genetics and Molecular Biology ,Biotechnology - Abstract
Cracking the second fundamental code of molecular biology (how the tertiary structure of a protein is determined by its amino acid sequence) remains an elusive goal. However, the impetus to establish credible approximations, if not a definitive solution to this relationship, has never been greater. In the past year significant progress has been made through a series of novel approaches. This review describes the most important developments and outlines how they can be usefully employed by those whose specialization lies outside the field.
- Published
- 1992
35. Corrigendum to 'Target-induced Conformational Adaptation of Calmodulin Revealed by the Crystal Structure of a Complex with Nematode Ca2+/Calmodulin-dependent Kinase Kinase Peptide' [J. Mol. Biol. (2001) 312, 59–68]
- Author
-
Hiroshi Tokumitsu, Masatsune Kainosho, Hirofumi Kurokawa, Naoko Katayama, Mark B. Swindells, Mitsuhiko Ikura, Hiroyuki Kurihara, and Masanori Osawa
- Subjects
chemistry.chemical_classification ,biology ,Calmodulin ,Kinase ,Peptide ,Crystal structure ,biology.organism_classification ,Molecular biology ,Cell biology ,Nematode ,Biochemistry ,chemistry ,Structural Biology ,biology.protein ,Adaptation ,Molecular Biology ,Ca2 calmodulin - Published
- 2005
- Full Text
- View/download PDF
36. Loopy similarities
- Author
-
Mark B Swindells
- Subjects
Structural Biology ,Molecular Biology - Published
- 1994
- Full Text
- View/download PDF
37. Coordination of acetate with the di–iron centre of methane monooxygenase
- Author
-
Mark B. Swindells
- Subjects
Models, Molecular ,Binding Sites ,Molecular Structure ,biology ,Chemistry ,Methane monooxygenase ,Iron ,Inorganic chemistry ,Acetates ,Hemerythrin ,Structural Biology ,Oxygenases ,biology.protein ,Organic chemistry ,Molecular Biology ,Acetic Acid - Published
- 1994
- Full Text
- View/download PDF
38. Prediction of progress at last
- Author
-
T.P. Flores, Mark B. Swindells, Janet M. Thornton, and David T. Jones
- Subjects
Multidisciplinary ,Chemistry ,Fold (geology) ,Molecular biology ,Protein secondary structure - Published
- 1991
- Full Text
- View/download PDF
39. Who uses CD-ROMs in the information age?
- Author
-
Mark B. Swindells
- Subjects
Information Services ,Computer Communication Networks ,Information Age ,CD-ROM ,Databases, Factual ,Biology ,Molecular Biology ,Biochemistry ,Demography - Published
- 1997
- Full Text
- View/download PDF
40. Shared structural motif in proteins
- Author
-
Laurence H. Pearl, Janet M. Thornton, Mark B. Swindells, Christine A. Orengo, and David T. Jones
- Subjects
Multidisciplinary ,Chemistry ,Computational biology ,Structural motif - Published
- 1993
- Full Text
- View/download PDF
41. Recurrence of a binding motif?
- Author
-
Laurence H. Pearl, Christine A. Orengo, Janet M. Thornton, Mark B. Swindells, and David T. Jones
- Subjects
Protein Folding ,Multidisciplinary ,Bacterial Proteins ,Chemistry ,Molecular Sequence Data ,Amino Acid Sequence ,Motif (music) ,Computational biology ,Phosphoenolpyruvate Sugar Phosphotransferase System ,Sequence Alignment ,Phosphoric Monoester Hydrolases ,Acid Anhydride Hydrolases - Published
- 1993
- Full Text
- View/download PDF
42. CATH – a hierarchic classification of protein domain structures
- Author
-
Susan Jones, A.D. Michie, David T. Jones, Mark B. Swindells, Janet M. Thornton, and Christine A. Orengo
- Subjects
Protein structure database ,Models, Molecular ,Protein Folding ,Databases, Factual ,Sequence Homology, Amino Acid ,Simple Modular Architecture Research Tool ,Protein domain ,Structural alignment ,Proteins ,Computational biology ,Structural Classification of Proteins database ,Biology ,Protein Structure, Secondary ,Protein Structure, Tertiary ,protein structure classification ,Crystallography ,Protein structure ,Structural Biology ,evolution ,fold families ,Threading (protein sequence) ,Molecular Biology ,Protein Structure Initiative - Abstract
Background: Protein evolution gives rise to families of structurally related proteins, within which sequence identities can be extremely low. As a result, structure-based classifications can be effective at identifying unanticipated relationships in known structures and in optimal cases function can also be assigned. The ever increasing number of known protein structures is too large to classify all proteins manually, therefore, automatic methods are needed for fast evaluation of protein structures. Results: We present a semi-automatic procedure for deriving a novel hierarchical classification of protein domain structures (CATH). The four main levels of our classification are protein class (C), architecture (A), topology (T) and homologous superfamily (H). Class is the simplest level, and it essentially describes the secondary structure composition of each domain. In contrast, architecture summarises the shape revealed by the orientations of the secondary structure units, such as barrels and sandwiches. At the topology level, sequential connectivity is considered, such that members of the same architecture might have quite different topologies. When structures belonging to the same T-level have suitably high similarities combined with similar functions, the proteins are assumed to be evolutionarily related and put into the same homologous superfamily. Conclusions: Analysis of the structural families generated by CATH reveals the prominent features of protein structure space. We find that nearly a third of the homologous superfamilies (H-levels) belong to ten major T-levels, which we call superfolds, and furthermore that nearly two-thirds of these H-levels cluster into nine simple architectures. A database of well-characterised protein structure families, such as CATH, will facilitate the assignment of structure–function/evolution relationships to both known and newly determined protein structures.
- Full Text
- View/download PDF
43. Solution structure of the IRF-2 DNA-binding domain: a novel subgroup of the winged helix–turn–helix family
- Author
-
Hisashi Harada, Mark B. Swindells, Toshio Yamazaki, Tadatsugu Taniguchi, Junichi Furui, Yoshimasa Kyogoku, Koichi Uegaki, and Masahiro Shirakawa
- Subjects
Subfamily ,Magnetic Resonance Spectroscopy ,Molecular Sequence Data ,Beta sheet ,Random hexamer ,Biology ,Antiparallel (biochemistry) ,Protein Structure, Secondary ,Evolution, Molecular ,Kluyveromyces ,Mice ,Bacterial Proteins ,Structural Biology ,Transcription (biology) ,Animals ,Amino Acid Sequence ,DNA-binding domain ,Molecular Biology ,Gene ,Helix-Turn-Helix Motifs ,Genetics ,winged helix–turn–helix (wHTH) ,DNA ,Interferon-beta ,Recombinant Proteins ,NMR ,DNA-Binding Proteins ,Repressor Proteins ,interferon regulatory factor-2 (IRF-2) ,Nucleic Acid Conformation ,sequential structure alignment program (SSAP) ,Sequence Alignment ,Interferon Regulatory Factor-2 ,Interferon regulatory factors ,Transcription Factors - Abstract
Background: The transcription of interferon (IFN) and IFN-inducible genes is mainly regulated by the interferon regulatory factor (IRF) family of proteins, which recognize a unique AAGTGA hexamer repeat motif in the regulatory region of IFN genes. A DNA-binding domain of approximately 100 amino acids has been commonly found in the IRF family of proteins, but it has no sequence homology to known DNA-binding motifs. Elucidation of the structures of members of the IRF family is therefore useful to the understanding of the regulation and evolution of the immune system at the structural level. Results: The solution structure of the DNA-binding domain of interferon regulatory factor-2 (IRF-2) has been determined by NMR spectroscopy. It is composed of a four-stranded antiparallel β sheet and three α helices, and its global fold is similar to those of the winged helix–turn–helix (wHTH) family of proteins. A long loop (Pro37–Asp51) is found immediately before the HTH motif, which is not found in other wHTH proteins. The NMR signals of residues in this long loop, as well as the second helix of the HTH motif, are strongly affected upon the addition of the hexamer repeat DNA, suggesting that these structural elements participate in DNA recognition and binding. Conclusions: The structural similarity of the DNA-binding domain of IRF-2 with those of proteins in the wHTH family shows that the IRF proteins belong to the wHTH family, even though there is no apparent sequence homology among proteins of the two families. The sequential structure alignment program (SSAP) shows that IRF-2 has a slightly different structure from typical wHTH proteins, mainly in the orientation of helix 2. The IRF family of proteins should therefore be categorized into a subfamily of the wHTH family. The evidence here implies that the evolutional pathway of the IRF family is distinct from that of the other wHTH proteins, in other words, the immune system diverged from an evolutional stem at an early stage.
- Full Text
- View/download PDF
44. GENIUS II: a high-throughput database system for linking ORFs in complete genomes to known protein three-dimensional structures.
- Author
-
Yukimitsu Yabuki, Yuri Mukai, Mark B. Swindells, and Makiko Suwa
- Published
- 2004
- Full Text
- View/download PDF
45. Genome analysis: Assigning protein coding regions to three-dimensional structures
- Author
-
Asaf Salamov, Mark B. Swindells, Christine A. Orengo, and Makiko Suwa
- Subjects
Genetics ,Fold (higher-order function) ,Databases, Factual ,Protein Conformation ,Sequence alignment ,Computational biology ,Sequence Analysis, DNA ,Biology ,Biochemistry ,Genome ,Sensitivity and Specificity ,Protein Structure, Tertiary ,Protein structure ,Database search engine ,Computer Simulation ,ORFS ,Molecular Biology ,Gene ,Sequence Alignment ,Algorithms ,Sequence (medicine) ,Research Article - Abstract
We describe the results of a procedure for maximizing the number of sequences that can be reliably linked to a protein of known three-dimensional structure. Unlike other methods, which try to increase sensitivity through the use of fold recognition software, we only use conventional sequence alignment tools, but apply them in a manner that significantly increases the number of relationships detected. We analyzed 11 genomes and found that, depending on the genome, between 23 and 32% of the ORFs had significant matches to proteins of known structure. In all cases, the aligned region consisted of either >100 residues or >50% of the smaller sequence. Slightly higher percentages could be attained if smaller motifs were also included. This is significantly higher than most previously reported methods, even those that have a fold-recognition component. We survey the biochemical and structural characteristics of the most frequently occurring proteins, and discuss the extent to which alignment methods can realistically assign function to gene products.