Atypical structural tendencies among low-complexity domains in the Protein Data Bank proteome
- Source :
- PLoS Computational Biology, Vol 16, Iss 1, p e1007487 (2020)
- Publication Year :
- 2020
- Publisher :
- Public Library of Science (PLoS), 2020.
-
Abstract
- A variety of studies have suggested that low-complexity domains (LCDs) tend to be intrinsically disordered and are relatively rare within structured proteins in the Protein Data Bank (PDB). Although LCDs are often treated as a single class, we previously found that LCDs enriched in different amino acids can exhibit substantial differences in protein metabolism and function. Therefore, we wondered whether the structural conformations of LCDs are likewise dependent on which specific amino acids are enriched within each LCD. Here, we directly examined relationships between enrichment of individual amino acids and secondary structure tendencies across the entire PDB proteome. Secondary structure tendencies varied as a function of the identity of the amino acid enriched and its degree of enrichment. Furthermore, divergence in secondary structure profiles often occurred for LCDs enriched in physicochemically similar amino acids (e.g. valine vs. leucine), indicating that LCDs composed of related amino acids can have distinct secondary structure tendencies. Comparison of LCD secondary structure tendencies with numerous pre-existing secondary structure propensity scales resulted in relatively poor correlations for certain types of LCDs, indicating that these scales may not capture secondary structure tendencies as sequence complexity decreases. Collectively, these observations provide a highly resolved view of structural tendencies among LCDs parsed by the nature and magnitude of single amino acid enrichment.
Author summary
- The structures that proteins adopt are directly related to their amino acid sequences. Low-complexity domains (LCDs) in protein sequences are unusual regions made up of only a few different types of amino acids. Although this is the key feature that classifies sequences as LCDs, the physical properties of LCDs will differ based on the types of amino acids that are found in each domain. For example, the sequences “AAAAAAAAAA”, “EEEEEEEEEE”, and “EEKRKEEEKE” will have very different properties, even though they would all be classified as LCDs by traditional methods. In a previous study, we developed a new method to further divide LCDs into categories that more closely reflect the differences in their physical properties. In this study, we apply that approach to examine the structures of LCDs when sorted into different categories based on their amino acids. This allowed us to define relationships between the types of amino acids in the LCDs and their corresponding structures. Since protein structure is closely related to protein function, this has important implications for understanding the basic functions and properties of LCDs in a variety of proteins.
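The point made in the author summary with the three example sequences can be illustrated with a small sketch. This is not the authors' classification method: Shannon entropy is used here only as a generic stand-in for sequence complexity, and the function names `shannon_entropy` and `most_enriched` are hypothetical. The sketch shows that all three sequences score far below the maximum of about 4.32 bits for 20 amino acids, yet differ in which residue is enriched and by how much.

```python
from collections import Counter
from math import log2


def shannon_entropy(seq: str) -> float:
    """Shannon entropy (bits) of the residue composition of seq.

    Used here as a generic complexity measure; an assumption for
    illustration, not the measure described in the paper.
    """
    n = len(seq)
    return sum((c / n) * log2(n / c) for c in Counter(seq).values())


def most_enriched(seq: str) -> tuple[str, float]:
    """Return the most frequent residue and its fraction of the sequence."""
    residue, count = Counter(seq).most_common(1)[0]
    return residue, count / len(seq)


# The three example sequences quoted in the author summary.
for seq in ("AAAAAAAAAA", "EEEEEEEEEE", "EEKRKEEEKE"):
    residue, fraction = most_enriched(seq)
    print(f"{seq}  entropy={shannon_entropy(seq):.2f} bits  "
          f"enriched={residue} ({fraction:.0%})")
```

A complexity score alone cannot separate the first two sequences, since both have zero entropy; recording the identity and fraction of the enriched residue is what distinguishes them.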
- Subjects :
- 0301 basic medicine
Proteomics
Proteome
Proteomes
Protein Data Bank (RCSB PDB)
Protein metabolism
Protein Sequencing
Biochemistry
Protein Structure, Secondary
chemistry.chemical_compound
Database and Informatics Methods
Protein Structure Databases
0302 clinical medicine
Protein sequencing
Protein structure
Macromolecular Structure Analysis
Amino Acids
Biology (General)
Databases, Protein
Protein secondary structure
chemistry.chemical_classification
0303 health sciences
Ecology
Proteomic Databases
computer.file_format
Amino acid
Computational Theory and Mathematics
Modeling and Simulation
Amino Acid Analysis
Structural Proteins
Leucine
Algorithms
Research Article
Protein Structure
QH301-705.5
Protein domain
030303 biophysics
Computational biology
Research and Analysis Methods
Cellular and Molecular Neuroscience
03 medical and health sciences
Protein Domains
Valine
Genetics
Amino Acid Sequence
Molecular Biology Techniques
Sequencing Techniques
Molecular Biology
Ecology, Evolution, Behavior and Systematics
030304 developmental biology
Molecular Biology Assays and Analysis Techniques
Proteins
Biology and Life Sciences
Protein Data Bank
030104 developmental biology
Biological Databases
chemistry
Evolutionary biology
computer
030217 neurology & neurosurgery
Details
- Language :
- English
- ISSN :
- 1553-7358
- Volume :
- 16
- Issue :
- 1
- Database :
- OpenAIRE
- Journal :
- PLoS Computational Biology
- Accession number :
- edsair.doi.dedup.....ac5c639036c20ff84f15701286b43af2