51. Disentangling the complexity of low complexity proteins
- Author
Mier, Pablo, Bernadó, Pau, Gáspári, Zoltán, Ouzounis, Christos A., Promponas, Vasilis J. [0000-0003-3352-4831], Kajava, Andrey V., Hancock, John M., Tosatto, Silvio C. E., Dosztanyi, Zsuzsanna, Andrade-Navarro, Miguel A., Paladin, Lisanna, Tamana, Stella, Petrosian, Sophia, Hajdu-Soltész, Borbála, Urbanek, Annika, Gruca, Aleksandra, Plewczynski, Dariusz, Grynberg, Marcin, University Medical Center of the Johannes Gutenberg-University Mainz, Azienda Ospedaliera di Padova, University of Cyprus [Nicosia], Eötvös Loránd University (ELTE), Centre de Biochimie Structurale [Montpellier] (CBS), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Institut National de la Santé et de la Recherche Médicale (INSERM), Silesian University of Technology, Warsaw University of Technology [Warsaw], Pázmány Péter Catholic University, Centre de recherche en Biologie Cellulaire (CRBM), Université Montpellier 2 - Sciences et Techniques (UM2)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Université Montpellier 1 (UM1), National Research University of Information Technologies, Mechanics and Optics [St. Petersburg] (ITMO), Earlham Institute [Norwich], and Department of Chemical Sciences University of Padova and Institute on Membrane Technology, Unit of Padova, via F. Marzolo 1, Padova, 35131
- Subjects
Protein Conformation, Computer science, Review Article, Computational biology, Measure (mathematics), Evolution, Molecular, Low complexity, 03 medical and health sciences, Protein Domains, Amino Acid Sequence, structure, [SDV.BBM.BC]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Biochemistry [q-bio.BM], Databases, Protein, Molecular Biology, 030304 developmental biology, Structure (mathematical logic), 0303 health sciences, Sequence, [SCCO.NEUR]Cognitive science/Neuroscience, composition bias, 030302 biochemistry & molecular biology, Proteins, disorder, low complexity regions, Structure and function, [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM], Algorithms, Information Systems
- Abstract
There are multiple definitions for low complexity regions (LCRs) in protein sequences, with all of them broadly considering LCRs as regions with fewer amino acid types compared to an average composition. Following this view, LCRs can also be defined as regions showing composition bias. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, and more generally the overlaps between different properties related to LCRs, using examples. We argue that statistical measures alone cannot capture all structural aspects of LCRs and recommend the combined usage of a variety of predictive tools and measurements. While the methodologies available to study LCRs are already very advanced, we foresee that a more comprehensive annotation of sequences in the databases will enable the improvement of predictions and a better understanding of the evolution and the connection between structure and function of LCRs. This will require the use of standards for the generation and exchange of data describing all aspects of LCRs.
Short abstract: There are multiple definitions for low complexity regions (LCRs) in protein sequences. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, plus overlaps between different properties related to LCRs, using examples.
- Published
- 2019