4 results on '"Mariane Gonçalves Kulik"'
Search Results
2. Low Complexity Induces Structure in Protein Regions Predicted as Intrinsically Disordered
- Author
- Mariane Gonçalves-Kulik, Pablo Mier, Kristina Kastano, Juan Cortés, Pau Bernadó, Friederike Schmid, and Miguel A. Andrade-Navarro
- Subjects
intrinsically disordered regions; low complexity regions; protein structure; homorepeats; Microbiology; QR1-502
- Abstract
There is increasing evidence that many intrinsically disordered regions (IDRs) in proteins play key functional roles through interactions with other proteins or nucleic acids. These interactions often exhibit a context-dependent structural behavior. We hypothesize that low complexity regions (LCRs), often found within IDRs, could have a role in inducing local structure in IDRs. To test this, we predicted IDRs in the human proteome and analyzed their structures or those of homologous sequences in the Protein Data Bank (PDB). We then identified two types of simple LCRs within IDRs: regions with only one (polyX or homorepeats) or with only two types of amino acids (polyXY). We were able to assign structural information from the PDB more often to these LCRs than to the surrounding IDRs (polyX 61.8% > polyXY 50.5% > IDRs 39.7%). The most frequently observed polyX and polyXY within IDRs contained E (Glu) or G (Gly). Structural analyses of these sequences and of homologs indicate that polyEK regions induce helical conformations, while the other most frequent LCRs induce coil structures. Our work proposes bioinformatics methods to help in the study of the structural behavior of IDRs and provides a solid basis suggesting a structuring role of LCRs within them.
- Published
- 2022
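The two LCR types named in the abstract above (polyX: one residue type; polyXY: two residue types) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the minimum lengths, and the rule of requiring a pure, uninterrupted region are all assumptions made for illustration.

```python
import re


def find_polyx(seq, min_len=4):
    """Return (start, end, residue) for runs of a single residue type.

    min_len is an illustrative threshold, not a value from the paper.
    """
    pattern = r"(.)\1{%d,}" % (min_len - 1)
    return [(m.start(), m.end(), m.group(1)) for m in re.finditer(pattern, seq)]


def find_polyxy(seq, min_len=6):
    """Return (start, end, residues) for regions made of exactly two residue types.

    Each start position is extended as far as possible; a region is reported
    only if it reaches past the previously reported one, so nested windows
    are not repeated. min_len is again an illustrative threshold.
    """
    regions = []
    last_end = 0
    for start in range(len(seq)):
        kinds = set()
        end = start
        # Grow the window while it holds at most two distinct residue types.
        while end < len(seq) and len(kinds | {seq[end]}) <= 2:
            kinds.add(seq[end])
            end += 1
        if len(kinds) == 2 and end - start >= min_len and end > last_end:
            regions.append((start, end, "".join(sorted(kinds))))
            last_end = end
    return regions
```

Coordinates are 0-based with an exclusive end, so `seq[start:end]` returns the matched region.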
3. RepeatsDB in 2021: improved data and extended classification for protein tandem repeat structures
- Author
- Pablo Lorenzano Menna, Martina Bevilacqua, Mariane Gonçalves Kulik, Alexander Miguel Monzon, Lisanna Paladin, José Luis López, Martin Gonzalez Buitron, Javier Rios, Marco Necci, Sara Errigo, Layla Hirsh, Ivan Mičetić, Juliet F. Nilsson, Andrey V. Kajava, María Silvina Fornasari, Antonio Lagares, Damiano Piovesan, Sebastian Fernandez-Alberti, Maia Diana Eliana Cabrera, Gustavo Parisi, María Laura Fabre, Miguel A. Andrade-Navarro, Silvio C. E. Tosatto, Centre de recherche en Biologie Cellulaire (CRBM), and Université Montpellier 2 - Sciences et Techniques (UM2)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Université Montpellier 1 (UM1)
- Subjects
Repetitive Sequences, Amino Acid; AcademicSubjects/SCI00010; Biología; Statistics as Topic; Protein Data Bank (RCSB PDB); Computational biology; Biology; Repetitive Sequences; Gene Ontology; HEK293 Cells; HeLa Cells; Humans; Proteins; Reproducibility of Results; User-Computer Interface; Databases, Protein; Tandem Repeat Sequences; Databases; 03 medical and health sciences; Annotation; Protein structure; Similarity (network science); Tandem repeat; Genetics; Database Issue; Ciencias Exactas; database; 030304 developmental biology; 0303 health sciences; Hierarchy (mathematics); Protein; 030302 biochemistry & molecular biology; computer.file_format; Protein Data Bank; Class (biology); proteins; Amino Acid; ComputingMethodologies_PATTERNRECOGNITION; classification; protein tandem repeat structures; [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]; computer
- Abstract
The RepeatsDB database (URL: https://repeatsdb.org/) provides annotations and classification for protein tandem repeat structures from the Protein Data Bank (PDB). Protein tandem repeats are ubiquitous in all branches of the tree of life. The accumulation of solved repeat structures provides new possibilities for classification and detection, but also increases the need for annotation. Here we present RepeatsDB 3.0, which addresses these challenges and presents an extended classification scheme. The major conceptual change compared to the previous version is the hierarchical classification combining top levels based solely on structural similarity (Class > Topology > Fold) with two new levels (Clan > Family) requiring sequence similarity and describing repeat motifs in collaboration with Pfam. Data growth has been addressed with improved mechanisms for browsing the classification hierarchy. A new UniProt-centric view unifies the increasingly frequent annotation of structures from identical or similar sequences. This update of RepeatsDB aligns with our commitment to develop a resource that extracts, organizes and distributes specialized information on tandem repeat protein structures.
Facultad de Ciencias Exactas, Instituto de Biotecnologia y Biologia Molecular
- Published
- 2020
4. SWeeP: representing large biological sequences datasets in compact vectors
- Author
- Fábio O. Pedrosa, Dieval Guizelini, Ricardo Voyceik, Mariane Gonçalves Kulik, Antonio Camilo da Silva Filho, J. Miguel Ortega, Bruno Thiago de Lima Nichio, Aryel Marlus Repula de Oliveira, Letícia Graziela Costa Santos de Mattos, Roberto Tadeu Raittz, Camilla Reginatto De Pierri, Jeroniza Nunes Marchaukoski, and Josué Oliveira Camargo
- Subjects
Proteome; Computer science; Datasets as Topic; lcsh:Medicine; Article; Mitochondrial Proteins; Bacterial Proteins; Humans; Computational models; Orthonormal basis; Representation (mathematics); lcsh:Science; Data mining; Phylogeny; Sequence; Multidisciplinary; Phylogenetic tree; lcsh:R; Computational Biology; Base (topology); Mitochondria; Projection (relational algebra); lcsh:Q; Sequence Alignment; Algorithm; Algorithms; Software
- Abstract
Vectoral and alignment-free approaches to biological sequence representation have been explored in bioinformatics to efficiently handle big data. Even so, most current methods involve sequence comparisons via alignment-based heuristics and fail when applied to the analysis of large data sets. Here, we present “Spaced Words Projection (SWeeP)”, a method for representing biological sequences using relatively small vectors while preserving intersequence comparability. SWeeP uses spaced words: it scans the sequences and generates indices to create a higher-dimensional vector that is later projected onto a smaller, randomly oriented orthonormal base. We constructed phylogenetic trees for all organisms with mitochondrial and bacterial protein data in the NCBI database. SWeeP quickly built complete and accurate trees for these organisms with low computational cost. We compared SWeeP to other alignment-free methods, and SWeeP was 10 to 100 times quicker than the other techniques. A tool to build SWeeP vectors is available at https://sourceforge.net/projects/spacedwordsprojection/.
- Published
- 2020