1. Expansion and re-classification of the extracytoplasmic function (ECF) σ factor family
- Author
- Carol A. Gross, Georg Fritz, Karina Brinkrolf, Alexander Goesmann, Raphael Rene Müller, Anke Becker, Thorsten Mascher, Mark J. Buttner, Sebastian Jaenicke, and Delia Casas-Pastor
- Subjects
Protein family, AcademicSubjects/SCI00010, Sigma Factor, Sequence alignment, Computational biology, Bacterial genome size, Biology, Genome, Substrate Specificity, 03 medical and health sciences, Bacterial Proteins, Phylogenetics, Terminology as Topic, Consensus Sequence, Genetics, Amino Acid Sequence, Gene, Phylogeny, 030304 developmental biology, 0303 health sciences, Phylogenetic tree, 030306 microbiology, DNA-Directed RNA Polymerases, Gene Expression Regulation, Bacterial, Genomics, Multigene Family, Signal Transduction
- Abstract
Extracytoplasmic function σ factors (ECFs) represent one of the major bacterial signal transduction mechanisms in terms of abundance, diversity and importance, particularly in mediating stress responses. Here, we performed a comprehensive phylogenetic analysis of this protein family by scrutinizing all proteins in the NCBI database. As a result, we identified an average of ∼10 ECFs per bacterial genome and 157 phylogenetic ECF groups that feature a conserved genetic neighborhood and a similar regulation mechanism. Our analysis expands previous classification efforts ∼50-fold, enriches many original ECF groups with previously unclassified proteins and identifies 22 entirely new ECF groups. The ECF groups are hierarchically related to each other and are further composed of subgroups with closely related sequences. This two-tiered classification allows for the accurate prediction of common promoter motifs and the inference of putative regulatory mechanisms across subgroups composing an ECF group. This comprehensive, high-resolution description of the phylogenetic distribution of the ECF family, together with the massive expansion of classified ECF sequences and an openly accessible data repository called ‘ECF Hub’ (https://www.computational.bio.uni-giessen.de/ecfhub), will serve as a powerful hypothesis-generator to guide future research in the field.
- Published
- 2021