1. Phylogenetic analysis of Harmonin homology domains
- Author
Colcombet-Cazenave, Baptiste, Druart, Karen, Bonnet, Crystel, Petit, Christine, Spérandio, Olivier, Guglielmini, Julien, Wolff, Nicolas, Récepteurs Canaux - Channel Receptors, Centre National de la Recherche Scientifique (CNRS)-Institut Pasteur [Paris], Collège doctoral [Sorbonne universités], Sorbonne Université (SU), Bioinformatique structurale - Structural Bioinformatics, Institut Pasteur [Paris]-Centre National de la Recherche Scientifique (CNRS), Génétique et Physiologie de l'Audition, Institut Pasteur [Paris]-Institut National de la Santé et de la Recherche Médicale (INSERM)-Sorbonne Université (SU), Institut de l'Audition [Paris] (IDA), Institut Pasteur [Paris]-Institut National de la Santé et de la Recherche Médicale (INSERM), Collège de France (CdF (institution)), Hub Bioinformatique et Biostatistique - Bioinformatics and Biostatistics HUB, This work was supported by the Ministère de l’Enseignement Supérieur et de la Recherche (Grant No. 3178/2018 to B.C.C.), Institut Pasteur [Paris] (IP)-Centre National de la Recherche Scientifique (CNRS), Collège Doctoral, Institut Pasteur [Paris] (IP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Sorbonne Université (SU), Institut Pasteur [Paris] (IP)-Institut National de la Santé et de la Recherche Médicale (INSERM), and Gestionnaire, Hal Sorbonne Université
- Subjects
Fetal Growth Retardation, Harmonin homology domains, QH301-705.5, [SDV] Life Sciences [q-bio], Computer applications to medicine. Medical informatics, Sequence analysis, R858-859.7, Membrane Proteins, Dyskeratosis Congenita, Profile HMM, Screening, Humans, Amino Acid Sequence, Biology (General), Usher syndrome, Phylogeny, Research Article
- Abstract
Background: Harmonin Homology Domains (HHDs) are recently identified orphan domains of about 70 residues that fold into a compact five-alpha-helix bundle. They have proved functionally versatile, binding partners directly as well as regulating the affinity and specificity of adjacent domains for their own targets. Given their small size and rather simple fold, HHDs appear to be convenient modules for regulating protein–protein interactions in various biological contexts. Surprisingly, only nine HHDs have been detected in six proteins, mainly expressed in sensory neurons.
Results: Here, we built a profile Hidden Markov Model to screen the entire UniProtKB for new HHD-containing proteins. Every hit was manually annotated using a clustering approach, confirming that only a few proteins contain HHDs. We report the phylogenetic coverage of each protein and build a phylogenetic tree to trace the evolution of HHDs. We suggest that HHDs share an ancestor with Paired Amphipathic Helices (PAH) domains, four-helix bundles that partially share their fold and functional properties. We characterized the amino-acid sequences of the various HHDs using pairwise BLASTP scoring coupled with community clustering, and manually assessed sequence features within each individual family. These sequence features were analyzed using reported structures as well as homology models to highlight the structural motifs underlying the HHD fold. We show that functional divergence is carried by subtle sequence differences that automated approaches failed to detect. We reveal the involvement of HHDs in their various binding interfaces using conservation across families and a new protein–protein surface predictor. Finally, we discuss the functional consequences of three identified pathogenic HHD variants involved in Hoyeraal-Hreidarsson syndrome and of three newly reported pathogenic variants identified in patients suffering from Usher syndrome.
Conclusions: We provide the first HHD databases, including sequences and conservation, phylogenetic trees and a list of HHD variants found in the auditory system, which are available to the community. This case study highlights surprising phylogenetic properties found in orphan domains and will assist further studies of HHDs.
Supplementary information: The online version contains supplementary material available at 10.1186/s12859-021-04116-5.
- Published
- 2021