1. Direct mapping of symbolic DNA sequence into frequency domain and identification of higher order repeats
- Author
Glunčić, M., Paar, V., Basar, I., Vlahović, I., Rosandić, M., Dekanić, K., Citković, M., Jelovina, D., Paar, P., Kelić, A., Batista, J., and Vladimir Paar
- Subjects
Human genome, primates genome, Higher order repeat (HOR), Tandem repeat, Alpha satellite, Repeat units, Global Repeat Map, Evolution genetics, gene regulatory network
- Abstract
We have introduced and developed a new repeat-finding algorithm, Global Repeat Map (GRM). For the first time, this method enables direct mapping of a symbolic DNA sequence into the frequency domain. The mapping is very robust and identifies repeats and higher order repeats with very long and substantially mutated repeat copies that were not found by previously known methods. It is well known that higher primates share nearly 98% of human DNA in genes, but given the substantial number of neutral mutations, only a small subset of the observed gene differences is likely to be responsible for the key phenotypic changes. It has been proposed that phenotypic differences are mainly caused by regulatory changes in gene expression. To contribute to insight into the mechanism of gene expression, we study and compare tandem repeats, higher order repeats (HORs), and regularly dispersed repeats in the genomes of human and other primates, using the robust GRM algorithm. We show that substantial human-primate differences are concentrated in large repeat structures and occur on two different levels, ranging from nucleotide substitutions to large-scale structural alterations of the genome. In chromosome Y, we found that human-chimpanzee nucleotide substitution divergence within large repeat structures is as high as ~70%. Averaged over the whole sequenced assembly of chromosome Y, this gives ~14% human-chimpanzee divergence, a significantly higher estimate than previous ones. At the level of structural differences, we found in the human genome rapid evolution of structural higher-periodicity organization, referred to as Human Accelerated HOR Regions (HAHORs). We hypothesize on the possible importance of the human accelerated HOR pattern as a component of the multi-layered regulatory network of gene expression.
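The abstract does not describe how GRM itself performs the mapping, so the following is only a generic illustration of what moving a symbolic DNA sequence into the frequency domain can look like. It is a minimal sketch, assuming the common textbook approach of four binary indicator sequences (one per base) and a plain discrete Fourier transform. It is not the GRM algorithm, and the function names `indicator_power_spectrum` and `dominant_period` are hypothetical.

```python
import cmath


def indicator_power_spectrum(seq):
    """Summed DFT power of the four base-indicator sequences.

    Assumption: this is a generic indicator-sequence spectrum,
    not the GRM mapping described in the abstract.
    Returns a list indexed by frequency k = 0 .. n//2 (index 0 is left at 0).
    """
    n = len(seq)
    power = [0.0] * (n // 2 + 1)
    for base in "ACGT":
        # Binary indicator: 1 where the base occurs, 0 elsewhere.
        x = [1.0 if c == base else 0.0 for c in seq]
        for k in range(1, n // 2 + 1):
            s = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
            power[k] += abs(s) ** 2
    return power


def dominant_period(seq):
    """Period (in bases) of the strongest nonzero frequency component."""
    power = indicator_power_spectrum(seq)
    k = max(range(1, len(power)), key=power.__getitem__)
    return len(seq) / k


# A toy tandem repeat: six copies of the 4-base unit "AACC".
tandem = "AACC" * 6
print(dominant_period(tandem))  # 4.0

# One substituted base (A -> G at index 5) leaves the peak in place.
mutated = tandem[:5] + "G" + tandem[6:]
print(dominant_period(mutated))  # 4.0
```

In this toy case the spectral peak survives a point substitution because the mutation spreads a small amount of power over all frequencies, while the repeat concentrates its power at one. The direct sum used here costs O(n²) per base, which is fine for a short illustration but would need a fast transform for chromosome-scale sequences.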
- Published
- 2013