Recovery of non-reference sequences missing from the human reference genome

Authors :
Ran Li
Xiaomeng Tian
Peng Yang
Yingzhi Fan
Ming Li
Hongxiang Zheng
Xihong Wang
Yu Jiang
Source :
BMC Genomics, Vol 20, Iss 1, Pp 1-11 (2019)
Publication Year :
2019
Publisher :
BMC, 2019.

Abstract

Background: Non-reference sequences (NRS) represent structural variations in the human genome with potential functional significance. However, beyond the known insertions, it is currently unknown whether other types of structural variation involving NRS exist.

Results: Here, we compared 31 human de novo assemblies with the current reference genome to identify NRS and their locations. We resolved the precise location of 6113 NRS, adding up to 12.8 Mb. Besides 1571 insertions, we detected 3041 alternate alleles, defined as having less than 90% (or no) identity with the reference alleles. These alternate alleles overlapped with 1143 protein-coding genes, including a putative novel MHC haplotype. Further, we demonstrated that the alternate alleles and their flanking regions had a high content of tandem repeats, indicating that their origin was associated with tandem repeats.

Conclusions: Our study detected a large number of NRS, including many previously uncharacterized alternate alleles. We suggest that the origin of alternate alleles is associated with tandem repeats. Our results enrich the spectrum of genetic variation in the human genome.

Details

Language :
English
ISSN :
1471-2164
Volume :
20
Issue :
1
Database :
Directory of Open Access Journals
Journal :
BMC Genomics
Publication Type :
Academic Journal
Accession number :
edsdoj.3b6faebd6cb64503a3d060aaea865e66
Document Type :
article
Full Text :
https://doi.org/10.1186/s12864-019-6107-1