Back to Search Start Over

A comprehensive whole genome database of ethnic minority populations

Authors :
Yan He
Changgui Lei
Chanjuan Wan
Shuang Zeng
Ting Zhang
Fei Luo
Ruichao Li
Xiaokun Li
Anshu Zhao
Defu Xiao
Yunyan Luo
Keren Shan
Xiaolan Qi
Xin Jin
Source :
Scientific Reports, Vol 14, Iss 1, Pp 1-9 (2024)
Publication Year :
2024
Publisher :
Nature Portfolio, 2024.

Abstract

Abstract China, is characterized by its remarkable ethnical diversity, which necessitates whole genome variation data from multiple populations as crucial tools for advancing population genetics and precision medical research. However, there has been a scarcity of research concentrating on the whole genome of ethnic minority groups. To fill this gap, we developed the Guizhou Multi-ethnic Genome Database (GMGD). It comprises whole genome sequencing data from 476 healthy unrelated individuals spanning 11 ethnic minorities groups in Guizhou Province, Southwest China, including Bouyei, Dong, Miao, Yi, Bai, Gelo, Zhuang, Tujia, Yao, Hui, and Sui. The GMGD database comprises more than 16.33 million variants in GRCh38 and 16.20 million variants in GRCh37. Among these, approximately 11.9% (1,956,322) of the variants in GRCh38 and 18.5% (3,009,431) of the variants in GRCh37 are entirely new and do not exist in the dbSNP database. These novel variants shed light on the genetic diversity landscape across these populations, providing valuable insights with an average coverage of 5.5 ×. This makes GMGD the largest genome-wide database encompassing the most diverse ethnic groups to date. The GMGD interactive interface facilitates researchers with multi-dimensional mutation search methods and displays population frequency differences among global populations. Furthermore, GMGD is equipped with a genotype-imputation function, enabling enhanced capabilities for low-depth genomic research or targeted region capture studies. GMGD offers unique insights into the genomic variation landscape of different ethnic groups, which are freely accessible at https://db.cngb.org/pop/gmgd/ .

Details

Language :
English
ISSN :
20452322
Volume :
14
Issue :
1
Database :
Directory of Open Access Journals
Journal :
Scientific Reports
Publication Type :
Academic Journal
Accession number :
edsdoj.202551a896374560997b496f94b5d9ca
Document Type :
article
Full Text :
https://doi.org/10.1038/s41598-024-63892-1