42 results on '"Haiyi Lou"'
Search Results
2. Analysis of five deep-sequenced trio-genomes of the Peninsular Malaysia Orang Asli and North Borneo populations
- Author
-
Lian Deng, Haiyi Lou, Xiaoxi Zhang, Bhooma Thiruvahindrapuram, Dongsheng Lu, Christian R. Marshall, Chang Liu, Bo Xie, Wanxing Xu, Lai-Ping Wong, Chee-Wei Yew, Aghakhanian Farhang, Rick Twee-Hee Ong, Mohammad Zahirul Hoque, Abdul Rahman Thuhairah, Bhak Jong, Maude E. Phipps, Stephen W. Scherer, Yik-Ying Teo, Subbiah Vijay Kumar, Boon-Peng Hoh, and Shuhua Xu
- Subjects
Biotechnology; TP248.13-248.65; Genetics; QH426-470
- Abstract
Background: Recent advances in genomic technologies have facilitated genome-wide investigation of human genetic variations. However, most efforts have focused on the major populations, and trio genomes of indigenous populations from Southeast Asia have been under-investigated. Results: We analyzed the whole-genome deep sequencing data (~30×) of five native trios from Peninsular Malaysia and North Borneo, and characterized the genomic variants, including single nucleotide variants (SNVs), small insertions and deletions (indels) and copy number variants (CNVs). We discovered approximately 6.9 million SNVs, 1.2 million indels, and 9000 CNVs in the 15 samples, of which 2.7% of SNVs, 2.3% of indels and 22% of CNVs were novel, implying insufficient coverage of population diversity in existing databases. We identified a higher proportion of novel variants in the Orang Asli (OA) samples, i.e., the indigenous people from Peninsular Malaysia, than in the North Bornean (NB) samples, likely due to the more complex demographic history and long-time isolation of the OA groups. We used the pedigree information to identify de novo variants and estimated the autosomal mutation rates to be 0.81 × 10⁻⁸ – 1.33 × 10⁻⁸, 1.0 × 10⁻⁹ – 2.9 × 10⁻⁹, and ~0.001 per site per generation for SNVs, indels, and CNVs, respectively. The trio-genomes also allowed for haplotype phasing with high accuracy, which serves as a reference for future genomic studies of OA and NB populations. In addition, high-frequency inherited CNVs specific to OA or NB were identified. One example is a 50-kb duplication in DEFA1B detected only in the Negrito trios, implying plausible effects on host defense against exposure to diverse microbes in the tropical rainforest environment of these hunter-gatherers. The CNVs shared between OA and NB groups were much fewer than those specific to each group. Nevertheless, we identified a 142-kb duplication in AMY1A in all 15 samples, and this gene is associated with a high-starch diet. Moreover, novel insertions shared with archaic hominids were identified in our samples. Conclusion: Our study presents a full catalogue of the genome variants of the native Malaysian populations, which complements the known genome diversity in Southeast Asians. It implies a specific population history of the native inhabitants and demonstrates the necessity of more genome sequencing efforts on the multi-ethnic native groups of Malaysia and Southeast Asia.
- Published
- 2019
3. Differentiated demographic histories and local adaptations between Sherpas and Tibetans
- Author
-
Chao Zhang, Yan Lu, Qidi Feng, Xiaoji Wang, Haiyi Lou, Jiaojiao Liu, Zhilin Ning, Kai Yuan, Yuchen Wang, Ying Zhou, Lian Deng, Lijun Liu, Yajun Yang, Shilin Li, Lifeng Ma, Zhiying Zhang, Li Jin, Bing Su, Longli Kang, and Shuhua Xu
- Subjects
Sherpa; Tibetan; Next-generation sequencing; High-altitude adaptation; Population history; Gene flow; Biology (General); QH301-705.5; Genetics; QH426-470
- Abstract
Background: The genetic relationships reported by recent studies between Sherpas and Tibetans are controversial. To gain insights into the population history and the genetic basis of high-altitude adaptation of the two groups, we analyzed genome-wide data in 111 Sherpas (Tibet and Nepal) and 177 Tibetans (Tibet and Qinghai), together with available data from present-day human populations. Results: Sherpas and Tibetans show considerable genetic differences and can be distinguished as two distinct groups, even though the divergence between them (~3200–11,300 years ago) is much later than that between Han Chinese and either of the two groups (~6200–16,000 years ago). Sub-population structures exist in both Sherpas and Tibetans, corresponding to geographical or linguistic groups. Differentiation of genetic variants between Sherpas and Tibetans associated with adaptation to either high-altitude or ultraviolet radiation were identified and validated by genotyping additional Sherpa and Tibetan samples. Conclusions: Our analyses indicate that both Sherpas and Tibetans are admixed populations, but the findings do not support the previous hypothesis that Tibetans derive their ancestry from Sherpas and Han Chinese. Compared to Tibetans, Sherpas show higher levels of South Asian ancestry, while Tibetans show higher levels of East Asian and Central Asian/Siberian ancestry. We propose a new model to elucidate the differentiated demographic histories and local adaptations of Sherpas and Tibetans.
- Published
- 2017
4. A map of copy number variations in Chinese populations.
- Author
-
Haiyi Lou, Shilin Li, Yajun Yang, Longli Kang, Xin Zhang, Wenfei Jin, Bailin Wu, Li Jin, and Shuhua Xu
- Subjects
Medicine; Science
- Abstract
It has been shown that the human genome contains extensive copy number variations (CNVs). Investigating the medical and evolutionary impacts of CNVs requires knowledge of their locations, sizes and frequency distributions within and between populations. However, CNV study of Chinese minorities, which harbor the majority of genetic diversity of Chinese populations, has been underrepresented compared with efforts in other populations. Here we constructed, to our knowledge, a first CNV map in seven Chinese populations representing the major linguistic groups in China, with 1,440 CNV regions identified using the Affymetrix SNP 6.0 Array. Considerable differences in distributions of CNV regions between populations and substantial population structures were observed. We showed that ∼35% of CNV regions identified in minority ethnic groups are not shared by the Han Chinese population, indicating that the contribution of the minorities to the genetic architecture of the Chinese population cannot be ignored. We further identified highly differentiated CNV regions between populations. For example, a common deletion in Dong and Zhuang (44.4% and 50%), which overlaps two keratin-associated protein genes contributing to the structure of hair fibers, was not observed in Han Chinese. Interestingly, the most differentiated CNV deletion between HapMap CEU and YRI, containing the CCL3L1 gene and reported in previous studies, was also the most highly differentiated region between Tibetan and other populations. In addition, by jointly analyzing CNVs and SNPs, we found that a CNV region containing the gene CTDSPL was in almost perfect linkage disequilibrium with flanking SNPs in Tibetan but not in other populations except HapMap CHD. Furthermore, we found that the SNP taggability of CNVs in Chinese populations was much lower than that in European populations. Our results suggest the necessity of a full characterization of CNVs in Chinese populations, and the CNV map we constructed serves as a useful resource for further evolutionary and medical studies.
- Published
- 2011
5. Cross-modal image segmentation and synthesis in medical imaging
- Author
-
Xiaolong Wu, Zhimin Gan, Jiaxi Xie, Haiyi Lou, Zhengxing Yan, Qiqi Jia, and Yingchun Yuan
- Published
- 2023
6. PGG.SV: a whole-genome-sequencing-based structural variant resource and data analysis platform
- Author
-
Yimin Wang, Yunchao Ling, Jiao Gong, Xiaohan Zhao, Hanwen Zhou, Bo Xie, Haiyi Lou, Xinhao Zhuang, Li Jin, Shaohua Fan, Guoqing Zhang, and Shuhua Xu
- Subjects
Genetics
- Abstract
Structural variations (SVs) play important roles in human evolution and diseases, but there is a lack of data resources concerning representative samples, especially for East Asians. Taking advantage of both next-generation sequencing and third-generation sequencing data at the whole-genome level, we developed the database PGG.SV to provide a practical platform for both regionally and globally representative structural variants. In its current version, PGG.SV archives 584 277 SVs obtained from whole-genome sequencing data of 6048 samples, including 1030 long-read sequencing genomes representing 177 global populations. PGG.SV provides (i) high-quality SVs with fine-scale and precise genomic locations in both GRCh37 and GRCh38, covering underrepresented SVs in existing sequencing and microarray data; (ii) hierarchical estimation of SV prevalence in geographical populations; (iii) informative annotations of SV-related genes, potential functions and clinical effects; (iv) an analysis platform to facilitate SV-based case-control association studies and (v) various visualization tools for understanding the SV structures in the human genome. Taken together, PGG.SV provides a user-friendly online interface, easy-to-use analysis tools and a detailed presentation of results. PGG.SV is freely accessible via https://www.biosino.org/pggsv.
- Published
- 2022
7. Structural evolution of trypsinogen gene redundancy confers risk for pancreas diseases
- Author
-
Haiyi Lou, Yimin Wang, Bo Xie, Xinyue Bai, Yang Gao, Rui Zhang, and Shuhua Xu
- Abstract
Trypsin is an important enzyme secreted by the pancreas for digesting proteins. The precursors of major human trypsin are encoded by trypsinogen genes PRSS1 and PRSS2. Here, we leveraged multi-omic data to study their evolutionary and functional impact. We estimated that the primate trypsinogen gene was duplicated from a single copy to multiple-copy 24-34 million years ago (Mya). Compared to six protein-coding genes in non-human great apes, the human ancestral state was a 5-copy with three being pseudogenized. Interestingly, a derived 3-copy form emerged in Africans ∼260 Kya and dominated in non-Africans as one of the two major haplotypes. Although no longer encoding proteins, the pseudogene enhancers still function on pancreatic PRSS2 expression, leading to ∼15% up-regulation for the 5-copy than the 3-copy haplotype. Notably, the 3-copy structure was under positive selection in East Asians, where lower trypsin might be adaptive during high-starch diet shift for protecting the pancreas from autodigestion, as also supported by the identified causality of the haplotype structure to pancreatitis risk. Our efforts in elucidating the structural evolution of trypsinogen genes advance our understanding of the genetic basis and molecular mechanism of human pancreas diseases.
- Published
- 2022
8. Improved NGS variant calling tool for the
- Author
-
Haiyi Lou, Bo Xie, Yimin Wang, Yang Gao, and Shuhua Xu
- Subjects
Pancreatitis, Chronic; Mutation; Trypsinogen; Humans; Trypsin; Genetic Predisposition to Disease
- Published
- 2022
9. Refining models of archaic admixture in Eurasia with ArchaicSeeker 2.0
- Author
-
Jiaojiao Liu, Yang Gao, Xixian Ma, Taoyang Wu, Rui Zhang, Kai Yuan, Haiyi Lou, Xueling Ge, Xumin Ni, Lian Deng, Chang Liu, Yuwen Pan, and Shuhua Xu
- Subjects
Asia; Population genetics; Science; Quantitative Trait Loci; General Physics and Astronomy; Introgression; Evolutionary biology; Biology; Genetic Introgression; Polymorphism, Single Nucleotide; Article; Evolutionary genetics; General Biochemistry, Genetics and Molecular Biology; Animals; Humans; Lung function; Neanderthals; Multidisciplinary; Models, Genetic; Genome, Human; Hominidae; General Chemistry; Genome evolution; DNA-Binding Proteins; Europe; Siberia; Biological dispersal; Metagenomics; Single episode; Software; Algorithms
- Abstract
We developed a method, ArchaicSeeker 2.0, to identify introgressed hominin sequences and model multiple-wave admixture. The new method enabled us to discern two waves of introgression from both Denisovan-like and Neanderthal-like hominins in present-day Eurasian populations and an ancient Siberian individual. We estimated that an early Denisovan-like introgression occurred in Eurasia around 118.8–94.0 thousand years ago (kya). In contrast, we detected only one single episode of Denisovan-like admixture in indigenous peoples eastern to the Wallace-Line. Modeling ancient admixtures suggested an early dispersal of modern humans throughout Asia before the Toba volcanic super-eruption 74 kya, predating the initial peopling of Asia as proposed by the traditional Out-of-Africa model. Survived archaic sequences are involved in various phenotypes including immune and body mass (e.g., ZNF169), cardiovascular and lung function (e.g., HHAT), UV response and carbohydrate metabolism (e.g., HYAL1/HYAL2/HYAL3), while “archaic deserts” are enriched with genes associated with skin development and keratinization. Existing methods to identify the presence of DNA from other hominin species can be limited in the ability to accurately estimate introgression waves, or can only be applied to specific populations. Here, the authors have developed a generalizable method to identify introgression in multi-wave situations.
- Published
- 2021
10. Improved NGS variant calling tool for the PRSS1-PRSS2 locus.
- Author
-
Haiyi Lou, Bo Xie, Yimin Wang, Yang Gao, and Shuhua Xu
- Subjects
LIFE sciences; GENETIC databases; SINGLE nucleotide polymorphisms; LOCUS (Genetics); BIOLOGICAL evolution; NECROTIZING pancreatitis; CHRONIC pancreatitis
- Published
- 2023
11. De novo assembly of a Tibetan genome and identification of novel structural variants associated with high-altitude adaptation
- Author
-
Xuebin Qi, Yongbo Guo, Bin Li, Zhilin Ning, Shuhua Xu, Yang Gao, Baimakangzhuo, Dejiquzong, Gonggalanzi, Xiaoji Wang, Shiming Liu, Tianyi Wu, Chaoying Cui, Ouzhuluobu, Wangshan Zheng, Lian Deng, Jun Li, Haiyi Lou, Duojizhuoma, Bing Su, Yaoxi He, Bianba, and Caijuan Bai
- Subjects
0106 biological sciences; AcademicSubjects/SCI00010; Population; Sequence assembly; Biology; 010603 evolutionary biology; 01 natural sciences; Genome; 03 medical and health sciences; genetic adaptation; education; Gene; Denisovan; reference genome; 030304 developmental biology; 0303 health sciences; education.field_of_study; Multidisciplinary; Contig; Molecular Biology & Genetics; structural variants; biology.organism_classification; Evolutionary biology; long-read sequencing; Human genome; AcademicSubjects/MED00010; Reference genome; Research Article; Tibetan
- Abstract
Structural variants (SVs) may play important roles in human adaptation to extreme environments such as high altitude but have been under-investigated. Here, combining long-read sequencing with multiple scaffolding techniques, we assembled a high-quality Tibetan genome (ZF1), with a contig N50 length of 24.57 mega-base pairs (Mb) and a scaffold N50 length of 58.80 Mb. The ZF1 assembly filled 80 remaining N-gaps (0.25 Mb in total length) in the reference human genome (GRCh38). Markedly, we detected 17 900 SVs, among which the ZF1-specific SVs are enriched in GTPase activity that is required for activation of the hypoxic pathway. Further population analysis uncovered a 163-bp intronic deletion in the MKL1 gene showing large divergence between highland Tibetans and lowland Han Chinese. This deletion is significantly associated with lower systolic pulmonary arterial pressure, one of the key adaptive physiological traits in Tibetans. Moreover, with the use of the high-quality de novo assembly, we observed a much higher rate of genome-wide archaic hominid (Altai Neanderthal and Denisovan) shared non-reference sequences in ZF1 (1.32%–1.53%) compared to other East Asian genomes (0.70%–0.98%), reflecting a unique genomic composition of Tibetans. One such archaic hominid shared sequence—a 662-bp intronic insertion in the SCUBE2 gene—is enriched and associated with better lung function (the FEV1/FVC ratio) in Tibetans. Collectively, we generated the first high-resolution Tibetan reference genome, and the identified SVs may serve as valuable resources for future evolutionary and medical studies.
- Published
- 2019
12. Prioritizing natural-selection signals from the deep-sequencing genomic data suggests multi-variant adaptation in Tibetan highlanders
- Author
-
Yan Lu, Asifullah Khan, Chao Zhang, Yajun Yang, Xiaoxi Zhang, Dongsheng Lu, Jiaojiao Liu, Yuan Yuan, Xueling Ge, Lei Tian, Jian Yang, Qidi Feng, Zi-Bing Jin, Haiyi Lou, Yang Gao, Longli Kang, Fan Lu, Kai Yuan, Jia Qu, Lian Deng, Bing Su, Hao Chen, Yuwen Pan, Shuhua Xu, Yaoxi He, and Xiaoji Wang
- Subjects
Quantitative trait locus; Biology; archaic ancestry; expression quantitative traits loci (eQTL); Deep sequencing; hemoglobin concentration; 03 medical and health sciences; 0302 clinical medicine; Missense mutation; Allele; next-generation sequencing (NGS); Gene; 030304 developmental biology; Genetics; 0303 health sciences; Multidisciplinary; Natural selection; Molecular Biology & Genetics; hypoxia; high-altitude adaptation; EPAS1; adaptive genetic variant; Epistasis; tissue-specific expression; 030217 neurology & neurosurgery; Research Article; Tibetan
- Abstract
Human genetic adaptation to high altitudes (>2500 m) has been extensively studied over the last few years, but few functional adaptive genetic variants have been identified, largely owing to the lack of deep-genome sequencing data available to previous studies. Here, we build a list of putative adaptive variants, including 63 missense, 7 loss-of-function, 1,298 evolutionarily conserved variants and 509 expression quantitative traits loci. Notably, the top signal of selection is located in TMEM247, a transmembrane protein-coding gene. The Tibetan version of TMEM247 harbors one high-frequency (76.3%) missense variant, rs116983452 (c.248C > T; p.Ala83Val), with the T allele derived from archaic ancestry and carried by >94% of Tibetans but absent or in low frequencies (
- Published
- 2019
13. Haplotype-resolved de novo assembly of a Tujia genome suggests the necessity for high-quality population-specific genome references
- Author
-
Haiyi Lou, Yang Gao, Bo Xie, Yimin Wang, Haikuan Zhang, Miao Shi, Sen Ma, Xiaoxi Zhang, Chang Liu, and Shuhua Xu
- Subjects
Histology; Asian People; Haplotypes; Genome, Human; Ethnicity; Trypsinogen; Humans; Trypsin; Cell Biology; Minority Groups; Pathology and Forensic Medicine
- Abstract
Even though the human reference genome assembly is continually being improved, it remains debatable whether a population-specific reference is necessary for every ethnic group. Here, we de novo assembled an individual genome (TJ1) from the Tujia population, an ethnic minority group most closely related to the Han Chinese. TJ1 provided a high-quality haplotype-resolved assembly of chromosome-scale with a scaffold N50 size78 Mb. Compared with GRCh38 and other de novo assemblies, TJ1 improved short-read mapping, enhanced calling precision for structural variants, and detected rare and low-frequency variants. This revealed fine-scale differences between the closely related Han and Tujia populations, such as population-stratified variants of LCT and UBXN8, and improved screening for ancestry informative markers. We demonstrated that TJ1 could reduce false positives in clinical diagnosis and analyzed the PRSS1-PRSS2 locus as a test case. Our results suggest that population-specific assemblies are necessary for genetic and medical analysis, especially when closely related populations are studied. A record of this paper's transparent peer review process is included in the supplemental information.
- Published
- 2022
14. Genomic dissection of population substructure of Han Chinese and its implication in association studies
- Author
-
Shuhua Xu, Xianyong Yin, Shilin Li, Wenfei Jin, Haiyi Lou, Ling Yang, Xiaohong Gong, Hongyan Wang, Yiping Shen, Xuedong Pan, Yungang He, Yajun Yang, Yi Wang, Wenqing Fu, Yu An, Jiucun Wang, Jingze Tan, Ji Qian, Xiaoli Chen, Xin Zhang, Yangfei Sun, Xuejun Zhang, Bailin Wu, and Li Jin
- Subjects
Genomics -- Analysis; Population genetics -- Analysis; Single nucleotide polymorphisms -- Usage; Biological sciences
- Abstract
Several genome-wide association studies are conducted to analyze the population diversity and substructures observed in Han Chinese, the largest ethnic group of the world. The various genetic differences observed in the different clusters of the population are also explained.
- Published
- 2009
15. Genome-wide comparison of allele-specific gene expression between African and European populations
- Author
-
Asifullah Khan, Zhilin Ning, Chao Zhang, Shuhua Xu, Lei Tian, Kai Yuan, Haiyi Lou, and Yuan Yuan
- Subjects
0301 basic medicine; animal structures; Quantitative Trait Loci; Black People; Single-nucleotide polymorphism; Biology; Polymorphism, Single Nucleotide; Genome; White People; 03 medical and health sciences; Gene Frequency; Genotype; Genetics; Humans; SNP; Allele; Molecular Biology; Gene; Genetics (clinical); Sequence Analysis, RNA; Haplotype; Genetic Variation; General Medicine; HSP40 Heat-Shock Proteins; Phenotype; 030104 developmental biology; Gene Expression Regulation; Haplotypes; Transcriptome; Genome-Wide Association Study
- Abstract
Transcriptomic diversity across human populations reflects differential regulatory mechanisms. Allelic-imbalanced gene expression is a genetic regulatory mechanism that contributes to human phenotypic variation. To systematically investigate genome-wide allele-specific expression (ASE), we analyzed RNA-Seq data from European and African populations provided by the Geuvadis project. We identified 11 sites in 8 genes showing ASE in both Europeans and Africans, and 9 sites in 9 genes showing population-specific ASE, including both novel and known ASE signals. Notably, the top signal of differentiated ASE between inter-continental populations was observed in DNAJC15, of which the derived allele of rs12015, a single nucleotide polymorphism (SNP), showed significantly higher expression than did the ancestral allele specifically in European individuals. We identified a unique haplotype of DNAJC15, where a few SNPs highly differentiated between European and African populations were strongly linked to sites with high ASE. Among these, SNP rs17553284 affected the binding of several transcription factors as well as the genotype-dependent expression of DNAJC15. Therefore, we speculated that rs17553284 could be a regulatory causal variant that mediates the ASE of rs12015. We found several variations in ASE between intercontinental populations. The highly differentiated ASE genes identified here may be implicated in the phenotypic variations among populations that are both evolutionarily and medically important.
- Published
- 2018
16. A metagenomic approach to dissect the genetic composition of enterotypes in Han Chinese and two Muslim groups
- Author
-
Jing Li, Shuhua Xu, Hans-Peter Horz, Shijie Zheng, Ruiqing Fu, Yajun Yang, Yan Lu, Hongjiao Liu, Meng Shi, Kun Tang, Yaqun Guan, Haiyi Lou, Lei Tian, and Sijia Wang
- Subjects
DNA, Bacterial; 0301 basic medicine; 030106 microbiology; Single-nucleotide polymorphism; Genome-wide association study; Gut flora; DNA, Ribosomal; Islam; Polymorphism, Single Nucleotide; Applied Microbiology and Biotechnology; Microbiology; 03 medical and health sciences; Asian People; RNA, Ribosomal, 16S; Genetic variation; Ethnicity; Cluster Analysis; Humans; Microbiome; Genetic Association Studies; Phylogeny; Ecology, Evolution, Behavior and Systematics; Genetics; Bacteria; biology; Microbiota; Sequence Analysis, DNA; biology.organism_classification; Healthy Volunteers; Gastrointestinal Microbiome; 030104 developmental biology; Metagenomics; Enterotype; Lysophospholipase; Human Microbiome Project
- Abstract
Distinct enterotypes have been observed in the human gut but little is known about the genetic basis of the microbiome. Moreover, it is not clear how many genetic differences exist between enterotypes within or between populations. In this study, both the 16S rRNA gene and the metagenomes of the gut microbiota were sequenced from 48 Han Chinese, 48 Kazaks, and 96 Uyghurs, and taxonomies were assigned after de novo assembly. Single nucleotide polymorphisms were also identified by referring to data from the Human Microbiome Project. Systematic analysis of the gut communities in terms of their abundance and genetic composition was also performed, together with a genome-wide association study of the host genomes. The gut microbiota of 192 subjects was clearly classified into two enterotypes (Bacteroides and Prevotella). Interestingly, both enterotypes showed a clear genetic differentiation in terms of their functional catalogue of genes, especially for genes involved in amino acid and carbohydrate metabolism. In addition, several differentiated genera and genes were found among the three populations. Notably, one human variant (rs878394) was identified that showed significant association with the abundance of Prevotella, which is linked to LYPLAL1, a gene associated with body fat distribution, the waist-hip ratio and insulin sensitivity. Taken together, considerable differentiation was observed in gut microbes between enterotypes and among populations that was reflected in both the taxonomic composition and the genetic makeup of their functional genes, which could have been influenced by a variety of factors, such as diet and host genetic variation.
- Published
- 2018
17. Assessing genome-wide copy number variation in the Han Chinese population
- Author
-
Yan Lu, Ruiqing Fu, Xi Zhang, Jingning Wei, Yajun Yang, Feng Zhang, Chao Zhang, Shi Yan, Changhua Li, Baijun Fang, Dongsheng Lu, Li Jin, Fangfang Pu, Qian Wei, Xiaoji Wang, Zhendong Wu, Jianqi Lu, Haiyi Lou, and Shuhua Xu
- Subjects
Male; 0301 basic medicine; China; DNA Copy Number Variations; Population; 030105 genetics & heredity; Biology; Genome; DNA sequencing; Deep sequencing; 03 medical and health sciences; Asian People; Ethnicity; Genetics; Humans; Copy-number variation; 1000 Genomes Project; education; Genetics (clinical); Genetic diversity; education.field_of_study; Genome, Human; Genetic Variation; 030104 developmental biology; Evolutionary biology; Human genome
- Abstract
Background: Copy number variation (CNV) is a valuable source of genetic diversity in the human genome and a well-recognised cause of various genetic diseases. However, CNVs have been considerably under-represented in population-based studies, particularly the Han Chinese which is the largest ethnic group in the world. Objectives: To build a representative CNV map for the Han Chinese population. Methods: We conducted a genome-wide CNV study involving 451 male Han Chinese samples from 11 geographical regions encompassing 28 dialect groups, representing a less-biased panel compared with the currently available data. We detected CNVs by using 4.2M NimbleGen comparative genomic hybridisation array and whole-genome deep sequencing of 51 samples to optimise the filtering conditions in CNV discovery. Results: A comprehensive Han Chinese CNV map was built based on a set of high-quality variants (positive predictive value >0.8, with sizes ranging from 369 bp to 4.16 Mb and a median of 5907 bp). The map consists of 4012 CNV regions (CNVRs), and more than half are novel to the 30 East Asian CNV Project and the 1000 Genomes Project Phase 3. We further identified 81 CNVRs specific to regional groups, which was indicative of the subpopulation structure within the Han Chinese population. Conclusions: Our data are complementary to public data sources, and the CNV map may facilitate in the identification of pathogenic CNVs and further biomedical research studies involving the Han Chinese population.
- Published
- 2017
18. CNVbase: Batch identification of novel and rare copy number variations based on multi-ethnic population data
- Author
-
Renqian Du, Cheng Zhang, Yiping Shen, Haiyi Lou, Shuhua Xu, Feng Zhang, Li Jin, and Jianqi Lu
- Subjects
0301 basic medicine; Whole genome sequencing; Internet; DNA Copy Number Variations; Whole Genome Sequencing; business.industry; MEDLINE; Ethnic group; Computational biology; Biology; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; Databases, Genetic; Ethnicity; Genetics; Population data; Humans; The Internet; Identification (biology); Copy-number variation; business; Molecular Biology; 030217 neurology & neurosurgery
- Published
- 2017
19. Differentiated demographic histories and local adaptations between Sherpas and Tibetans
- Author
-
Yajun Yang, Yan Lu, Shilin Li, Lifeng Ma, Shuhua Xu, Lijun Liu, Zhilin Ning, Yuchen Wang, Haiyi Lou, Chao Zhang, Li Jin, Xiaoji Wang, Longli Kang, Kai Yuan, Qidi Feng, Jiaojiao Liu, Zhiying Zhang, Lian Deng, Bing Su, and Ying Zhou
- Subjects
0301 basic medicine; Han chinese; South asia; Genotype; lcsh:QH426-470; Acclimatization; Population; Ethnic group; Biology; Altitude Sickness; Tibet; 03 medical and health sciences; Asian People; Population history; Ethnicity; Humans; High-altitude adaptation; education; Ultraviolet radiation; lcsh:QH301-705.5; History, Ancient; education.field_of_study; Altitude; Research; Genetic variants; Genetic Variation; Adaptation, Physiological; Gene flow; lcsh:Genetics; 030104 developmental biology; Genetics, Population; Haplotypes; lcsh:Biology (General); Evolutionary biology; Next-generation sequencing; Sherpa; Demography; Tibetan
- Abstract
Background: The genetic relationships reported by recent studies between Sherpas and Tibetans are controversial. To gain insights into the population history and the genetic basis of high-altitude adaptation of the two groups, we analyzed genome-wide data in 111 Sherpas (Tibet and Nepal) and 177 Tibetans (Tibet and Qinghai), together with available data from present-day human populations. Results: Sherpas and Tibetans show considerable genetic differences and can be distinguished as two distinct groups, even though the divergence between them (~3200–11,300 years ago) is much later than that between Han Chinese and either of the two groups (~6200–16,000 years ago). Sub-population structures exist in both Sherpas and Tibetans, corresponding to geographical or linguistic groups. Differentiation of genetic variants between Sherpas and Tibetans associated with adaptation to either high-altitude or ultraviolet radiation were identified and validated by genotyping additional Sherpa and Tibetan samples. Conclusions: Our analyses indicate that both Sherpas and Tibetans are admixed populations, but the findings do not support the previous hypothesis that Tibetans derive their ancestry from Sherpas and Han Chinese. Compared to Tibetans, Sherpas show higher levels of South Asian ancestry, while Tibetans show higher levels of East Asian and Central Asian/Siberian ancestry. We propose a new model to elucidate the differentiated demographic histories and local adaptations of Sherpas and Tibetans. Electronic supplementary material: The online version of this article (doi:10.1186/s13059-017-1242-y) contains supplementary material, which is available to authorized users.
- Published
- 2017
20. Analysis of five deep-sequenced trio-genomes of the Peninsular Malaysia Orang Asli and North Borneo populations
- Author
-
Bhooma Thiruvahindrapuram, Yik Ying Teo, Xiaoxi Zhang, Bhak Jong, Lai-Ping Wong, Christian R. Marshall, Wanxing Xu, Subbiah Vijay Kumar, Abdul Rahman Thuhairah, Aghakhanian Farhang, Maude E. Phipps, Boon Peng Hoh, Lian Deng, Chee Wei Yew, Stephen W. Scherer, Dongsheng Lu, Chang Liu, Shuhua Xu, Rick Twee-Hee Ong, Mohammad Zahirul Hoque, Haiyi Lou, and Bo Xie
- Subjects
lcsh:QH426-470 ,DNA Copy Number Variations ,Demographic history ,lcsh:Biotechnology ,Locus (genetics) ,Biology ,Genome ,DNA sequencing ,03 medical and health sciences ,0302 clinical medicine ,INDEL Mutation ,Mutation Rate ,Borneo ,lcsh:TP248.13-248.65 ,Genetic variation ,Genetics ,Animals ,Humans ,Copy-number variation ,Indel ,030304 developmental biology ,0303 health sciences ,Genome, Human ,Haplotype ,Malaysia ,Genetic Variation ,High-Throughput Nucleotide Sequencing ,Hominidae ,lcsh:Genetics ,Evolutionary biology ,030217 neurology & neurosurgery ,Biotechnology ,Research Article - Abstract
Background Recent advances in genomic technologies have facilitated genome-wide investigation of human genetic variations. However, most efforts have focused on the major populations, yet trio genomes of indigenous populations from Southeast Asia have been under-investigated. Results We analyzed the whole-genome deep sequencing data (~30×) of five native trios from Peninsular Malaysia and North Borneo, and characterized the genomic variants, including single nucleotide variants (SNVs), small insertions and deletions (indels) and copy number variants (CNVs). We discovered approximately 6.9 million SNVs, 1.2 million indels, and 9000 CNVs in the 15 samples, of which 2.7% SNVs, 2.3% indels and 22% CNVs were novel, implying the insufficient coverage of population diversity in existing databases. We identified a higher proportion of novel variants in the Orang Asli (OA) samples, i.e., the indigenous people from Peninsular Malaysia, than that of the North Bornean (NB) samples, likely due to more complex demographic history and long-time isolation of the OA groups. We used the pedigree information to identify de novo variants and estimated the autosomal mutation rates to be 0.81 × 10⁻⁸–1.33 × 10⁻⁸, 1.0 × 10⁻⁹–2.9 × 10⁻⁹, and ~0.001 per site per generation for SNVs, indels, and CNVs, respectively. The trio-genomes also allowed for haplotype phasing with high accuracy, which serves as references to the future genomic studies of OA and NB populations. In addition, high-frequency inherited CNVs specific to OA or NB were identified. One example is a 50-kb duplication in DEFA1B detected only in the Negrito trios, implying plausible effects on host defense against exposure to diverse microbes in the tropical rainforest environment of these hunter-gatherers. The CNVs shared between OA and NB groups were much fewer than those specific to each group.
Nevertheless, we identified a 142-kb duplication in AMY1A in all the 15 samples, and this gene is associated with the high-starch diet. Moreover, novel insertions shared with archaic hominids were identified in our samples. Conclusion Our study presents a full catalogue of the genome variants of the native Malaysian populations, which is a complement of the genome diversity in Southeast Asians. It implies specific population history of the native inhabitants, and demonstrates the necessity of more genome sequencing efforts on the multi-ethnic native groups of Malaysia and Southeast Asia.
- Published
- 2019
21. De novo assembly of a Tibetan genome and identification of novel structural variants associated with high altitude adaptation
- Author
-
Gonggalanzi, Yaoxi He, Yongbo Guo, Shiming Liu, Lian Deng, Wangshan Zheng, Tianyi Wu, Chaoying Cui, Baimakangzhuo, Yang Gao, Xiaoji Wang, Jun Li, Bin Li, Haiyi Lou, Bing Su, Ouzhuluobu, Dejiquzong, Caijuan Bai, Duojizhuoma, Bianba, Zhilin Ning, Xuebin Qi, and Shuhua Xu
- Subjects
education.field_of_study ,biology ,Contig ,Evolutionary biology ,Population ,Sequence assembly ,Human genome ,education ,biology.organism_classification ,Gene ,Denisovan ,Genome ,Reference genome - Abstract
Structural variants (SVs) may play important roles in human adaptation to extreme environments such as high altitude but have been under-investigated. Here, combining long-read sequencing with multiple scaffolding techniques, we assembled a high-quality Tibetan genome (ZF1), with a contig N50 length of 24.57 mega-base pairs (Mb) and a scaffold N50 length of 58.80 Mb. The ZF1 assembly filled 80 remaining N-gaps (0.25 Mb in total length) in the reference human genome (GRCh38). Markedly, we detected 17,900 SVs, among which the ZF1-specific SVs are enriched in GTPase activity that is required for activation of the hypoxic pathway. Further population analysis uncovered a 163-bp intronic deletion in the MKL1 gene showing large divergence between highland Tibetans and lowland Han Chinese. This deletion is significantly associated with lower systolic pulmonary arterial pressure, one of the key adaptive physiological traits in Tibetans. Moreover, with the use of the high quality de novo assembly, we observed a much higher rate of genome-wide archaic hominid (Altai Neanderthal and Denisovan) shared non-reference sequences in ZF1 (1.32%-1.53%) compared to other East Asian genomes (0.70%-0.98%), reflecting a unique genomic composition of Tibetans. One such archaic-hominid shared sequence, a 662-bp intronic insertion in the SCUBE2 gene, is enriched and associated with better lung function (the FEV1/FVC ratio) in Tibetans. Collectively, we generated the first high-resolution Tibetan reference genome, and the identified SVs may serve as valuable resources for future evolutionary and medical studies.
- Published
- 2019
22. Analysis of Deep-sequenced Trio-genomes of the Peninsular Malaysia Orang Asli and North Borneo Populations
- Author
-
Lian Deng, Haiyi Lou, Xiaoxi Zhang, Thiruvahindrapuram Bhooma, Dongsheng Lu, Christian Marshall, Chang Liu, Bo Xie, Wanxing Xu, Lai-Ping Wong, Chee-Wei Yew, Aghakhanian Farhang, Rick Twee-Hee Ong, Mohammad Zahirul Hoque, Abdul Rahman Thuhairah, Bhak Jong, Maude E. Phipps, Stephen W. Scherer, Yik-Ying Teo, Subbiah Vijay Kumar, Boon Peng Hoh, and Shuhua Xu
- Abstract
Background Recent advances in genomic technologies have facilitated genome-wide investigation of human genetic variations. However, most efforts have focused on the major populations, yet trio genomes of indigenous populations from Southeast Asia have been under-investigated. Results We analyzed the whole-genome deep sequencing data (~30×) of five native trios from Malaysia, and discovered approximately 6.9 million single nucleotide variants (SNVs), 1.2 million small insertions and deletions (indels), and 9,000 copy number variants (CNVs) in the 15 samples. We found 3.9% SNVs, 4.7% indels and 22% CNVs were novel, implying the insufficient coverage of population diversity in existing databases. We identified a higher proportion of novel variants in the Orang Asli (OA) samples, i.e., the indigenous people from Peninsular Malaysia, than that of the North Bornean (NB) samples, likely due to more complex demographic history and long-time isolation of the OA groups. We used the pedigree information to identify de novo variants and estimated the mutation rates to be 0.81 × 10⁻⁸–1.33 × 10⁻⁸, 1.0 × 10⁻⁹–2.9 × 10⁻⁹, and ~0.001 per site per generation for SNVs, indels, and CNVs, respectively. The trio-genomes also allowed for haplotype phasing with high accuracy, which serves as references to the future genomic studies of OA and NB populations. In addition, high-frequency inherited CNVs specific to OA or NB were identified. One example was a 50-kb duplication in DEFA1B detected only in the Negrito trios, implying plausible effects on host defense against exposure to diverse microbes in the tropical rainforest environment of these hunter-gatherers. The CNVs shared between OA and NB groups were much fewer than those specific to each group. Nevertheless, we identified a 142-kb duplication in AMY1A in all the 15 samples, and this gene is associated with the high-starch diet. Moreover, novel insertions shared with archaic hominids were identified in our samples.
Conclusion Our study presents a full catalogue of the genome variants of the native Malaysian populations, which is a complement of the genome diversity in Southeast Asians. It implies specific population history of the native inhabitants, and demonstrates the necessity of more genome sequencing efforts on the multi-ethnic native groups of Malaysia and Southeast Asia.
- Published
- 2019
23. MOESM1 of Analysis of five deep-sequenced trio-genomes of the Peninsular Malaysia Orang Asli and North Borneo populations
- Author
-
Deng, Lian, Haiyi Lou, Xiaoxi Zhang, Bhooma Thiruvahindrapuram, Dongsheng Lu, Marshall, Christian, Liu, Chang, Xie, Bo, Wanxing Xu, Lai-Ping Wong, Chee-Wei Yew, Aghakhanian Farhang, Ong, Rick, Hoque, Mohammad, Thuhairah, Abdul, Bhak Jong, Phipps, Maude, Scherer, Stephen, Yik-Ying Teo, Subbiah Kumar, Boon-Peng Hoh, and Shuhua Xu
- Abstract
Additional file 1: Figure S1. Data quality of the de novo variants. Table S1 Sample information. Table S2. Summary information of genomic regions with top 1% of SNV density over the genome. Table S3. Summary information of genomic regions with top 1% of indel density over the genome. Table S5. Functional annotation of SNVs in each native population and global populations. Table S6. Functional annotation of indels in each native population and global populations. Table S7. Functional annotation of SNVs and indels in each native Malaysian genome. Table S9. Genomic regions identified as novel SNV hotspots. Table S10. Genomic regions identified as novel indel hotspots. Table S11. List of the de novo SNVs identified in each offspring. Table S12. List of the de novo indels identified in each offspring. Table S13. Summary of CNVs identified in each trio. Table S14. De novo CNVs identified in the 5 off-springs. Table S16. Inheritance of selected genes that are known to either lie on the segmental duplication regions, or carry CNVs.
- Published
- 2019
24. Ancestral Origins and Genetic History of Tibetan Highlanders
- Author
-
Ya Hu, Xiong Yang, Yajun Yang, Chao Zhang, Haiyi Lou, Dongsheng Lu, Shilin Li, Yan Lu, Longli Kang, Li Jin, Kai Yuan, Qidi Feng, Shuhua Xu, Xiaoji Wang, Yuchen Wang, Qiliang Ding, Yaqun Guan, Ying Zhou, Lian Deng, and Bing Su
- Subjects
0301 basic medicine ,Gene Flow ,Male ,China ,Neanderthal ,Population ,Oceania ,Tibet ,Article ,Gene flow ,03 medical and health sciences ,0302 clinical medicine ,Asian People ,Phylogenetics ,biology.animal ,Ethnicity ,Genetics ,Animals ,Humans ,East Asia ,Genetics(clinical) ,Selection, Genetic ,education ,Denisovan ,Genetics (clinical) ,Phylogeny ,Neanderthals ,education.field_of_study ,Natural selection ,biology ,Models, Genetic ,Genome, Human ,Altitude ,High-Throughput Nucleotide Sequencing ,Gene Pool ,biology.organism_classification ,030104 developmental biology ,Geography ,Genetics, Population ,Haplotypes ,Evolutionary biology ,Gene pool ,030217 neurology & neurosurgery - Abstract
The origin of Tibetans remains one of the most contentious puzzles in history, anthropology, and genetics. Analyses of deeply sequenced (30×–60×) genomes of 38 Tibetan highlanders and 39 Han Chinese lowlanders, together with available data on archaic and modern humans, allow us to comprehensively characterize the ancestral makeup of Tibetans and uncover their origins. Non-modern human sequences compose ∼6% of the Tibetan gene pool and form unique haplotypes in some genomic regions, where Denisovan-like, Neanderthal-like, ancient-Siberian-like, and unknown ancestries are entangled and elevated. The shared ancestry of Tibetan-enriched sequences dates back to ∼62,000–38,000 years ago, predating the Last Glacial Maximum (LGM) and representing early colonization of the plateau. Nonetheless, most of the Tibetan gene pool is of modern human origin and diverged from that of Han Chinese ∼15,000 to ∼9,000 years ago, which can be largely attributed to post-LGM arrivals. Analysis of ∼200 contemporary populations showed that Tibetans share ancestry with populations from East Asia (∼82%), Central Asia and Siberia (∼11%), South Asia (∼6%), and western Eurasia and Oceania (∼1%). Our results support that Tibetans arose from a mixture of multiple ancestral gene pools but that their origins are much more complicated and ancient than previously suspected. We provide compelling evidence of the co-existence of Paleolithic and Neolithic ancestries in the Tibetan gene pool, indicating a genetic continuity between pre-historical highland-foragers and present-day Tibetans. In particular, highly differentiated sequences harbored in highlanders’ genomes were most likely inherited from pre-LGM settlers of multiple ancestral origins (SUNDer) and maintained in high frequency by natural selection.
- Published
- 2016
25. Genome-wide scans reveal variants at EDAR predominantly affecting hair straightness in Han Chinese and Uyghur populations
- Author
-
Yajun Yang, Yan Lu, Kun Tang, Sijia Wang, Li Jin, Pardis C. Sabeti, Yu Liu, Yi Jiao, Yaqun Guan, Jinxi Li, Sijie Wu, Qianqian Peng, Shuhua Xu, Jean Krutmann, Dongsheng Lu, Haiyi Lou, Zhaoxia Zhang, Qidi Feng, Manfei Zhang, and Jingze Tan
- Subjects
Male ,0301 basic medicine ,China ,Candidate gene ,Population ,Genome-wide association study ,030105 genetics & heredity ,Biology ,Polymorphism, Single Nucleotide ,White People ,03 medical and health sciences ,Asian People ,Gene Frequency ,Intermediate Filament Proteins ,Polymorphism (computer science) ,Genetics ,Humans ,Genetic Predisposition to Disease ,Antigens ,education ,Allele frequency ,Genetics (clinical) ,education.field_of_study ,integumentary system ,Edar Receptor ,Haplotype ,Trichohyalin ,Human genetics ,Phenotype ,030104 developmental biology ,Haplotypes ,Female ,Genome-Wide Association Study ,Hair - Abstract
Hair straightness/curliness is one of the most conspicuous features of human variation and is particularly diverse among populations. A recent genome-wide scan found common variants in the Trichohyalin (TCHH) gene that are associated with hair straightness in Europeans, but different genes might affect this phenotype in other populations. By sampling 2899 Han Chinese, we performed the first genome-wide scan of hair straightness in East Asians, and found EDAR (rs3827760) as the predominant gene (P = 4.67 × 10⁻¹⁶), accounting for 3.66% of the total variance. The candidate gene approach did not find further significant associations, suggesting that hair straightness may be affected by a large number of genes with subtle effects. Notably, genetic variants associated with hair straightness in Europeans are generally low in frequency in Han Chinese, and vice versa. To evaluate the relative contribution of these variants, we performed a second genome-wide scan in 709 samples from the Uyghur, an admixed population with both eastern and western Eurasian ancestries. In Uyghurs, both EDAR (rs3827760: P = 1.92 × 10⁻¹²) and TCHH (rs11803731: P = 1.46 × 10⁻³) are associated with hair straightness, but EDAR (OR 0.415) has a greater effect than TCHH (OR 0.575). We found no significant interaction between EDAR and TCHH (P = 0.645), suggesting that these two genes affect hair straightness through different mechanisms. Furthermore, haplotype analysis indicates that TCHH is not subject to selection. While EDAR is under strong selection in East Asia, it does not appear to be subject to selection after the admixture in Uyghurs. These results suggest that hair straightness is unlikely to be a trait under selection.
- Published
- 2016
26. A 3.4-kb Copy-Number Deletion near EPAS1 Is Significantly Enriched in High-Altitude Tibetans but Absent from the Denisovan Sequence
- Author
-
Yan Lu, Ruiqing Fu, Li Jin, Yajun Yang, Qidi Feng, Shilin Li, Sijie Wu, Bing Su, Xiaoji Wang, Yaqun Guan, Dongsheng Lu, Longli Kang, Yeun-Jun Chung, Boon Peng Hoh, Haiyi Lou, and Shuhua Xu
- Subjects
Linkage disequilibrium ,Candidate gene ,DNA Copy Number Variations ,Sequence analysis ,Molecular Sequence Data ,Population ,Adaptation, Biological ,Biology ,Tibet ,Polymerase Chain Reaction ,Article ,Linkage Disequilibrium ,Evolution, Molecular ,Hemoglobins ,03 medical and health sciences ,0302 clinical medicine ,Basic Helix-Loop-Helix Transcription Factors ,Ethnicity ,Genetics ,Animals ,Humans ,Genetics(clinical) ,education ,Denisovan ,Gene ,Genetics (clinical) ,030304 developmental biology ,0303 health sciences ,education.field_of_study ,Base Sequence ,Microarray analysis techniques ,Altitude ,EPAS1 ,Hominidae ,Sequence Analysis, DNA ,Microarray Analysis ,biology.organism_classification ,Genetics, Population ,Algorithms ,030217 neurology & neurosurgery - Abstract
Tibetan high-altitude adaptation (HAA) has been studied extensively, and many candidate genes have been reported. Subsequent efforts targeting HAA functional variants, however, have not been that successful (e.g., no functional variant has been suggested for the top candidate HAA gene, EPAS1). With WinXPCNVer, a method developed in this study, we detected in microarray data a Tibetan-enriched deletion (TED) carried by 90% of Tibetans; 50% were homozygous for the deletion, whereas only 3% carried the TED and 0% carried the homozygous deletion in 2,792 worldwide samples (p < 10⁻¹⁵). We employed long PCR and Sanger sequencing technologies to determine the exact copy number and breakpoints of the TED in 70 additional Tibetan and 182 diverse samples. The TED had identical boundaries (chr2: 46,694,276–46,697,683; hg19) and was 80 kb downstream of EPAS1. Notably, the TED was in strong linkage disequilibrium (LD; r² = 0.8) with EPAS1 variants associated with reduced blood concentrations of hemoglobin. It was also in complete LD with the 5-SNP motif, which was suspected to be introgressed from Denisovans, but the deletion itself was absent from the Denisovan sequence. Correspondingly, we detected that footprints of positive selection for the TED occurred 12,803 (95% confidence interval = 12,075–14,725) years ago. We further whole-genome deep sequenced (>60×) seven Tibetans and verified the TED but failed to identify any other copy-number variations with comparable patterns, giving this TED top priority for further study. We speculate that the specific patterns of the TED resulted from its own functionality in HAA of Tibetans or LD with a functional variant of EPAS1.
- Published
- 2015
27. Genetic History of Xinjiang's Uyghurs Suggests Bronze Age Multiple-Way Contacts in Eurasia
- Author
-
Ying Zhou, Shuhua Xu, Yajun Yang, Dongsheng Lu, Qidi Feng, Haiyi Lou, Yan Lu, Meng Shi, Kun Tang, Lei Tian, Sijia Wang, Kai Yuan, Xiong Yang, Jing Li, Chao Zhang, Yuchen Wang, Xi Zhang, Zhilin Ning, Xumin Ni, Asifullah Khan, Yaqun Guan, Chang Liu, and Xiaoji Wang
- Subjects
0301 basic medicine ,Gene Flow ,China ,South asia ,Range (biology) ,Genetic admixture ,Biology ,Polymorphism, Single Nucleotide ,White People ,03 medical and health sciences ,0302 clinical medicine ,Asian People ,Bronze Age ,Genetics ,Ethnicity ,Humans ,East Asia ,Molecular Biology ,Ecology, Evolution, Behavior and Systematics ,Geography ,Archaeology ,Phylogeography ,030104 developmental biology ,Genetics, Population ,Haplotypes ,Genetic structure ,030217 neurology & neurosurgery - Abstract
The Uyghur people residing in Xinjiang, a territory located in the far west of China and crossed by the Silk Road, are a key ethnic group for understanding the history of human dispersion in Eurasia. Here we assessed the genetic structure and ancestry of 951 Xinjiang's Uyghurs (XJU) representing 14 geographical subpopulations. We observed a southwest and northeast differentiation within XJU, which was likely shaped jointly by the Tianshan Mountains, which traverses from east to west as a natural barrier, and gene flow from both east and west directions. In XJU, we identified four major ancestral components that were potentially derived from two earlier admixed groups: one from the West, harboring European (25-37%) and South Asian ancestries (12-20%), and the other from the East, with Siberian (15-17%) and East Asian (29-47%) ancestries. By using a newly developed method, MultiWaver, the complex admixture history of XJU was modeled as a two-wave admixture. An ancient wave was dated back to ∼3,750 years ago (ya), which is much earlier than that estimated by previous studies, but fits within the range of dating of mummies that exhibited European features that were discovered in the Tarim basin, which is situated in southern Xinjiang (4,000-2,000 ya); a more recent wave occurred around 750 ya, which is in agreement with the estimate from a recent study using other methods. We unveiled a more complex scenario of ancestral origins and admixture history in XJU than previously reported, which further suggests Bronze Age massive migrations in Eurasia and East-West contacts across the Silk Road.
- Published
- 2017
28. A missense point mutation in COL10A1 identified with whole-genome deep sequencing in a 7-generation Pakistan dwarf family
- Author
-
Shuhua Xu, Yan Lu, Firdous Bukhari, Jiaojiao Liu, Chao Zhang, Ihtisham Bukhari, Saima Mustafa, Furhan Iqbal, Zhendong Wu, Ruiqing Fu, Xiong Yang, Muhammad Aslam, and Haiyi Lou
- Subjects
0301 basic medicine ,Adult ,Male ,Heterozygote ,Adolescent ,Mutation, Missense ,Dwarfism ,030105 genetics & heredity ,Biology ,Collagen Type XI ,Polymorphism, Single Nucleotide ,DNA sequencing ,Deep sequencing ,Article ,03 medical and health sciences ,symbols.namesake ,Consanguinity ,Young Adult ,Sequence Homology, Nucleic Acid ,Genetics ,medicine ,Missense mutation ,Humans ,Point Mutation ,Genetic Predisposition to Disease ,Pakistan ,Copy-number variation ,Child ,Genetics (clinical) ,Whole genome sequencing ,Sanger sequencing ,Family Health ,Base Sequence ,Whole Genome Sequencing ,Point mutation ,High-Throughput Nucleotide Sequencing ,Middle Aged ,medicine.disease ,Pedigree ,030104 developmental biology ,symbols ,Female - Abstract
Disease-associated variants in the human genome are continually being identified using DNA sequencing technologies that are especially effective for Mendelian disorders. Here we sequenced whole genome to high coverage (>30×) of 6 members of a 7-generation family with dwarfism from a consanguineous tribe in Pakistan to determine the causal variant(s). We identified a missense variant rs111033552 (c.2011T>C [p.Ser671Pro]) located in COL10A1 (encodes the alpha chain of type X collagen) as the most likely contributor to the dwarfism. We further confirmed the variant in 22 family members using Sanger sequencing. All affected individuals are heterozygous for the missense mutation rs111033552 and no individual homozygous was observed. Moreover, the mutation was absent in 69,985 individuals representing >150 global populations. Taking advantage of whole-genome sequencing data, we also examined other variant forms, including copy number variation and insertion/deletion, but failed to identify such variants enriched in the affected individuals. Thus rs111033552 had priority for linkage with dwarfism.
- Published
- 2017
29. Additional file 1: Figures S1–S35. of Differentiated demographic histories and local adaptations between Sherpas and Tibetans
- Author
-
Zhang, Chao, Lu, Yan, Qidi Feng, Xiaoji Wang, Haiyi Lou, Jiaojiao Liu, Zhilin Ning, Yuan, Kai, Yuchen Wang, Zhou, Ying, Deng, Lian, Lijun Liu, Yajun Yang, Shilin Li, Lifeng Ma, Zhiying Zhang, Jin, Li, Su, Bing, Longli Kang, and Shuhua Xu
- Abstract
and Table S1–S6. (PDF 35929 kb)
- Published
- 2017
30. Copy number variations and genetic admixtures in three Xinjiang ethnic minority groups
- Author
-
Xinwei Pan, Haiyi Lou, Wenfei Jin, Shilin Li, Huaigu Zhou, Yuan Ping, Dongsheng Lu, Li Jin, Shuhua Xu, and Ruiqing Fu
- Subjects
China ,Linkage disequilibrium ,DNA Copy Number Variations ,Population ,Ethnic group ,Genetic admixture ,Kazakh ,Biology ,Polymorphism, Single Nucleotide ,Linkage Disequilibrium ,Article ,Asian People ,Gene Frequency ,Ethnicity ,Genetics ,Humans ,Copy-number variation ,education ,Allele frequency ,Genetic Association Studies ,Genetics (clinical) ,education.field_of_study ,Genome, Human ,language.human_language ,Phylogeography ,Phenotype ,Evolutionary biology ,language ,SNP array - Abstract
Xinjiang is geographically located in central Asia, and it has played an important historical role in connecting eastern Eurasian (EEA) and western Eurasian (WEA) people. However, human population genomic studies in this region have been largely underrepresented, especially with respect to studies of copy number variations (CNVs). Here we constructed the first CNV map of the three major ethnic minority groups, the Uyghur, Kazakh and Kirgiz, using Affymetrix Genome-Wide Human SNP Array 6.0. We systematically compared the properties of CNVs we identified in the three groups with the data from representatives of EEA and WEA. The analyses indicated a typical genetic admixture pattern in all three groups with ancestries from both EEA and WEA. We also identified several CNV regions showing significant deviation of allele frequency from the expected genome-wide distribution, which might be associated with population-specific phenotypes. Our study provides the first genome-wide perspective on the CNVs of three major Xinjiang ethnic minority groups and has implications for both evolutionary and medical studies.
- Published
- 2014
31. Genome-wide Variants of Eurasian Facial Shape Differentiation and a prospective model of DNA based Face Prediction
- Author
-
Jingze Tan, Yaqun Guan, Pengcheng Fu, Dongsheng Lu, Sile Hu, Shuhua Xu, Sijie Wu, Hang Zhou, Kun Tang, Sijia Wang, Yan Lu, Li Jin, Shouneng Peng, Haiyi Lou, Lu Qiao, Yajun Yang, and Jing Guo
- Subjects
0301 basic medicine ,Adult ,Male ,China ,Han chinese ,Adolescent ,Genotype ,Biometrics ,Population ,Genome-wide association study ,Single-nucleotide polymorphism ,Biology ,Polymorphism, Single Nucleotide ,Genome ,White People ,03 medical and health sciences ,Young Adult ,Asian People ,Genetics ,Humans ,Prospective Studies ,education ,Molecular Biology ,Genetic association ,education.field_of_study ,Genetic Variation ,Cadherins ,Protocadherins ,Quantitative model ,Europe ,030104 developmental biology ,Geography ,Evolutionary biology ,Face ,Face (geometry) ,Related research ,Female ,Collagen ,Protein Tyrosine Phosphatases ,Cartography ,Genome-Wide Association Study - Abstract
It is a long-standing question as to which genes define the characteristic facial features among different ethnic groups. In this study, we use Uyghurs, an ancient admixed population, to query the genetic bases of why Europeans and Han Chinese look different. Facial traits were analyzed based on high-density 3D facial images; numerous biometric spaces were examined for divergent facial features between European and Han Chinese, ranging from inter-landmark distances to dense shape geometrics. Genome-wide association analyses were conducted on a discovery panel of Uyghurs. Six significant loci were identified, four of which, rs1868752, rs118078182, rs60159418 at or near UBASH3B, COL23A1, PCDH7 and rs17868256, were replicated in independent cohorts of Uyghurs or Southern Han Chinese. A quantitative model was developed to predict 3D faces based on 277 top GWAS SNPs. In hypothetical forensic scenarios, this model was found to significantly enhance the verification rate, suggesting a practical potential of related research.
- Published
- 2016
32. A systematic characterization of genes underlying both complex and Mendelian diseases
- Author
-
Haiyi Lou, Pengfei Qin, Li Jin, Wenfei Jin, and Shuhua Xu
- Subjects
Gene Dosage ,Genome-wide association study ,Biology ,Gene dosage ,Evolution, Molecular ,symbols.namesake ,Gene mapping ,Protein Interaction Mapping ,Genetics ,Humans ,Disease ,Copy-number variation ,Selection, Genetic ,Molecular Biology ,Gene ,Genetics (clinical) ,Genetic Diseases, Inborn ,Proteins ,General Medicine ,Phenotype ,Housekeeping gene ,Genes ,Mendelian inheritance ,symbols - Abstract
Traditionally, genetic disorders have been classified as either Mendelian diseases or complex diseases. This nosology has greatly benefited genetic counseling and the development of gene mapping strategies. However, based on two well-established databases, we identified that 54% (524 of 968) of the Mendelian disease genes were also involved in complex diseases, and genes of this kind have not been systematically analyzed. Here, we classified human genes into five categories: Mendelian and complex disease (MC) genes, Mendelian but not complex disease (MNC) genes, complex but not Mendelian disease (CNM) genes, essential genes and OTHER genes. First, we found that MC genes were associated with more diseases and phenotypes, and were involved in more complex protein-protein interaction networks than MNC or CNM genes on average. Secondly, MC genes encoded the longest proteins and had the highest transcript count among all gene categories. In particular, tissue specificity of MC genes was much higher than that of any other gene category (P < 7.5 × 10⁻⁵), although their expression level was similar to that of essential genes. Thirdly, evidence from different aspects supported that MC genes have been subjected to both purifying and positive selection. Interestingly, functions of some human disease genes might be different from those of their orthologous genes in non-primate mammals since they were even less conserved than OTHER genes. The significant over-representation of copy number variations (CNVs) in CNM genes suggested the important roles of CNVs in complex diseases. In brief, our study not only revealed the characteristics of MC genes, but also provided new insights into the other four gene categories.
- Published
- 2011
33. A Genome-Wide Search for Signals of High-Altitude Adaptation in Tibetans
- Author
-
Yajun Yang, Jingze Tan, Ling Yang, Shuhua Xu, Hongyan Wang, Yiping Shen, Bai-Lin Wu, Wenfei Jin, Shilin Li, Jiucun Wang, Xuedong Pan, Haiyi Lou, and Li Jin
- Subjects
Linkage disequilibrium ,Population ,Procollagen-Proline Dioxygenase ,Altitude Sickness ,Tibet ,Genome ,Hypoxia-Inducible Factor-Proline Dioxygenases ,Asian People ,Basic Helix-Loop-Helix Transcription Factors ,Genetics ,Humans ,education ,Molecular Biology ,Gene ,Ecology, Evolution, Behavior and Systematics ,education.field_of_study ,Natural selection ,biology ,Altitude ,Haplotype ,Haplotypes ,biology.protein ,Adaptation ,Genome-Wide Association Study ,EGLN1 - Abstract
Genetic studies of Tibetans, an ethnic group with a long-lasting presence on the Tibetan Plateau which is known as the highest plateau in the world, may offer a unique opportunity to understand the biological adaptations of human beings to high-altitude environments. We conducted a genome-wide study of 1,000,000 genetic variants in 46 Tibetans (TBN) and 92 Han Chinese (HAN) for identifying the signals of high-altitude adaptations (HAAs) in Tibetan genomes. We discovered the most differentiated variants between TBN and HAN at chromosome 1q42.2 and 2p21. EGLN1 (or HIFPH2, MIM 606425) and EPAS1 (or HIF2A, MIM 603349), both related to hypoxia-inducible factor, were found most differentiated in the two regions, respectively. Strong positive correlations were also observed between the frequency of TBN-dominant haplotypes in the two gene regions and altitude in East Asian populations. Linkage disequilibrium and further haplotype network analyses of world-wide populations suggested the antiquity of the TBN-dominant haplotypes and long-term persistence of the natural selection. Finally, a “dominant haplotype carrier” hypothesis could describe the role of the two genes in HAA. All of our population genomic and statistical analyses indicate that EPAS1 and EGLN1 are most likely responsible for HAA of Tibetans. Interestingly, one each but not both of the two genes were also identified by three recent studies. We reanalyzed the available data and found the escaped top signal (EPAS1) could be recaptured with data quality control and our approaches. Based on this experience, we call for more attention to be paid to controlling data quality and batch effects introduced in public data integration. Our results also suggest limitations of the extended haplotype homozygosity-based method due to its compromised power when natural selection began long ago, particularly in genomic regions with recombination hotspots.
- Published
- 2010
34. Quantitating and Dating Recent Gene Flow between European and East Asian Populations
- Author
-
Shuhua Xu, Li Jin, Yeun-Jun Chung, Pengfei Qin, Dongsheng Lu, Haiyi Lou, Yuchen Wang, Ying Zhou, and Xiong Yang
- Subjects
Gene Flow ,Linkage disequilibrium ,education.field_of_study ,Multidisciplinary ,Models, Statistical ,Time Factors ,Models, Genetic ,Human migration ,business.industry ,Central asia ,Population ,Article ,White People ,Gene flow ,Evolution, Molecular ,Geography ,Genetics, Population ,Asian People ,Evolutionary biology ,Genetic structure ,Humans ,East Asia ,business ,education ,Historical record - Abstract
Historical records indicate that extensive cultural, commercial and technological interaction occurred between European and Asian populations. What have been the biological consequences of these contacts in terms of gene flow? We systematically estimated gene flow between Eurasian groups using genome-wide polymorphisms from 34 populations representing Europeans, East Asians and Central/South Asians. We identified recent gene flow between Europeans and Asians in most populations we studied, including East Asians and Northwestern Europeans, which are normally considered to be non-admixed populations. In addition we quantitatively estimated the extent of this gene flow using two statistical approaches and dated admixture events based on admixture linkage disequilibrium. Our results indicate that most genetic admixtures occurred between 2,400 and 310 years ago and show the admixture proportions to be highly correlated with geographic locations, with the highest admixture proportions observed in Central Asia and the lowest in East Asia and Northwestern Europe. Interestingly, we observed a North-to-South decline of European gene flow in East Asians, suggesting a northern path of European gene flow diffusing into East Asian populations. Our findings contribute to an improved understanding of the history of human migration and the evolutionary mechanisms that have shaped the genetic structure of populations in Eurasia.
- Published
- 2015
35. Genetic architectures of ADME genes in five Eurasian admixed populations and implications for drug safety and efficacy
- Author
-
Xiong Yang, Manshu Song, Shuhua Xu, Dolikun Mamatyusupu, Dongsheng Lu, Li Jin, Shilin Li, Xinwei Pan, Jing Li, Wenjun Yang, and Haiyi Lou
- Subjects
Genetics ,Gene Flow ,Genetic diversity ,Dose-Response Relationship, Drug ,Haplotype ,Genetic Variation ,Single-nucleotide polymorphism ,Biology ,Population stratification ,Polymorphism, Single Nucleotide ,White People ,Gene flow ,Genetics, Population ,Asian People ,Gene Frequency ,Haplotypes ,Ethnicity ,Humans ,Pharmacokinetics ,Allele frequency ,Genetics (clinical) ,ADME ,SNP array - Abstract
Background Drug absorption, distribution, metabolism and excretion (ADME) contribute to the high heterogeneity of drug responses in humans. However, the same standard for drug dosage has been applied to all populations in China although genetic differences in ADME genes are expected to exist in different ethnic groups. In particular, the ethnic minorities in northwestern China with substantial ancestry contribution from Western Eurasian people might violate such a single unified standard. Methods In this study, we used Affymetrix SNP Array 6.0 to investigate the genetic diversity of 282 ADME genes in five northwestern Chinese minority populations, namely, Tajik, Uyghur, Kazakh, Kirgiz and Hui, and attempted to identify the highly differential SNPs and haplotypes and further explore their clinical implications. Results We found that genetic diversity of many ADME genes in the five minority groups was substantially different from those in the Han Chinese population. For instance, we identified 10 functional SNPs with substantial allele frequency differences, 14 functional SNPs with highly different heterozygous states and eight genes with significant haplotype differences between these admixed minority populations and the Han Chinese population. We further confirmed that these differences mainly resulted from the European gene flow, that is, this gene flow increased the genetic diversity in the admixed populations. Conclusions These results suggest that the ADME genes vary substantially among different Chinese ethnic groups. We suggest it could cause potential clinical risk if the same dosage of substances (eg, antitumour drugs) is used without considering population stratification.
- Published
- 2014
36. Genome-wide comparison of allele-specific gene expression between African and European populations.
- Author
-
Lei Tian, Asifullah Khan, Zhilin Ning, Kai Yuan, Chao Zhang, Haiyi Lou, and Yuan Yuan
- Published
- 2018
37. Genomic Dissection of Population Substructure of Han Chinese and Its Implication in Association Studies
- Author
-
Yajun Yang, Xuejun Zhang, Jingze Tan, Yi Wang, Xuedong Pan, Xiaoli Chen, Shilin Li, Xianyong Yin, Xiaohong Gong, Yiping Shen, Jiucun Wang, Bai-Lin Wu, Shuhua Xu, Wenfei Jin, Yu An, Ji Qian, Ling Yang, Yangfei Sun, Xin Zhang, Hongyan Wang, Yungang He, Haiyi Lou, Wenqing Fu, and Li Jin
- Subjects
Fatty Acid Desaturases ,China ,RNA, Untranslated ,Heart Diseases ,Population ,Genome-wide association study ,Single-nucleotide polymorphism ,Biology ,Polymorphism, Single Nucleotide ,Article ,Major Histocompatibility Complex ,Asian People ,Polymorphism (computer science) ,Genetics ,Ethnicity ,Humans ,Psoriasis ,Genetics(clinical) ,False Positive Reactions ,education ,Genetics (clinical) ,Genetic association ,Oligonucleotide Array Sequence Analysis ,education.field_of_study ,Principal Component Analysis ,HCP5 ,Arthritis, Psoriatic ,Genetic Variation ,Genetics, Population ,Sample size determination ,RNA, Long Noncoding - Abstract
To date, most genome-wide association studies (GWAS) and studies of fine-scale population structure have been conducted primarily on Europeans. Han Chinese, the largest ethnic group in the world, composing 20% of the entire global human population, is largely underrepresented in such studies. A well-recognized challenge is the fact that population structure can cause spurious associations in GWAS. In this study, we examined population substructures in a diverse set of over 1700 Han Chinese samples collected from 26 regions across China, each genotyped at approximately 160K single-nucleotide polymorphisms (SNPs). Our results showed that the Han Chinese population is intricately substructured, with the main observed clusters corresponding roughly to northern Han, central Han, and southern Han. However, simulated case-control studies showed that genetic differentiation among these clusters, although very small (F(ST) = 0.0002 to 0.0009), is sufficient to lead to an inflated rate of false-positive results even when the sample size is moderate. The top two SNPs with the greatest frequency differences between the northern Han and southern Han clusters (F(ST) > 0.06) were found in the FADS2 gene, which associates with the fatty acid composition in phospholipids, and in the HLA complex P5 gene (HCP5), which associates with HIV infection, psoriasis, and psoriatic arthritis. Ingenuity Pathway Analysis (IPA) showed that most differentiated genes among clusters are involved in cardiac arteriopathy (p < 10^-101). These signals indicating significant differences among Han Chinese subpopulations should be carefully explained in case they are also detected in association studies, especially when sample sources are diverse.
- Published
- 2009
38. Genetic History of Xinjiang's Uyghurs Suggests Bronze Age Multiple-Way Contacts in Eurasia.
- Author
-
Qidi Feng, Yan Lu, Xumin Ni, Kai Yuan, Yajun Yang, Xiong Yang, Chang Liu, Haiyi Lou, Zhilin Ning, Yuchen Wang, Dongsheng Lu, Chao Zhang, Ying Zhou, Meng Shi, Lei Tian, Xiaoji Wang, Xi Zhang, Jing Li, Asifullah Khan, and Yaqun Guan
- Abstract
The Uyghur people residing in Xinjiang, a territory located in the far west of China and crossed by the Silk Road, are a key ethnic group for understanding the history of human dispersion in Eurasia. Here we assessed the genetic structure and ancestry of 951 Xinjiang's Uyghurs (XJU) representing 14 geographical subpopulations. We observed a southwest and northeast differentiation within XJU, which was likely shaped jointly by the Tianshan Mountains, which run from east to west as a natural barrier, and gene flow from both east and west directions. In XJU, we identified four major ancestral components that were potentially derived from two earlier admixed groups: one from the West, harboring European (25-37%) and South Asian ancestries (12-20%), and the other from the East, with Siberian (15-17%) and East Asian (29-47%) ancestries. By using a newly developed method, MultiWaver, the complex admixture history of XJU was modeled as a two-wave admixture. An ancient wave was dated back to ~3,750 years ago (ya), which is much earlier than that estimated by previous studies, but fits within the range of dating of mummies that exhibited European features that were discovered in the Tarim basin, which is situated in southern Xinjiang (4,000-2,000 ya); a more recent wave occurred around 750 ya, which is in agreement with the estimate from a recent study using other methods. We unveiled a more complex scenario of ancestral origins and admixture history in XJU than previously reported, which further suggests Bronze Age massive migrations in Eurasia and East-West contacts across the Silk Road. [ABSTRACT FROM AUTHOR]
- Published
- 2017
39. Assessing genome-wide copy number variation in the Han Chinese population.
- Author
-
Jianqi Lu, Haiyi Lou, Ruiqing Fu, Dongsheng Lu, Feng Zhang, Zhendong Wu, Xi Zhang, Changhua Li, Baijun Fang, Fangfang Pu, Jingning Wei, Qian Wei, Chao Zhang, Xiaoji Wang, Yan Lu, Shi Yan, Yajun Yang, Li Jin, and Shuhua Xu
- Abstract
Background Copy number variation (CNV) is a valuable source of genetic diversity in the human genome and a well-recognised cause of various genetic diseases. However, CNVs have been considerably underrepresented in population-based studies, particularly the Han Chinese which is the largest ethnic group in the world. Objectives To build a representative CNV map for the Han Chinese population. Methods We conducted a genome-wide CNV study involving 451 male Han Chinese samples from 11 geographical regions encompassing 28 dialect groups, representing a less-biased panel compared with the currently available data. We detected CNVs by using 4.2M NimbleGen comparative genomic hybridisation array and whole-genome deep sequencing of 51 samples to optimise the filtering conditions in CNV discovery. Results A comprehensive Han Chinese CNV map was built based on a set of high-quality variants (positive predictive value >0.8, with sizes ranging from 369 bp to 4.16 Mb and a median of 5907 bp). The map consists of 4012 CNV regions (CNVRs), and more than half are novel to the 30 East Asian CNV Project and the 1000 Genomes Project Phase 3. We further identified 81 CNVRs specific to regional groups, which was indicative of the subpopulation structure within the Han Chinese population. Conclusions Our data are complementary to public data sources, and the CNV map may facilitate in the identification of pathogenic CNVs and further biomedical research studies involving the Han Chinese population. [ABSTRACT FROM AUTHOR]
- Published
- 2017
40. Genetic architectures of ADME genes in five Eurasian admixed populations and implications for drug safety and efficacy.
- Author
-
Jing Li, Haiyi Lou, Xiong Yang, Dongsheng Lu, Shilin Li, Li Jin, Xinwei Pan, Wenjun Yang, Manshu Song, Dolikun Mamatyusupu, and Shuhua Xu
- Subjects
DRUG efficacy ,DRUG dosage ,ETHNIC groups ,HAPLOTYPES ,GENE flow - Abstract
Background Drug absorption, distribution, metabolism and excretion (ADME) contribute to the high heterogeneity of drug responses in humans. However, the same standard for drug dosage has been applied to all populations in China although genetic differences in ADME genes are expected to exist in different ethnic groups. In particular, the ethnic minorities in northwestern China with substantial ancestry contribution from Western Eurasian people might violate such a single unified standard. Methods In this study, we used Affymetrix SNP Array 6.0 to investigate the genetic diversity of 282 ADME genes in five northwestern Chinese minority populations, namely, Tajik, Uyghur, Kazakh, Kirgiz and Hui, and attempted to identify the highly differential SNPs and haplotypes and further explore their clinical implications. Results We found that genetic diversity of many ADME genes in the five minority groups was substantially different from those in the Han Chinese population. For instance, we identified 10 functional SNPs with substantial allele frequency differences, 14 functional SNPs with highly different heterozygous states and eight genes with significant haplotype differences between these admixed minority populations and the Han Chinese population. We further confirmed that these differences mainly resulted from the European gene flow, that is, this gene flow increased the genetic diversity in the admixed populations. Conclusions These results suggest that the ADME genes vary substantially among different Chinese ethnic groups. We suggest it could cause potential clinical risk if the same dosage of substances (eg, antitumour drugs) is used without considering population stratification. [ABSTRACT FROM AUTHOR]
- Published
- 2014
41. Identification of well-differentiated gene expressions between Han Chinese and Japanese using genomewide microarray data analysis.
- Author
-
Yuan Yuan, Ling Yang, Meng Shi, Dongsheng Lu, Haiyi Lou, Yi-Ping Phoebe Chen, Li Jin, and Shuhua Xu
- Subjects
GENE expression ,DNA microarrays ,HUMAN phenotype ,SPLICEOSOMES ,MESSENGER RNA ,RNA splicing - Abstract
Background Investigating variations in gene expression, which can be quantitatively measured on a genome-wide scale, is essential to understand and interpret phenotypic differences among human populations. Several previous studies have examined and compared variations in gene expression between continental populations. However, differences in gene expression variation between closely related populations have not been studied yet. Method We performed a genome-wide analysis and systematically compared expression profiles of Han Chinese with those of the Japanese population. Results We identified 768 genes (4.4% of 17 354 expressed genes) which were expressed differentially between the two populations, with 165 showing highly differential expression and enriched in genes involved in the spliceosome pathway, mRNA processing, mRNA metabolic process, RNA processing, RNA splicing and mitochondrial transport. We further identified cis- and trans-variants that regulated these differential gene expressions, and found that cis-variants shared in the two populations were centred within a range of 200 kb around transcription start site. Our analysis indicated that genetic differences in the cis-associated genes between the two populations could explain 7-43% of the identified expression divergence. Conclusions In summary, despite considerable heterogeneity, gene expression profiles between Han Chinese and Japanese did show an overall difference, with well-differentiated expressions regulated by genetic variants which have been reported associated with hematological and biochemical traits in Japanese populations. Our results supported that gene expression is regulated by genetic variants and there is a genetic basis for the phenotypic differences between Han Chinese and Japanese populations. [ABSTRACT FROM AUTHOR]
- Published
- 2013
42. A map of copy number variations in Chinese populations
- Author
-
Shilin Li, Bai-Lin Wu, Xin Zhang, Wenfei Jin, Yajun Yang, Li Jin, Longli Kang, Shuhua Xu, and Haiyi Lou
- Subjects
Linkage disequilibrium ,DNA Copy Number Variations ,Science ,Population ,Population genetics ,Single-nucleotide polymorphism ,Biology ,Asian People ,Genetics ,Ethnicity ,Humans ,Copy-number variation ,International HapMap Project ,education ,Genome Evolution ,Evolutionary Biology ,education.field_of_study ,Multidisciplinary ,Population Biology ,Genome, Human ,Computational Biology ,Genomic Evolution ,Human Genetics ,Genomics ,Chinese people ,Genetics, Population ,Medicine ,Human genome ,Population Genetics ,Research Article - Abstract
It has been shown that the human genome contains extensive copy number variations (CNVs). Investigating the medical and evolutionary impacts of CNVs requires knowledge of their locations, sizes and frequency distributions within and between populations. However, CNV study of Chinese minorities, which harbor the majority of genetic diversity of Chinese populations, has been underrepresented compared with the efforts in other populations. Here we constructed, to our knowledge, the first CNV map in seven Chinese populations representing the major linguistic groups in China, with 1,440 CNV regions identified using Affymetrix SNP 6.0 Array. Considerable differences in distributions of CNV regions between populations and substantial population structures were observed. We showed that ∼35% of CNV regions identified in minority ethnic groups are not shared by the Han Chinese population, indicating that the contribution of the minorities to the genetic architecture of the Chinese population cannot be ignored. We further identified highly differentiated CNV regions between populations. For example, a common deletion in Dong and Zhuang (44.4% and 50%), which overlaps two keratin-associated protein genes contributing to the structure of hair fibers, was not observed in Han Chinese. Interestingly, the most differentiated CNV deletion between HapMap CEU and YRI, containing the CCL3L1 gene and reported in previous studies, was also the most highly differentiated region between Tibetan and other populations. In addition, by jointly analyzing CNVs and SNPs, we found that a CNV region containing the gene CTDSPL was in almost perfect linkage disequilibrium with flanking SNPs in Tibetans but not in other populations except HapMap CHD. Furthermore, we found that the SNP taggability of CNVs in Chinese populations was much lower than that in European populations. 
Our results suggest the necessity of a full characterization of CNVs in Chinese populations, and the CNV map we constructed serves as a useful resource in further evolutionary and medical studies.