74 results on '"Fuli, Yu"'
Search Results
2. A multi‐class COVID‐19 segmentation network with pyramid attention and edge loss in CT images
- Author
Fuli Yu, Yu Zhu, Xiangxiang Qin, Ying Xin, Dawei Yang, and Tao Xu
- Subjects
X‐rays and particle beams (medical uses) ,Patient diagnostic methods and instrumentation ,Optical, image and video signal processing ,Image recognition ,X‐ray techniques: radiography and computed tomography (biomedical imaging/measurement) ,Computer vision and image processing techniques ,Photography ,TR1-1050 ,Computer software ,QA76.75-76.765
- Abstract
At the end of 2019, a novel coronavirus COVID‐19 broke out. Due to its high contagiousness, more than 74 million people have been infected worldwide. Automatic segmentation of the COVID‐19 lesion area in CT images is an effective auxiliary medical technology which can quantitatively diagnose and judge the severity of the disease. In this paper, a multi‐class COVID‐19 CT image segmentation network is proposed, which includes a pyramid attention module to extract multi‐scale contextual attention information, and a residual convolution module to improve the discriminative ability of the network. A wavelet edge loss function is also proposed to extract edge features of the lesion area to improve the segmentation accuracy. For the experiment, a dataset of 4369 CT slices is constructed, including three symptoms: ground glass opacities, interstitial infiltrates, and lung consolidation. The model achieves Dice similarity coefficients of 0.7704, 0.7900, and 0.8241 for the three symptoms, respectively. The performance of the proposed network on the public dataset COVID‐SemiSeg is also evaluated. The results demonstrate that this model outperforms other state‐of‐the‐art methods and can be a powerful tool to assist in the diagnosis of positive infection cases, and promote the development of intelligent technology in the medical field.
- Published
- 2021
3. MSD-Net: Multi-Scale Discriminative Network for COVID-19 Lung Infection Segmentation on CT
- Author
Bingbing Zheng, Yaoqi Liu, Yu Zhu, Fuli Yu, Tianjiao Jiang, Dawei Yang, and Tao Xu
- Subjects
COVID-19 ,CT ,deep learning ,MSD segmentation network ,Electrical engineering. Electronics. Nuclear engineering ,TK1-9971
- Abstract
Since the first patient was reported in December 2019, 2019 novel coronavirus disease (COVID-19) has become a global pandemic with more than 10 million total confirmed cases and 500 thousand related deaths. Using deep learning methods to quickly identify COVID-19 and accurately segment the infected area can help control the outbreak and assist in treatment. Computed tomography (CT), as a fast and easy clinical method, is suitable for assisting in diagnosis and treatment of COVID-19. According to clinical manifestations, COVID-19 lung infection areas can be divided into three categories: ground-glass opacities, interstitial infiltrates and consolidation. We proposed a multi-scale discriminative network (MSD-Net) for multi-class segmentation of COVID-19 lung infection on CT. In the MSD-Net, we proposed a pyramid convolution block (PCB), a channel attention block (CAB) and a residual refinement block (RRB). The PCB can increase the receptive field by using different numbers and different sizes of kernels, which strengthened the ability to segment infected areas of different sizes. The CAB was used to fuse the input of the two stages and focus features on the area to be segmented. The role of the RRB was to refine the feature maps. Experimental results showed that the dice similarity coefficients (DSC) of the three infection categories were 0.7422, 0.7384, and 0.8769, respectively. For sensitivity and specificity, the results of the three infection categories were (0.8593, 0.9742), (0.8268, 0.9869) and (0.8645, 0.9889), respectively. The experimental results demonstrated that the network proposed in this paper can effectively segment the COVID-19 infection on CT images. It can be adopted for assisting in diagnosis and treatment of COVID-19.
- Published
- 2020
4. REDBot: Natural language process methods for clinical copy number variation reporting in prenatal and products of conception diagnosis
- Author
Mengmeng Liu, Yunshan Zhong, Hongqian Liu, Desheng Liang, Erhong Liu, Yu Zhang, Feng Tian, Qiaowei Liang, David S. Cram, Hua Wang, Lingqian Wu, and Fuli Yu
- Subjects
Genetics ,QH426-470
- Abstract
Background: Current copy number variation (CNV) identification methods have rapidly become mature. However, the postdetection processes such as variant interpretation or reporting are inefficient. To overcome this situation, we developed REDBot as an automated software package for accurate and direct generation of clinical diagnostic reports for prenatal and products of conception (POC) samples. Methods: We applied natural language process (NLP) methods for analyzing 30,235 in‐house historical clinical reports through active learning, and then developed clinical knowledge bases, evidence‐based interpretation methods and reporting criteria to support the whole postdetection pipeline. Results: Of the 30,235 reports, we obtained 37,175 CNV‐paragraph pairs. For these pairs, the active learning approaches achieved a 0.9466 average F1‐score in sentence classification. The overall accuracy for variant classification was 95.7%, 95.2%, and 100.0% in retrospective, prospective, and clinical utility experiments, respectively. Conclusion: By integrating NLP methods into the CNV postdetection pipeline, REDBot is a robust and rapid tool with clinical utility for prenatal and POC diagnosis.
- Published
- 2020
5. Extremely low-coverage whole genome sequencing in South Asians captures population genomics information
- Author
Navin Rustagi, Anbo Zhou, W. Scott Watkins, Erika Gedvilaite, Shuoguo Wang, Naveen Ramesh, Donna Muzny, Richard A. Gibbs, Lynn B. Jorde, Fuli Yu, and Jinchuan Xing
- Subjects
Single nucleotide variant ,Whole genome sequencing ,South Asian ,Extremely low coverage ,Population structure ,Imputation ,Biotechnology ,TP248.13-248.65 ,Genetics ,QH426-470
- Abstract
Background: The cost of Whole Genome Sequencing (WGS) has decreased tremendously in recent years due to advances in next-generation sequencing technologies. Nevertheless, the cost of carrying out large-scale cohort studies using WGS is still daunting. Past simulation studies with coverage at ~2x have shown promise for using low coverage WGS in studies focused on variant discovery, association study replications, and population genomics characterization. However, the performance of low coverage WGS in populations with a complex history and no reference panel remains to be determined. Results: South Indian populations are known to have a complex population structure and are an example of a major population group that lacks adequate reference panels. To test the performance of extremely low-coverage WGS (EXL-WGS) in populations with a complex history and to provide a reference resource for South Indian populations, we performed EXL-WGS on 185 South Indian individuals from eight populations to ~1.6x coverage. Using two variant discovery pipelines, SNPTools and GATK, we generated a consensus call set that has ~90% sensitivity for identifying common variants (minor allele frequency ≥ 10%). Imputation further improves the sensitivity of our call set. In addition, we obtained high coverage of the whole mitochondrial genome to infer the maternal lineage evolutionary history of the Indian samples. Conclusions: Overall, we demonstrate that EXL-WGS with imputation can be a valuable study design for variant discovery with a dramatically lower cost than standard WGS, even in populations with a complex history and without available reference data. In addition, the South Indian EXL-WGS data generated in this study will provide a valuable resource for future Indian genomic studies.
- Published
- 2017
6. Reduced meiotic recombination in rhesus macaques and the origin of the human recombination landscape.
- Author
Cheng Xue, Navin Rustagi, Xiaoming Liu, Muthuswamy Raveendran, R Alan Harris, Manjunath Gorentla Venkata, Jeffrey Rogers, and Fuli Yu
- Subjects
Medicine ,Science
- Abstract
Characterizing meiotic recombination rates across the genomes of nonhuman primates is important for understanding the genetics of primate populations, performing genetic analyses of phenotypic variation and reconstructing the evolution of human recombination. Rhesus macaques (Macaca mulatta) are the most widely used nonhuman primates in biomedical research. We constructed a high-resolution genetic map of the rhesus genome based on whole genome sequence data from Indian-origin rhesus macaques. The genetic markers used were approximately 18 million SNPs, with marker density 6.93 per kb across the autosomes. We report that the genome-wide recombination rate in rhesus macaques is significantly lower than rates observed in apes or humans, while the distribution of recombination across the macaque genome is more uniform. These observations provide new comparative information regarding the evolution of recombination in primates.
- Published
- 2020
7. Association of Single Nucleotide Polymorphisms in the ST3GAL4 Gene with VWF Antigen and Factor VIII Activity.
- Author
Jaewoo Song, Cheng Xue, John S Preisser, Drake W Cramer, Katie L Houck, Guo Liu, Aaron R Folsom, David Couper, Fuli Yu, and Jing-Fei Dong
- Subjects
Medicine ,Science
- Abstract
VWF is extensively glycosylated with biantennary core fucosylated glycans. Most N-linked and O-linked glycans on VWF are sialylated. FVIII is also glycosylated, with a glycan structure similar to that of VWF. ST3GAL sialyltransferases catalyze the transfer of sialic acids in the α2,3 linkage to termini of N- and O-glycans. This sialic acid modification is critical for VWF synthesis and activity. We analyzed genetic and phenotypic data from the Atherosclerosis Risk in Communities (ARIC) study for the association of single nucleotide polymorphisms (SNPs) in the ST3GAL4 gene with plasma VWF levels and FVIII activity in 12,117 subjects. We also analyzed ST3GAL4 SNPs found in 2,535 subjects of 26 ethnicities from the 1000 Genomes (1000G) project for ethnic diversity, SNP imputation, and ST3GAL4 haplotypes. We identified 14 and 1,714 ST3GAL4 variants in the ARIC GWAS and 1000G databases respectively, with 46% being ethnically diverse in their allele frequencies. Among the 14 ST3GAL4 SNPs found in ARIC GWAS, the intronic rs2186717, rs7928391, and rs11220465 were associated with VWF levels and with FVIII activity after adjustment for age, BMI, hypertension, diabetes, ever-smoking status, and ABO. This study illustrates the power of next-generation sequencing in the discovery of new genetic variants and a significant ethnic diversity in the ST3GAL4 gene. We discuss potential mechanisms through which these intronic SNPs regulate ST3GAL4 biosynthesis and the activity that affects VWF and FVIII.
- Published
- 2016
8. A coupled model of electromagnetic and heat on nanosecond-laser ablation of impurity-containing aluminum alloy
- Author
Jiaxuan Chen, Jiaheng Yin, Yongda Yan, Yongzhi Cao, Fuli Yu, and Lihua Lu
- Subjects
Materials science ,Field (physics) ,General Chemical Engineering ,Alloy ,chemistry.chemical_element ,02 engineering and technology ,engineering.material ,01 natural sciences ,Electromagnetic radiation ,Condensed Matter::Materials Science ,Physics::Plasma Physics ,Impurity ,Aluminium ,0103 physical sciences ,Inertial confinement fusion ,010302 applied physics ,business.industry ,Finite-difference time-domain method ,General Chemistry ,021001 nanoscience & nanotechnology ,chemistry ,engineering ,Optoelectronics ,0210 nano-technology ,business ,Joule heating
- Abstract
In the emerging field of laser-driven inertial confinement fusion, Joule heating generated via electromagnetic heating of the metal frame is a critical issue. However, there are few reported models explaining thermal damage to the aluminum alloy. The aim of this study was to build a coupled model for electromagnetic radiation and heat conversion of an ultrashort laser pulse on an aluminum alloy based on Ohm's law. Additionally, the application of SiO2 films on the aluminum alloy to improve the laser-induced damage threshold (LIDT) was simulated, and the effects of metal impurities in the aluminum alloy were analyzed. A model examining the relation between electromagnetic radiation and heat for a nanosecond laser irradiating an aluminum alloy was developed using a coupled model equation. The results obtained using the finite difference time domain (FDTD) algorithm can provide a theoretical basis for future improvement of the aluminum alloy LIDT.
- Published
- 2020
9. Population genomic analysis of 962 whole genome sequences of humans reveals natural selection in non-coding regions.
- Author
Fuli Yu, Jian Lu, Xiaoming Liu, Elodie Gazave, Diana Chang, Srilakshmi Raj, Haley Hunter-Zinck, Ran Blekhman, Leonardo Arbiza, Cris Van Hout, Alanna Morrison, Andrew D Johnson, Joshua Bis, L Adrienne Cupples, Bruce M Psaty, Donna Muzny, Jin Yu, Richard A Gibbs, Alon Keinan, Andrew G Clark, and Eric Boerwinkle
- Subjects
Medicine ,Science
- Abstract
Whole genome analysis in large samples from a single population is needed to provide adequate power to assess relative strengths of natural selection across different functional components of the genome. In this study, we analyzed next-generation sequencing data from 962 European Americans, and found that as expected approximately 60% of the top 1% of positive selection signals lie in intergenic regions, 33% in intronic regions, and slightly over 1% in coding regions. Several detailed functional annotation categories in intergenic regions showed statistically significant enrichment in positively selected loci when compared to the null distribution of the genomic span of ENCODE categories. There was a significant enrichment of purifying selection signals detected in enhancers, transcription factor binding sites, microRNAs and target sites, but not on lincRNA or piRNAs, suggesting different evolutionary constraints for these domains. Loci in "repressed or low activity regions" and loci near or overlapping the transcription start site were the most significantly over-represented annotations among the top 1% of signals for positive selection.
- Published
- 2015
10. Clinical utility of noninvasive prenatal screening for expanded chromosome disease syndromes
- Author
Huaiyu Sun, Hu Tan, Mengnan Xu, David S. Cram, Feng Tian, Yu Zhang, Yingdi Liu, Lingqian Wu, Hua Wang, Fuli Yu, Hongmin Zhu, Desheng Liang, and Siyuan Linpeng
- Subjects
Adult ,medicine.medical_specialty ,Singleton pregnancy ,Adolescent ,DNA Copy Number Variations ,Cri du chat ,Noninvasive Prenatal Testing ,Aneuploidy ,Chromosome Disorders ,Trisomy ,Disease ,Young Adult ,Pregnancy ,Risk Factors ,Prenatal Diagnosis ,DiGeorge syndrome ,medicine ,Humans ,Sex Chromosome Aberrations ,Genetics (clinical) ,Chromosome Aberrations ,Fetus ,business.industry ,Obstetrics ,Chromosome ,Middle Aged ,medicine.disease ,Prenatal screening ,Karyotyping ,Female ,business ,Cell-Free Nucleic Acids
- Abstract
To assess the clinical performance of an expanded noninvasive prenatal screening (NIPS) test (“NIPS-Plus”) for detection of both aneuploidy and genome-wide microdeletion/microduplication syndromes (MMS). A total of 94,085 women with a singleton pregnancy were prospectively enrolled in the study. The cell-free plasma DNA was directly sequenced without intermediate amplification and fetal abnormalities were identified using an improved copy-number variation (CNV) calling algorithm. A total of 1128 pregnancies (1.2%) were scored positive for clinically significant fetal chromosome abnormalities. This comprised 965 aneuploidies (1.026%) and 163 (0.174%) MMS. From follow-up tests, the positive predictive values (PPVs) for T21, T18, T13, rare trisomies, and sex chromosome aneuploidies were calculated as 95%, 82%, 46%, 29%, and 47%, respectively. For known MMS (n = 32), PPVs were 93% (DiGeorge), 68% (22q11.22 microduplication), 75% (Prader–Willi/Angelman), and 50% (Cri du Chat). For the remaining genome-wide MMS (n = 88), combined PPVs were 32% (CNVs ≥10 Mb) and 19% (CNVs
- Published
- 2019
11. Possible race and gender divergence in association of genetic variations with plasma von Willebrand factor: a study of ARIC and 1000 genome cohorts.
- Author
Zhou Zhou, Fuli Yu, Ashley Buchanan, Yuanyuan Fu, Marco Campos, Kenneth K Wu, Lloyd E Chambless, Aaron R Folsom, Eric Boerwinkle, and Jing-fei Dong
- Subjects
Medicine ,Science
- Abstract
The synthesis, secretion and clearance of von Willebrand factor (VWF) are regulated by genetic variations in coding and promoter regions of the VWF gene. We have previously identified 19 single nucleotide polymorphisms (SNPs), primarily in introns that are associated with VWF antigen levels in subjects of European descent. In this study, we conducted race by gender analyses to compare the association of VWF SNPs with VWF antigen among 10,434 healthy Americans of European (EA) or African (AA) descent from the Atherosclerosis Risk in Communities (ARIC) study. Among 75 SNPs analyzed, 13 and 10 SNPs were associated with VWF antigen levels in EA male and EA female subjects, respectively. However, only one SNP (RS1063857) was significantly associated with VWF antigen in AA females and none was in AA males. Haplotype analysis of the ARIC samples and studying racial diversities in the VWF gene from the 1000 genomes database suggest a greater degree of variations in the VWF gene in AA subjects as compared to EA subjects. Together, these data suggest potential race and gender divergence in regulating VWF expression by genetic variations.
- Published
- 2014
12. Positive selection of a pre-expansion CAG repeat of the human SCA2 gene.
- Author
Fuli Yu, Pardis C Sabeti, Paul Hardenbol, Qing Fu, Ben Fry, Xiuhua Lu, Sy Ghose, Richard Vega, Ag Perez, Shiran Pasternak, Suzanne M Leal, Thomas D Willis, David L Nelson, John Belmont, and Richard A Gibbs
- Subjects
Genetics ,QH426-470
- Abstract
A region of approximately one megabase of human Chromosome 12 shows extensive linkage disequilibrium in Utah residents with ancestry from northern and western Europe. This strikingly large linkage disequilibrium block was analyzed with statistical and experimental methods to determine whether natural selection could be implicated in shaping the current genome structure. Extended Haplotype Homozygosity and Relative Extended Haplotype Homozygosity analyses on this region mapped a core region of the strongest conserved haplotype to the exon 1 of the Spinocerebellar ataxia type 2 gene (SCA2). Direct DNA sequencing of this region of the SCA2 gene revealed a significant association between a pre-expanded allele [(CAG)8CAA(CAG)4CAA(CAG)8] of CAG repeats within exon 1 and the selected haplotype of the SCA2 gene. A significantly negative Tajima's D value (-2.20, p < 0.01) on this site consistently suggested selection on the CAG repeat. This region was also investigated in the three other populations, none of which showed signs of selection. These results suggest that a recent positive selection of the pre-expansion SCA2 CAG repeat has occurred in Utah residents with European ancestry.
- Published
- 2005
13. Effect of cathode composition on microstructure and tribological properties of TiBN nanocomposite multilayer coating synthesized by plasma immersion ion implantation and deposition
- Author
Langping Wang, Yongda Yan, Wen-quan Lü, Fuli Yu, Yongzhi Cao, Zhi-wei Gu, and Xiaofeng Wang
- Subjects
010302 applied physics ,Nanocomposite ,Materials science ,Metals and Alloys ,General Engineering ,chemistry.chemical_element ,02 engineering and technology ,engineering.material ,021001 nanoscience & nanotechnology ,Microstructure ,01 natural sciences ,Plasma-immersion ion implantation ,Nanocrystalline material ,Coating ,chemistry ,0103 physical sciences ,engineering ,Composite material ,Fourier transform infrared spectroscopy ,0210 nano-technology ,High-resolution transmission electron microscopy ,Tin
- Abstract
Nanocomposite multilayer TiBN coatings were prepared on Si (100) and 9Cr18Mo substrates using the TiBN composite cathode plasma immersion ion implantation and deposition (PIIID) technique. TiBN composite cathodes were synthesized by powder metallurgy, and the content of hexagonal boron nitride (h-BN) was varied from 8% to 40% (mass fraction). The as-deposited coatings were characterized by energy dispersive spectrometry (EDS), grazing incidence X-ray diffraction (GIXRD), Fourier transform infrared spectroscopy (FTIR) and high-resolution transmission electron microscopy (HRTEM). EDS results show that the B content of the coatings varied from 3.71% to 13.84% (molar fraction) when the h-BN content of the composite cathodes was changed from 8% to 40% (mass fraction). GIXRD results reveal that the TiBN coatings with a B content of 8% have the main diffraction peaks of TiN (200), (220) and (311), and these peaks disappear when the B content is increased. FTIR analysis of the multilayer coatings showed the presence of h-BN in all coatings. TEM images reveal that all coatings have the characteristics of self-forming nanocomposite multilayers, where the nanocomposites are composed of face-centered cubic TiN or h-BN nanocrystallites embedded in an amorphous matrix. The tribological tests reveal that the TiBN coatings exhibit a marked decrease in friction coefficient at room temperature (~0.25). The improved properties were found to derive from the combined effect of the self-forming multilayer structure and the h-BN solid lubrication in the coatings.
- Published
- 2017
14. Characterization of chromosomal abnormalities in pregnancy losses reveals critical genes and loci for human early development
- Author
Amy M. Breman, Sau Wai Cheung, Desheng Liang, Lingqian Wu, Janice L. Smith, Hua Wang, Fuli Yu, Zhilin Ren, Hongmin Zhu, Ankita Patel, David S Cram, Yiyun Chen, Justin Bartanus, and Pawel Stankiewicz
- Subjects
0301 basic medicine ,Candidate gene ,DNA Copy Number Variations ,Embryonic Development ,Chromosome Disorders ,Biology ,Article ,Mice ,03 medical and health sciences ,Pregnancy ,Genetics ,Animals ,Humans ,Gene family ,Hox gene ,Gene ,Zebrafish ,Genetics (clinical) ,Chromosome Aberrations ,Genome, Human ,Microarray analysis techniques ,Microarray Analysis ,3. Good health ,030104 developmental biology ,Zebrafish Model Organism Database ,Homeobox ,Female ,Zebrafish Information Network genome database ,Transcription Factors
- Abstract
Detailed characterization of chromosomal abnormalities, a common cause of congenital abnormalities and pregnancy loss, is critical for elucidating genes for human fetal development. Here, 2186 product of conception (POC) samples were tested for copy number variations (CNVs) at two clinical diagnostic centers using whole genome sequencing and high-resolution chromosomal microarray analysis. We developed a new gene discovery approach to predict potential developmental genes and identified 275 candidate genes from CNVs detected from both datasets. Based on Mouse Genome Informatics (MGI) and the Zebrafish model organism database (ZFIN), 75% of identified genes could lead to developmental defects when mutated. Genes involved in embryonic development, gene transcription and regulation of biological processes were significantly enriched. In particular, transcription factors and gene families sharing specific protein domains predominated, which included known developmental genes such as HOX, NKX homeodomain genes and helix-loop-helix containing HAND2, NEUROG2 and NEUROD1 as well as potential novel developmental genes. We observed that developmental genes were denser in certain chromosomal regions, enabling identification of 31 potential genomic loci with clustered genes associated with development.
- Published
- 2017
15. Practical Approaches for Whole-Genome Sequence Analysis of Heart- and Blood-Related Traits
- Author
Fuli Yu, Josef Coresh, Zhuoyi Huang, Ginger A. Metcalf, Xiaoming Liu, Navin Rustagi, Elena V. Feofanova, Alanna C. Morrison, Christie M. Ballantyne, Bing Yu, Richard A. Gibbs, Donna M. Muzny, and Eric Boerwinkle
- Subjects
0301 basic medicine ,Neutrophils ,Quantitative Trait Loci ,Genomics ,Locus (genetics) ,Genome-wide association study ,Computational biology ,030105 genetics & heredity ,Quantitative trait locus ,Biology ,Polymorphism, Single Nucleotide ,Genome ,Article ,White People ,Hemoglobins ,Leukocyte Count ,03 medical and health sciences ,Gene Frequency ,Troponin T ,Natriuretic Peptide, Brain ,Genetics ,Humans ,Magnesium ,Gene ,Genetics (clinical) ,Genome, Human ,Platelet Count ,Cholesterol, HDL ,Phosphorus ,Cholesterol, LDL ,Introns ,Peptide Fragments ,Genetic architecture ,Black or African American ,C-Reactive Protein ,030104 developmental biology ,Human genome ,Chromosomes, Human, Pair 9 ,Genome-Wide Association Study ,Lipoprotein(a)
- Abstract
Whole-genome sequencing (WGS) allows for a comprehensive view of the sequence of the human genome. We present and apply integrated methodologic steps for interrogating WGS data to characterize the genetic architecture of 10 heart- and blood-related traits in a sample of 1,860 African Americans. In order to evaluate the contribution of regulatory and non-protein coding regions of the genome, we conducted aggregate tests of rare variation across the entire genomic landscape using a sliding window, complemented by an annotation-based assessment of the genome using predefined regulatory elements and within the first intron of all genes. These tests were performed treating all variants equally as well as with individual variants weighted by a measure of predicted functional consequence. Significant findings were assessed in 1,705 individuals of European ancestry. After these steps, we identified and replicated components of the genomic landscape significantly associated with heart- and blood-related traits. For two traits, lipoprotein(a) levels and neutrophil count, aggregate tests of low-frequency and rare variation were significantly associated across multiple motifs. For a third trait, cardiac troponin T, investigation of regulatory domains identified a locus on chromosome 9. These practical approaches for WGS analysis led to the identification of informative genomic regions and also showed that defined non-coding regions, such as first introns of genes and regulatory domains, are associated with important risk factor phenotypes. This study illustrates the tractable nature of WGS data and outlines an approach for characterizing the genetic architecture of complex traits.
- Published
- 2017
16. Residual thermal stress of a mounted KDP crystal after cooling and its effects on second harmonic generation of a high-average-power laser
- Author
Haitao Liu, Fuli Yu, Ruifeng Su, and Yingchun Liang
- Subjects
Materials science ,business.industry ,Solid-state ,Physics::Optics ,Nonlinear optics ,Second-harmonic generation ,02 engineering and technology ,Laser ,01 natural sciences ,Atomic and Molecular Physics, and Optics ,Electronic, Optical and Magnetic Materials ,law.invention ,Power (physics) ,010309 optics ,Crystal ,020210 optoelectronics & photonics ,Optics ,law ,0103 physical sciences ,Thermal ,0202 electrical engineering, electronic engineering, information engineering ,Residual thermal stress ,Electrical and Electronic Engineering ,business
- Abstract
Thermal problems are a major challenge for solid-state lasers designed for high output power. Cooling of the nonlinear optics is insufficient to completely solve the problem of thermally induced stress, because residual thermal stress remains after cooling; to the best of our knowledge, this is proposed here for the first time. In this paper a comprehensive model incorporating principles of thermodynamics, mechanics and optics is proposed, and it is used to study the residual thermal stress of a mounted KDP crystal after the cooling process from a mechanical perspective, along with the effects of the residual thermal stress on the second harmonic generation (SHG) efficiency of a high-average-power laser. The effects of the structural parameters of the mounting configuration of the KDP crystal on the residual thermal stress, as well as on the SHG efficiency, are characterized. The numerical results demonstrate the feasibility of solving the problems of residual thermal stress through the structural design of the mounting configuration.
- Published
- 2017
17. Plastic deformation mechanisms in face-centered cubic materials with low stacking fault energy
- Author
Xuesen Zhao, Fuli Yu, Youfang Cao, Wendi Tu, and Yongda Yan
- Subjects
010302 applied physics ,Equiaxed crystals ,Materials science ,Mechanical Engineering ,Metallurgy ,02 engineering and technology ,Cubic crystal system ,021001 nanoscience & nanotechnology ,Condensed Matter Physics ,01 natural sciences ,Grain size ,Mechanics of Materials ,Stacking-fault energy ,0103 physical sciences ,General Materials Science ,Deformation (engineering) ,Dislocation ,Composite material ,Severe plastic deformation ,0210 nano-technology ,Stacking fault
- Abstract
The microstructural evolution process of face-centered cubic Cu-30 wt%Zn, which has a very low stacking fault (SF) energy of only 14 mJ/m², processed by high-pressure torsion, was investigated using transmission electron microscopy. Results reveal that deformation SFs/twin boundaries and cell blocks play the key role in the grain refinement process from ultrafine grains to nano grains. Equiaxed coarse grains with grain sizes of several microns were refined to ultrafine grains through the formation of a high density of SFs, twins and cell blocks. With the accumulation of a high density of dislocations at SF/twin boundaries, the emission of secondary SFs/twins further refined grains and transformed ultrafine grains into equiaxed grains with grain sizes of several tens of nanometers. The observed grain refinement mechanism is significantly different from those of materials with medium to high SF energies, in which full dislocation activities play a key role in grain refinement.
- Published
- 2016
18. Base-Biased Evolution of Disease-Associated Mutations in the Human Genome
- Author
Cheng Xue, Hua Chen, and Fuli Yu
- Subjects
0301 basic medicine ,Genetics ,education.field_of_study ,Population ,Biology ,Genome ,Genetic load ,03 medical and health sciences ,Fixation (population genetics) ,Negative selection ,030104 developmental biology ,0302 clinical medicine ,Human genome ,Gene conversion ,1000 Genomes Project ,education ,030217 neurology & neurosurgery ,Genetics (clinical)
- Abstract
Understanding the evolution of disease-associated mutations is fundamental to analyzing the pathogenetics of diseases. Mutation, recombination (by GC-biased gene conversion, gBGC), and selection have been known to shape the evolution of disease-associated mutations, but how these evolutionary forces work together is still an open question. In this study, we analyzed several large-scale human datasets (1000 Genomes, ESP6500, ExAC and ClinVar), and found that base-biased mutagenesis generates more GC→AT than AT→GC mutations, while gBGC promotes the fixation of AT→GC mutations to balance the impact of base-biased mutation on the genome. Due to this effect of gBGC, purifying selection removes more deleterious AT→GC mutations than GC→AT mutations from the population, but many high-frequency (fixed and nearly fixed) deleterious AT→GC mutations remain, possibly due to high genetic load. As a special subset, disease-associated mutations follow this evolutionary rule, in which disease-associated GC→AT mutations are more enriched in rare mutations compared with AT→GC, while disease-associated AT→GC mutations are more enriched in mutations with high frequency. Thus, we presented a base-biased evolutionary framework that explains the base-biased generation and accumulation of disease-associated mutations in human populations.
- Published
- 2016
19. Study on the reflectivity of electron beam evaporated gold films on aluminum alloy substrates treated at 60, −20, and 25°C
- Author
-
Jiaheng Yin, Fuli Yu, Yongzhi Cao, Yongda Yan, Lihua Lu, and Jiaxuan Chen
- Subjects
010302 applied physics ,Total internal reflection ,Materials science ,Alloy ,Metals and Alloys ,chemistry.chemical_element ,02 engineering and technology ,Surfaces and Interfaces ,Substrate (electronics) ,Adhesion ,engineering.material ,021001 nanoscience & nanotechnology ,01 natural sciences ,Evaporation (deposition) ,Surfaces, Coatings and Films ,Electronic, Optical and Magnetic Materials ,chemistry ,Coating ,Aluminium ,0103 physical sciences ,Materials Chemistry ,engineering ,Composite material ,0210 nano-technology ,Deposition (law) - Abstract
Aluminum alloys are widely applied in optical turrets after coating and super-finishing. Gold films prepared on aluminum alloy substrates via electron beam (e-beam) evaporation are considered to be an effective way to realize close to total reflection of incident light. Here, we investigated the optimization of the reflectivity parameters of e-beam evaporated gold films; then, the influence of different deposition parameters on the surface quality, adhesive force and reflectivity (incident light in the 650–1700 nm range) of the film at −20, 25 and 60°C was systematically studied. The results demonstrated that the reflectivity and adhesion of the gold films both increased after high temperature holding and decreased slightly after low temperature holding. However, the surface morphology of the gold film did not change substantially. After holding at 60 and −20°C, the adhesive force decreased, which indicated that the adhesion strength between the reflective membrane and the substrate decreased.
- Published
- 2021
20. Spontaneous hyperactivity in Ash1l mutant mice, a new model for Tourette syndrome
- Author
-
Yuze Yan, Yinlin Ge, Hui Liang, Tao Zhu, Jingli Wang, Yanzhao Wei, Xueping Zheng, Fuli Yu, Yinglei Xu, Xu Ma, Zhongcui Jing, Xiuhai Wang, Haiyan Wang, Christian P. Schaaf, Wenmiao Liu, Chunmei Wu, Zhaochuan Yang, Yeting Zhang, Huanhuan Huang, Miaomiao Tian, Zuzhou Huang, Qinan Chen, Ru Zhang, Xue Sun, Hongzai Guan, Lanlan Zheng, Chuanyue Wang, Ni Ran, Ji-Song Guan, Yixia Guo, Hui Li, Jinchuan Xing, Shiguo Liu, Hao Deng, Mingji Yi, Xueying Feng, Mengmeng Han, Wenhan Luo, Jiani Li, Guiju Wang, Yucui Zang, Xiangrong Sun, Xuzhan Zhang, Lang Chen, Fan He, Fengyuan Che, Yi Zheng, and Hong Xie
- Subjects
Cellular and Molecular Neuroscience ,Psychiatry and Mental health ,medicine.medical_specialty ,Endocrinology ,business.industry ,Internal medicine ,Mutant ,medicine ,business ,medicine.disease ,Molecular Biology ,Tourette syndrome - Published
- 2020
21. Abstract P096: The Genetics Architecture of the Serum Metabolome
- Author
-
Zhe Wang, Bing Yu, Paul S De Vries, Elena V Feofanova, Fuli Yu, Richard A Gibbs, Alanna C Morrison, and Eric Boerwinkle
- Subjects
Physiology (medical) ,Cardiology and Cardiovascular Medicine - Abstract
Introduction: The metabolome is a collection of small molecules in a biological sample, and may serve as biomarkers or predictors of heart disease. Whole genome sequence analysis offers the opportunity to investigate rare and low-frequency annotated variants across the human genome. We used whole genome sequence analysis to characterize the genetic architecture of the serum metabolome. Methods: Whole genome sequencing and measurement (chromatography and mass spectroscopy) of 245 serum metabolites were done in 1,458 European Americans and 1,679 African Americans from the Atherosclerosis Risk in Communities (ARIC) study, and these data were used to perform a trans-ethnic meta-analysis. Common variants (MAF>5%) were analyzed individually using an additive genetic model. Rare and low-frequency protein-altering variants (MAF≤5%) were aggregated by genes. In order to determine the contribution of regulatory and non-protein coding regions of the genome, we conducted aggregate tests across the entire genome using a 4kb sliding window as well as in predefined regulatory elements, which include enhancers, promoters, and the 3’ and 5’ untranslated regions of a gene. Results: We identified 119 significant associations between genetic variants and metabolite levels (significance threshold p-10 for single variants, p-10 for aggregate tests), of which 49 were novel, including genes involved in known Mendelian conditions, protein biological processes, and disease related pathways. Six genes (DMGDH, AGA, ACY1, PRODH, DDC and CPS1) causing rare inborn errors of metabolism were associated with amino acid levels in the general population. A predicted regulatory variant in the AGA gene, encoding a protein involved in asparagine generation, was associated with serum asparagine levels independent of any coding variants in this gene.
Seven genes (ABCC2, PKD2L1, SLC10A1, FDX1, CYP3A43, UGT2B15 and SULT2A1) associated with lipid-related metabolite levels were identified, whose gene products are involved in secretion, channeling and transportation. Analysis of regulatory regions unraveled associations between three steroid lipids and a member of the cytochrome P450 family, CYP3A43. Five genes within the kinin-kallikrein pathway were identified to be related to small peptide levels, including KLKB1, KNG1, F12, ACE and CPN1. Variants in CPN1, which is known to bind to fibrinogen, were associated with DSGEGDFXAEGGGVR, a peptide which is produced during fibrinogen to fibrin conversion. Conclusion: This study outlines an approach to characterize the genetic architecture of the human serum metabolome and shows that sequence variants affect multiple human metabolites. Using the principle of Mendelian randomization, the next step is to determine whether any of these metabolites are in causal pathways to disease.
- Published
- 2017
22. Effect of incident angle on thin film growth: A molecular dynamics simulation study
- Author
-
Chao Wu, Yongzhi Cao, Fuli Yu, and Junjie Zhang
- Subjects
Materials science ,Morphology (linguistics) ,genetic structures ,Condensed matter physics ,education ,Metals and Alloys ,Surfaces and Interfaces ,Substrate (electronics) ,Crystal structure ,Epitaxy ,Microstructure ,eye diseases ,Surfaces, Coatings and Films ,Electronic, Optical and Magnetic Materials ,Molecular dynamics ,Crystallography ,Physical vapor deposition ,Materials Chemistry ,Thin film - Abstract
In the current work, we perform molecular dynamics simulations to investigate the growth of Al thin film on Cu substrate through physical vapor deposition. The effects of incident angle on the morphology and the internal microstructures of Al thin films are emphasized. Simulation results show that Al thin films grow in a layer-by-layer epitaxial growth mode under an incident energy of 0.1 eV. Further analysis of the internal microstructures demonstrates the formation of twin boundaries in Al thin films. It is found that the morphology of the island-like clusters formed in Al thin films varies significantly with incident angle. The proportions of atoms in different lattice structures depend strongly on incident angle, which consequently affects the propensity of different internal microstructures to form in Al thin films.
- Published
- 2013
23. Radiation force of a high-energy laser and its effects on second-harmonic generation
- Author
-
Ruifeng Su, Yingchun Liang, Haitao Liu, and Fuli Yu
- Subjects
Materials science ,Materials Science (miscellaneous) ,Second-harmonic imaging microscopy ,Physics::Optics ,02 engineering and technology ,01 natural sciences ,Industrial and Manufacturing Engineering ,law.invention ,010309 optics ,Stress (mechanics) ,Condensed Matter::Materials Science ,Optics ,law ,0103 physical sciences ,Light beam ,Business and International Management ,business.industry ,Momentum transfer ,Nonlinear optics ,Second-harmonic generation ,021001 nanoscience & nanotechnology ,Laser ,Reflection (physics) ,Optoelectronics ,0210 nano-technology ,business - Abstract
The radiation force of a high-energy laser caused by reflection at the input surface of a mounted KH₂PO₄ (KDP) crystal is studied, along with its effects on the second-harmonic generation (SHG) efficiency of the laser beam. A comprehensive model incorporating principles of momentum transfer, mechanics, and optics is proposed and used to study, in turn, the mechanical stress within the KDP crystal caused by the radiation force and the SHG efficiency as affected by that stress. Moreover, the effects of the intensity of the laser beam on the radiation force, the stress, and the SHG efficiency are determined. The results demonstrate that a high-energy laser beam causes a macroscopic radiation force, which in turn negatively affects SHG efficiency.
- Published
- 2017
24. Mechanical and tribological properties of Ni/Al multilayers—A molecular dynamics study
- Author
-
Fuli Yu, Yongzhi Cao, Yingchun Liang, Tao Sun, and Junjie Zhang
- Subjects
Materials science ,General Physics and Astronomy ,chemistry.chemical_element ,Surfaces and Interfaces ,General Chemistry ,Nanoindentation ,Tribology ,Condensed Matter Physics ,Indentation hardness ,Surfaces, Coatings and Films ,Condensed Matter::Materials Science ,Molecular dynamics ,Nickel ,Crystallography ,chemistry ,Nanometre ,Thin film ,Dislocation ,Composite material - Abstract
Mechanical and tribological properties of multilayers with nanometer thickness are strongly affected by interfaces formed due to mismatch of lattice parameters. In this study, molecular dynamics (MD) simulations of nanoindentation and subsequent nanoscratching processes are performed to investigate the mechanical and tribological properties of Ni/Al multilayers with a semi-coherent interface. The results show that the indentation hardness of Ni/Al multilayers is larger than that of pure Ni thin film, and that the significant strength of Ni/Al multilayers is caused by the semi-coherent interface, which acts as a barrier to the glide of dislocations during the nanoindentation process. The confinement of plastic deformation by the interface during nanoscratching on Ni/Al multilayers leads to a smaller friction coefficient than that of pure Ni thin film. Dislocation evolution, interaction between gliding dislocations and the interface, and variations of indentation hardness and friction coefficient are studied.
- Published
- 2010
25. Atomistic study of deposition process of Al thin film on Cu substrate
- Author
-
Fuli Yu, Junjie Zhang, Yongda Yan, Tao Sun, and Yongzhi Cao
- Subjects
Materials science ,General Physics and Astronomy ,Nanotechnology ,Surfaces and Interfaces ,General Chemistry ,Substrate (electronics) ,Nanoindentation ,Condensed Matter Physics ,Epitaxy ,Surfaces, Coatings and Films ,Molecular dynamics ,Indentation ,Composite material ,Thin film ,Single crystal ,Deposition (law) - Abstract
In this paper we report molecular dynamics based atomistic simulations of the deposition process of Al atoms onto Cu substrate and the subsequent nanoindentation process on that nanostructured material. Effects of incident energy on the morphology of the deposited thin film and the mechanical properties of this nanostructured material are emphasized. The results reveal that the morphology of the growing film is layer-by-layer-like at incident energies of 0.1–10 eV. The epitaxy mode of film growth is observed at incident energies below 1 eV, but the film-mixing mode commences when incident energy increases to 10 eV, accompanied by increased disorder of the film structure, which improves the quality of the deposited thin film. Subsequent indentation studies indicate that deposited thin films possess lower stiffness than single crystal Al due to the considerable number of defects present in them, but the Cu substrate is strengthened by the interface generated from lattice mismatch between the deposited Al thin film and the Cu substrate.
- Published
- 2010
26. Detecting natural selection by empirical comparison to random regions of the genome
- Author
-
Christopher A. Walsh, Hua Chen, Robert Sean Hill, David Reich, Andre A. Mignault, Alon Keinan, Russell J. Ferland, and Fuli Yu
- Subjects
Neurogenesis ,Population ,Locus (genetics) ,Single-nucleotide polymorphism ,Nerve Tissue Proteins ,Biology ,Genome ,Polymorphism, Single Nucleotide ,Receptors, G-Protein-Coupled ,03 medical and health sciences ,0302 clinical medicine ,Gene Frequency ,Genetic variation ,Genetics ,Humans ,Computer Simulation ,Allele ,Selection, Genetic ,education ,Molecular Biology ,Allele frequency ,Genetics (clinical) ,030304 developmental biology ,Adaptor Proteins, Signal Transducing ,0303 health sciences ,education.field_of_study ,Natural selection ,Models, Genetic ,Genome, Human ,Forkhead Transcription Factors ,General Medicine ,Articles ,Sequence Analysis, DNA ,Adaptor Proteins, Vesicular Transport ,Haplotypes ,Evolutionary biology ,030217 neurology & neurosurgery - Abstract
Historical episodes of natural selection can skew the frequencies of genetic variants, leaving a signature that can persist for many tens or even hundreds of thousands of years. However, formal tests for selection based on allele frequency skew require strong assumptions about demographic history and mutation, which are rarely well understood. Here, we develop an empirical approach to test for signals of selection that compares patterns of genetic variation at a candidate locus with matched random regions of the genome collected in the same way. We apply this approach to four genes that have been implicated in syndromes of impaired neurological development, comparing the pattern of variation in our re-sequencing data with a large-scale, genomic data set that provides an empirical null distribution. We confirm a previously reported signal at FOXP2, and find a novel signal of selection centered at AHI1, a gene that is involved in motor and behavior abnormalities. The locus is marked by many high frequency derived alleles in non-Africans that are of low frequency in Africans, suggesting that selection at this or a closely neighboring gene occurred in the ancestral population of non-Africans. Our study also provides a prototype for how empirical scans for ancient selection can be carried out once many genomes are sequenced.
- Published
- 2009
27. A haplotype map of the human genome
- Author
-
Mark Leppert, Aravinda Chakravarti, Charmaine D.M. Royal, Sarah S. Murray, Renzong Qiu, Panos Deloukas, Renwu Wang, David A. Hinds, Barbara E. Stranger, Xiaoli Tang, Huanming Yang, John W. Belmont, Nigel P. Carter, Huy Nguyen, William Mak, Kazuto Kato, Shiran Pasternak, Chaohua Li, Jeffrey C. Barrett, Lon R. Cardon, Vincent Ferretti, Atsushi Nagashima, Peter E. Chen, Stephen F. Schaffner, Hongbo Fu, Zhu Chen, Siqi Liu, John Burton, Paul Hardenbol, Gudmundur A. Thorisson, Yusuke Nakamura, Mark Griffiths, Imtiaz Yakub, Eiko Suda, Gonçalo R. Abecasis, Carl S. Kashuk, Qingrun Zhang, Yoshimitsu Fukushima, Karen Kennedy, Sarah E. Hunt, Yi Wang, Norio Niikawa, Ichiro Matsuda, Lynn F. Zacharia, Lalitha Krishnan, Zhen Wang, Stéphanie Roumy, C M Clee, David J. Cutler, Albert V. Smith, Lincoln Stein, Simon Myers, Jane Peterson, Jun Zhou, Yozo Ohnishi, Weihua Guan, Matthew Stephens, Xiaoyan Xiong, Julian Maller, Houcan Zhang, Pui-Yan Kwok, Mark S. Guyer, Liuda Ziaugra, Jonathan Witonsky, Matthew C. Jones, Stacey Gabriel, You-Qiang Song, Daochang An, Haifeng Wang, Gilean McVean, Lawrence M. Sung, Zhijian Yao, Yan Shen, Yangfan Liu, George M. Weinstock, Ludmila Pawlikowska, Erica Sodergren, Mark T. Ross, Andrew Boudreau, Toshihiro Tanaka, Thomas D. Willis, Weitao Hu, Kelly A. Frazer, Li Jin, Robert W. Plumb, Paul I.W. de Bakker, Hongbin Zhao, Wei Lin, Sarah Sims, Richard A. Gibbs, Maura Faggart, Michael Feolo, Dennis G. Ballinger, Xun Chu, Lucinda Fulton, Marcos Delgado, Ellen Winchester, Wei Huang, Fuli Yu, Christianne R. Bird, Shaun Purcell, Jessica Roy, Dongmei Cai, Launa M. Galver, Bartha Maria Knoppers, Emmanouil T. Dermitzakis, Gao Yang, Takashi Morizono, Rachel Barry, Kirsten McLay, Daryl J. Thomas, Steve McCarroll, Jonathan Marchini, Daniel J. Richter, Andy Peiffer, Patricia Taillon-Miller, Richard K. Wilson, Stephen Kwok-Wing Tsui, Jian-Bing Fan, Lisa D. Brooks, Laura L. Stuve, Paul L'Archevêque, David M. 
Evans, Clémentine Sallée, Peter Donnelly, Hong Xue, Hui Zhao, Charles N. Rotimi, Jean E. McEwen, J. Tze Fei Wong, Hao Pan, Alastair Kent, Brendan Blumenstiel, Qing Li, Weiwei Sun, L. Kang, Colin Freeman, John Stewart, Chibuzor Nkwodimmah, Morris W. Foster, Don Powell, Leonardo Bottolo, Raymond D. Miller, Stephen T. Sherry, Francis S. Collins, Donna M. Muzny, Jun Yu, Ike Ajayi, Hua Han, Pardis C. Sabeti, Hongguang Wang, Takahisa Kawaguchi, Tatsuhiko Tsunoda, Guy Bellemare, Zhaohui S. Qin, H. B. Hu, Jane Rogers, Thomas J. Hudson, Mark J. Daly, Andrew P. Morris, Supriya Gupta, Ming Xiao, Patrick Varilly, Nick Patterson, Akihiro Sekine, Chris C. A. Spencer, Jonathan Morrison, Missy Dixon, Paul K.H. Tam, Jian Wang, Matthew Defelice, Susana Eyheramendy, Michael Shi, Yungang He, Ellen Wright Clayton, Richa Saxena, Heather M. Munro, Arthur L. Holden, Yayun Shen, Christine P. Bird, Bruce W. Birren, Itsik Pe'er, David R. Bentley, Lynne V. Nazareth, Pamela Whittaker, Pak C. Sham, Amy L. Camargo, David A. Wheeler, Koji Saeki, Martin Godbout, David Altshuler, Liang Xu, Ying Wang, David Willey, Alexandre Montpetit, Shin Lin, Michael S. Phillips, Changqing Zeng, Clement Adebamowo, John C. Wallenburg, Mark S. Chee, Ben Fry, Erich Stahl, Melissa Parkin, Rhian Gwilliam, Andrei Verner, Patrick J. Nailer, Lap-Chee Tsui, Bo Zhang, Fanny Chagnon, David R. Cox, Jack Spiegel, Jamie Moore, Vivian Ota Wang, Patricia A. Marshall, Takuya Kitamoto, Bruce S. Weir, Darryl Macer, Geraldine M. Clarke, Robert C. Onofrio, Mary M.Y. Waye, Wei Wang, Suzanne M. Leal, James C. Mullikin, Toyin Aniagwu, Daniel C. Koboldt, Mary Goyette, Martin Leboeuf, Isaac F. Adewole, Ruth Jamieson, Arnold Oliphant, Jessica Watkin, and Jean François Olivier
- Subjects
Linkage disequilibrium ,Biology ,DNA, Mitochondrial ,Polymorphism, Single Nucleotide ,Article ,Linkage Disequilibrium ,Structural variation ,Gene Frequency ,Humans ,Selection, Genetic ,International HapMap Project ,Genetic association ,Haplotypes - genetics ,Recombination, Genetic ,Genetics ,Chromosomes, Human, Y ,Multidisciplinary ,Genome, Human ,DNA, Mitochondrial - genetics ,Haplotype ,Tag SNP ,Polymorphism, Single Nucleotide - genetics ,Haplotypes ,Human genome ,Haplotype estimation ,Chromosomes, Human, Y - genetics - Abstract
Inherited genetic variation has a critical but as yet largely uncharacterized role in human disease. Here we report a public database of common variation in the human genome: more than one million single nucleotide polymorphisms (SNPs) for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted. These data document the generality of recombination hotspots, a block-like structure of linkage disequilibrium and low haplotype diversity, leading to substantial correlations of SNPs with many of their neighbours. We show how the HapMap resource can guide the design and analysis of genetic association studies, shed light on structural variation and recombination, and identify loci that may have been subject to natural selection during human evolution. © 2005 Nature Publishing Group.
- Published
- 2005
28. Identification of genetic risk variants for deep vein thrombosis by multiplexed next-generation sequencing of 186 hemostatic/pro-inflammatory genes
- Author
-
Pier Mannuccio Mannucci, Donna M. Muzny, Ida Martinelli, Humeira Akbar, Dario Consonni, Marzia Menegatti, Yuanqing Wu, Luca A. Lotta, Emanuela Pappalardo, Steven E. Scherer, Lora L Lewis, Fuli Yu, Matthew N. Bainbridge, Richard A. Gibbs, Serena M. Passamonti, Flora Peyvandi, Jin Yu, and Mark Wang
- Subjects
Nonsynonymous substitution ,Adult ,Male ,Candidate gene ,lcsh:Internal medicine ,lcsh:QH426-470 ,venous thromboembolism ,Pilot Projects ,rs6025 ,Biology ,DNA sequencing ,FGA ,Deep vein thrombosis ,Genetic variation ,Genetic predisposition ,Genetics ,Humans ,Genomic library ,Genetic Predisposition to Disease ,Genetics(clinical) ,lcsh:RC31-1245 ,Gene ,Genetics (clinical) ,Genetic association ,Inflammation ,Venous Thrombosis ,Hemostasis ,multiplexing ,target capture ,Genetic Variation ,High-Throughput Nucleotide Sequencing ,heamostateome ,Genomics ,Sequence Analysis, DNA ,lcsh:Genetics ,Case-Control Studies ,Female ,next-generation sequencing ,VTE ,DVT ,Research Article - Abstract
Background: Next-generation DNA sequencing is opening new avenues for genetic association studies in common diseases that, like deep vein thrombosis (DVT), have a strong genetic predisposition still largely unexplained by currently identified risk variants. In order to develop sequencing and analytical pipelines for the application of next-generation sequencing to complex diseases, we conducted a pilot study sequencing the coding area of 186 hemostatic/proinflammatory genes in 10 Italian cases of idiopathic DVT and 12 healthy controls. Results: A molecular-barcoding strategy was used to multiplex DNA target capture and sequencing, while retaining individual sequence information. Genomic libraries with barcode sequence-tags were pooled (in pools of 8 or 16 samples) and enriched for target DNA sequences. Sequencing was performed on ABI SOLiD-4 platforms. We produced > 12 gigabases of raw sequence data to sequence at high coverage (average: 42X) the 700-kilobase target area in 22 individuals. A total of 1876 high-quality genetic variants were identified (1778 single nucleotide substitutions and 98 insertions/deletions). Annotation on databases of genetic variation and human disease mutations revealed several novel, potentially deleterious mutations. We tested 576 common variants in a case-control association analysis, carrying the top-5 associations over to replication in up to 719 DVT cases and 719 controls. We also conducted an analysis of the burden of nonsynonymous variants in coagulation factor and anticoagulant genes. We found an excess of rare missense mutations in anticoagulant genes in DVT cases compared to controls and an association for a missense polymorphism of FGA (rs6050; p = 1.9 × 10⁻⁵, OR 1.45; 95% CI, 1.22-1.72; after replication in > 1400 individuals). Conclusions: We implemented a barcode-based strategy to efficiently multiplex sequencing of hundreds of candidate genes in several individuals. 
In the relatively small dataset of our pilot study we were able to identify bona fide associations with DVT. Our study illustrates the potential of next-generation sequencing for the discovery of genetic variation predisposing to complex diseases.
- Published
- 2012
29. An integrative variant analysis suite for whole exome next-generation sequencing data
- Author
-
Danny Challis, Uday S. Evani, Andrew R. Jackson, Sameer Paithankar, Fuli Yu, Aleksandar Milosavljevic, Richard A. Gibbs, Jin Yu, and Cristian Coarfa
- Subjects
Cancer genome sequencing ,Genomics ,Computational biology ,Biology ,lcsh:Computer applications to medicine. Medical informatics ,Polymorphism, Single Nucleotide ,Biochemistry ,DNA sequencing ,03 medical and health sciences ,Open Reading Frames ,0302 clinical medicine ,INDEL Mutation ,Structural Biology ,Humans ,Exome ,1000 Genomes Project ,Molecular Biology ,lcsh:QH301-705.5 ,Exome sequencing ,030304 developmental biology ,Genetics ,0303 health sciences ,Genome, Human ,Applied Mathematics ,Suite ,DNA sequencing theory ,High-Throughput Nucleotide Sequencing ,Computer Science Applications ,lcsh:Biology (General) ,030220 oncology & carcinogenesis ,lcsh:R858-859.7 ,Software - Abstract
Background: Whole exome capture sequencing allows researchers to cost-effectively sequence the coding regions of the genome. Although the exome capture sequencing methods have become routine and well established, there is currently a lack of tools specialized for variant calling in this type of data. Results: Using statistical models trained on validated whole-exome capture sequencing data, the Atlas2 Suite is an integrative variant analysis pipeline optimized for variant discovery on all three of the widely used next generation sequencing platforms (SOLiD, Illumina, and Roche 454). The suite employs logistic regression models in conjunction with user-adjustable cutoffs to accurately separate true SNPs and INDELs from sequencing and mapping errors with high sensitivity (96.7%). Conclusion: We have implemented the Atlas2 Suite and applied it to 92 whole exome samples from the 1000 Genomes Project. The Atlas2 Suite is available for download at http://sourceforge.net/projects/atlas2/. In addition to a command line version, the suite has been integrated into the Genboree Workbench, allowing biomedical scientists with minimal informatics expertise to remotely call, view, and further analyze variants through a simple web interface. The existing genomic databases displayed via the Genboree browser also streamline the process from variant discovery to functional genomics analysis, resulting in an off-the-shelf toolkit for the broader community.
- Published
- 2012
30. Extremely low-coverage whole genome sequencing in South Asians captures population genomics information.
- Author
-
Rustagi, Navin, Zhou, Anbo, Scott Watkins, W., Gedvilaite, Erika, Shuoguo Wang, Ramesh, Naveen, Muzny, Donna, Gibbs, Richard A., Jorde, Lynn B., Fuli Yu, and Jinchuan Xing
- Subjects
NUCLEOTIDE sequencing ,SIMULATION methods & models ,GENE frequency ,MITOCHONDRIAL DNA - Abstract
Background: The cost of Whole Genome Sequencing (WGS) has decreased tremendously in recent years due to advances in next-generation sequencing technologies. Nevertheless, the cost of carrying out large-scale cohort studies using WGS is still daunting. Past simulation studies with coverage at ~2x have shown promise for using low coverage WGS in studies focused on variant discovery, association study replications, and population genomics characterization. However, the performance of low coverage WGS in populations with a complex history and no reference panel remains to be determined. Results: South Indian populations are known to have a complex population structure and are an example of a major population group that lacks adequate reference panels. To test the performance of extremely low-coverage WGS (EXL-WGS) in populations with a complex history and to provide a reference resource for South Indian populations, we performed EXL-WGS on 185 South Indian individuals from eight populations to ~1.6x coverage. Using two variant discovery pipelines, SNPTools and GATK, we generated a consensus call set that has ~90% sensitivity for identifying common variants (minor allele frequency ≥ 10%). Imputation further improves the sensitivity of our call set. In addition, we obtained high-coverage for the whole mitochondrial genome to infer the maternal lineage evolutionary history of the Indian samples. Conclusions: Overall, we demonstrate that EXL-WGS with imputation can be a valuable study design for variant discovery with a dramatically lower cost than standard WGS, even in populations with a complex history and without available reference data. In addition, the South Indian EXL-WGS data generated in this study will provide a valuable resource for future Indian genomic studies. [ABSTRACT FROM AUTHOR]
- Published
- 2017
31. PTPN11 mutations in Noonan syndrome type I: detection of recurrent mutations in exons 3 and 13
- Author
-
M. Maheshwari, Laura Molinari, Susan D. Fernbach, William J. Craigen, Fuli Yu, Jeffrey A. Towbin, Imtiaz Yakub, Richard A. Gibbs, A. Combes, T. Ho, and John W. Belmont
- Subjects
musculoskeletal diseases ,Male ,congenital, hereditary, and neonatal diseases and abnormalities ,SH2 Domain-Containing Protein Tyrosine Phosphatases ,DNA Mutational Analysis ,Protein Tyrosine Phosphatase, Non-Receptor Type 11 ,Biology ,medicine.disease_cause ,src Homology Domains ,Exon ,Locus heterogeneity ,Recurrence ,Catalytic Domain ,Genetics ,medicine ,Missense mutation ,Humans ,Allele ,skin and connective tissue diseases ,Protein Structure, Quaternary ,Genetics (clinical) ,Mutation ,Noonan Syndrome ,Intracellular Signaling Peptides and Proteins ,Exons ,medicine.disease ,PTPN11 ,Isoenzymes ,Phenotype ,Pulmonary valve stenosis ,Noonan syndrome ,Female ,Protein Tyrosine Phosphatases - Abstract
We surveyed 16 subjects with the clinical diagnosis of Noonan Syndrome (NS1) from 12 families and their relevant family members for mutations in PTPN11/SHP2 using direct DNA sequencing. We found three different mutations among five families. Two unrelated subjects shared the same de novo missense substitution in exon 13 (S502T); an additional two unrelated families had a mutation in exon 3 (Y63C); and one subject had the amino acid substitution Y62D, also in exon 3. None of the three mutations were present in ethnically matched controls. In the mature protein model, the exon 3 mutants and the exon 13 mutant amino acids cluster at the interface between the N' SH2 domain and the phosphatase catalytic domain. Six of eight subjects with PTPN11/SHP2 mutations had pulmonary valve stenosis while no mutations were identified in those subjects (N = 4) with hypertrophic cardiomyopathy. An additional four subjects with possible Noonan syndrome were evaluated, but no mutations in PTPN11/SHP2 were identified. These results confirm that mutations in PTPN11/SHP2 underlie a common form of Noonan syndrome, and that the disease exhibits both allelic and locus heterogeneity. The observation of recurrent mutations supports the hypothesis that a special class of gain-of-function mutations in SHP2 give rise to Noonan syndrome.
- Published
- 2002
32. A hybrid computational strategy to address WGS variant analysis in >5000 samples.
- Author
-
Zhuoyi Huang, Rustagi, Navin, Veeraraghavan, Narayanan, Carroll, Andrew, Gibbs, Richard, Boerwinkle, Eric, Venkata, Manjunath Gorentla, and Fuli Yu
- Subjects
NUCLEOTIDE sequencing ,COMPUTERS in biology ,CLOUD computing ,SUPERCOMPUTERS ,BIG data ,HUMAN genome - Abstract
Background: The decreasing costs of sequencing are driving the need for cost effective and real time variant calling of whole genome sequencing data. The scale of these projects is far beyond the capacity of typical computing resources available to most research labs. Other infrastructures like the cloud AWS environment and supercomputers also have limitations that make large scale joint variant calling infeasible, and infrastructure specific variant calling strategies either fail to scale up to large datasets or abandon joint calling strategies. Results: We present a high throughput framework including multiple variant callers for single nucleotide variant (SNV) calling, which leverages hybrid computing infrastructure consisting of cloud AWS, supercomputers and local high performance computing infrastructures. We present a novel binning approach for large scale joint variant calling and imputation which can scale up to over 10,000 samples while producing SNV callsets with high sensitivity and specificity. As a proof of principle, we present results of analysis on the Cohorts for Heart And Aging Research in Genomic Epidemiology (CHARGE) WGS freeze 3 dataset, in which joint calling, imputation and phasing of over 5300 whole genome samples were completed in under 6 weeks using four state-of-the-art callers. The callers used were SNPTools, GATK-HaplotypeCaller, GATK-UnifiedGenotyper and GotCloud. We used Amazon AWS, a 4000-core in-house cluster at Baylor College of Medicine, IBM power PC Blue BioU at Rice and Rhea at Oak Ridge National Laboratory (ORNL) for the computation. AWS was used for joint calling of 180 TB of BAM files, and ORNL and Rice supercomputers were used for the imputation and phasing step. All other steps were carried out on the local compute cluster. The entire operation used 5.2 million core hours and only transferred a total of 6 TB of data across the platforms. 
Conclusions: Even with increasing sizes of whole genome datasets, ensemble joint calling of SNVs for low coverage data can be accomplished in a scalable, cost effective and fast manner by using heterogeneous computing platforms without compromising on the quality of variants. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
33. Self-forming TiBN Nanocomposite Multilayer Coating Prepared by Pulse Cathode Arc Method.
- Author
-
Yongzhi Cao, Zhenjiang Hu, Leilei Yan, Fuli Yu, and Wendi Tu
- Subjects
NANOSTRUCTURED materials ,TITANIUM compounds ,COATING processes ,SILICON compounds ,TEMPERATURE effect - Abstract
Novel multilayer structured TiBN coatings were deposited on Si (100) substrate using TiBN complex cathode plasma immersion ion implantation and deposition technique (PIIID). The coatings were characterized by X-ray diffraction (XRD), high-resolution transmission electron microscopy (HRTEM), energy-dispersive spectrometer (EDS) and ball-on-disk test. XRD results reveal that both samples of TiBN coatings have the main diffraction peak of TiN (200) and (220). Cross-section TEM images reveal that these coatings have the character of self-forming multilayer and consist of face-centered cubic TiN and hexagonal BN nanocrystalline embedded in amorphous matrix. Because of the existence of hexagonal BN, the friction coefficient of the new TiBN coating at room temperature is obviously lower than that of the monolithic TiN nanocrystalline coating. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
34. PacBio-LITS: a large-insert targeted sequencing method for characterization of human disease-associated chromosomal structural variations.
- Author
-
Min Wang, Beck, Christine R., English, Adam C., Qingchang Meng, Buhay, Christian, Yi Han, Doddapaneni, Harsha V., Fuli Yu, Boerwinkle, Eric, Lupski, James R., Muzny, Donna M., and Gibbs, Richard A.
- Subjects
NUCLEOTIDE sequencing ,CHROMOSOME structure ,GENE targeting ,GENETIC disorder treatment ,MICROSATELLITE repeats - Abstract
Background: Generation of long (>5 Kb) DNA sequencing reads provides an approach for interrogation of complex regions in the human genome. Currently, large-insert whole genome sequencing (WGS) technologies from Pacific Biosciences (PacBio) enable analysis of chromosomal structural variations (SVs), but the cost to achieve the required sequence coverage across the entire human genome is high. Results: We developed a method (termed PacBio-LITS) that combines oligonucleotide-based DNA target-capture enrichment technologies with PacBio large-insert library preparation to facilitate SV studies at specific chromosomal regions. PacBio-LITS provides deep sequence coverage at the specified sites at substantially reduced cost compared with PacBio WGS. The efficacy of PacBio-LITS is illustrated by delineating the breakpoint junctions of low copy repeat (LCR)-associated complex structural rearrangements on chr17p11.2 in patients diagnosed with Potocki-Lupski syndrome (PTLS; MIM#610883). We successfully identified previously determined breakpoint junctions in three PTLS cases, and also were able to discover novel junctions in repetitive sequences, including LCR-mediated breakpoints. The new information has enabled us to propose mechanisms for formation of these structural variants. Conclusions: The new method leverages the cost efficiency of targeted capture-sequencing as well as the mappability and scaffolding capabilities of long sequencing reads generated by the PacBio platform. It is therefore suitable for studying complex SVs, especially those involving LCRs, inversions, and the generation of chimeric Alu elements at the breakpoints. Other genomic research applications, such as haplotype phasing and small insertion and deletion validation could also benefit from this technology. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
35. The distribution and mutagenesis of short coding INDELs from 1,128 whole exomes.
- Author
-
Challis, Danny, Antunes, Lilian, Garrison, Erik, Banks, Eric, Evani, Uday S., Muzny, Donna, Poplin, Ryan, Gibbs, Richard A., Marth, Gabor, and Fuli Yu
- Subjects
MUTAGENESIS ,GENOMES ,FALSE discovery rate ,GENE frequency ,MACHINE learning - Abstract
Background: Identifying insertion/deletion polymorphisms (INDELs) with high confidence has been intrinsically challenging in short-read sequencing data. Here we report our approach for improving INDEL calling accuracy by using a machine learning algorithm to combine call sets generated with three independent methods, and by leveraging the strengths of each individual pipeline. Utilizing this approach, we generated a consensus exome INDEL call set from a large dataset generated by the 1000 Genomes Project (1000G), maximizing both the sensitivity and the specificity of the calls. Results: This consensus exome INDEL call set features 7,210 INDELs, from 1,128 individuals across 13 populations included in the 1000 Genomes Phase 1 dataset, with a false discovery rate (FDR) of about 7.0%. Conclusions: In our study we further characterize the patterns and distributions of these exonic INDELs with respect to density, allele length, and site frequency spectrum, as well as the potential mutagenic mechanisms of coding INDELs in humans. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
36. Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline.
- Author
-
Reid, Jeffrey G., Carroll, Andrew, Veeraraghavan, Narayanan, Dahdouli, Mahmoud, Sundquist, Andreas, English, Adam, Bainbridge, Matthew, White, Simon, Salerno, William, Buhay, Christian, Fuli Yu, Muzny, Donna, Daly, Richard, Duyk, Geoff, Gibbs, Richard A., and Boerwinkle, Eric
- Subjects
NUCLEOTIDE sequence ,BIG data ,MOLECULAR genetics ,COMPUTATIONAL biology ,WORKFLOW software ,CLOUD computing ,WEB services - Abstract
Background: Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. Results: To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. Conclusions: By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples. [ABSTRACT FROM AUTHOR]
- Published
- 2014
- Full Text
- View/download PDF
37. The 1000 Genomes Project: paving the way for personalized genomic medicine.
- Author
-
Gibson, Ian B., Rongcai Jiang, and Fuli Yu
- Published
- 2013
- Full Text
- View/download PDF
38. Translational signatures and mRNA levels are highly correlated in human stably expressed genes.
- Author
-
Line, Sergio R. P., Xiaoming Liu, De Souza, Ana Paula, and Fuli Yu
- Subjects
MESSENGER RNA ,GENE expression ,TRANSFER RNA ,GENETIC regulation ,CRYOBIOLOGY - Abstract
Background: Gene expression is one of the most relevant biological processes of living cells. Due to the relatively small population sizes, it is predicted that human gene sequences are not strongly influenced by selection towards expression efficiency. One of the major problems in estimating to what extent gene characteristics can be selected to maximize expression efficiency is the wide variation that exists in RNA and protein levels among physiological states and different tissues. Analyses of datasets of stably expressed genes (i.e. with consistent expression between physiological states and tissues) would provide more accurate and reliable measurements of associations between variations of a specific gene characteristic and expression, and how distinct gene features work to optimize gene expression. Results: Using a dataset of human genes with consistent expression between physiological states we selected gene sequence signatures related to translation that can predict about 42% of mRNA variation. The prediction can be increased to 51% when selecting genes that are stably expressed in more than 1 tissue. These genes are enriched for translation and ribosome biosynthesis processes and have higher translation efficiency scores, smaller coding sequences and 3' UTR sizes and lower folding energies when compared to other datasets. Additionally, the amino acid frequencies weighted by expression showed higher correlations with isoacceptor tRNA gene copy number, and smaller absolute correlation values with biosynthetic costs. Conclusion: Our results indicate that human gene sequence characteristics related to transcription and translation processes can co-evolve in an integrated manner in order to optimize gene expression. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
39. Characterizing linkage disequilibrium and evaluating imputation power of human genomic insertion-deletion polymorphisms.
- Author
-
Lu, James T., Yi Wang, Gibbs, Richard A., and Fuli Yu
- Published
- 2012
- Full Text
- View/download PDF
40. Atlas2 Cloud: a framework for personal genome analysis in the cloud.
- Author
-
Evani, Uday S., Challis, Danny, Jin Yu, Jackson, Andrew R., Paithankar, Sameer, Bainbridge, Matthew N., Jakkamsetti, Adinarayana, Pham, Peter, Coarfa, Cristian, Milosavljevic, Aleksandar, and Fuli Yu
- Subjects
GENOMES ,GENOMICS ,CLOUD computing ,WEB services ,CLOUD storage - Abstract
Background: Until recently, sequencing has primarily been carried out in large genome centers which have invested heavily in developing the computational infrastructure that enables genomic sequence analysis. The recent advancements in next generation sequencing (NGS) have led to a wide dissemination of sequencing technologies and data, to highly diverse research groups. It is expected that clinical sequencing will become part of diagnostic routines shortly. However, limited accessibility to computational infrastructure and high quality bioinformatic tools, and the demand for personnel skilled in data analysis and interpretation remain a serious bottleneck. To this end, the cloud computing and Software-as-a-Service (SaaS) technologies can help address these issues. Results: We successfully enabled the Atlas2 Cloud pipeline for personal genome analysis on two different cloud service platforms: a community cloud via the Genboree Workbench, and a commercial cloud via the Amazon Web Services using Software-as-a-Service model. We report a case study of personal genome analysis using our Atlas2 Genboree pipeline. We also outline a detailed cost structure for running Atlas2 Amazon on whole exome capture data, providing cost projections in terms of storage, compute and I/O when running Atlas2 Amazon on a large data set. Conclusions: We find that providing a web interface and an optimized pipeline clearly facilitates usage of cloud computing for personal genome analysis, but for it to be routinely used for large scale projects there needs to be a paradigm shift in the way we develop tools, in standard operating procedures, and in funding mechanisms. [ABSTRACT FROM AUTHOR]
- Published
- 2012
- Full Text
- View/download PDF
41. The functional spectrum of low-frequency coding variation.
- Author
-
Marth, Gabor T., Fuli Yu, Indap, Amit R., Garimella, Kiran, Gravel, Simon, Leong, Wen Fung, Tyler-Smith, Chris, Bainbridge, Matthew, Blackwell, Tom, Zheng-Bradley, Xiangqun, Chen, Yuan, Challis, Danny, Clarke, Laura, Ball, Edward V., Cibulskis, Kristian, Cooper, David N., Fulton, Bob, Hartl, Chris, Koboldt, Dan, and Muzny, Donna
- Published
- 2011
- Full Text
- View/download PDF
42. Demographic history and rare allele sharing among human populations.
- Author
-
Gravel, Simon, Henn, Brenna M., Gutenkunst, Ryan N., Indap, Amit R., Marth, Gabor T., Clark, Andrew G., Fuli Yu, Gibbs, Richard A., and Bustamante, Carlos D.
- Subjects
HUMAN genome ,GENOMICS ,POPULATION genetics ,HUMAN evolution - Abstract
High-throughput sequencing technology enables population-level surveys of human genomic variation. Here, we examine the joint allele frequency distributions across continental human populations and present an approach for combining complementary aspects of whole-genome, low-coverage data and targeted high-coverage data. We apply this approach to data generated by the pilot phase of the Thousand Genomes Project, including whole-genome 2-4× coverage data for 179 samples from HapMap European, Asian, and African panels as well as high-coverage target sequencing of the exons of 800 genes from 697 individuals in seven populations. We use the site frequency spectra obtained from these data to infer demographic parameters for an Out-of-Africa model for populations of African, European, and Asian descent and to predict, by a jackknife-based approach, the amount of genetic diversity that will be discovered as sample sizes are increased. We predict that the number of discovered nonsynonymous coding variants will reach 100,000 in each population after ∼1,000 sequenced chromosomes per population, whereas ∼2,500 chromosomes will be needed for the same number of synonymous variants. Beyond this point, the number of segregating sites in the European and Asian panel populations is expected to overcome that of the African panel because of faster recent population growth. Overall, we find that the majority of human genomic variable sites are rare and exhibit little sharing among diverged populations. Our results emphasize that replication of disease association for specific rare genetic variants across diverged populations must overcome both reduced statistical power because of rarity and higher population divergence. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
43. Characterization of single-nucleotide variation in Indian-origin rhesus macaques (Macaca mulatta).
- Author
-
Fawcett, Gloria L., Raveendran, Muthuswamy, Rio Deiros, David, Chen, David, Fuli Yu, Harris, Ronald Alan, Ren, Yanru, Muzny, Donna M., Reid, Jeffrey G., Wheeler, David A., Worley, Kimberly C., Shelton, Steven E., Kalin, Ned H., Milosavljevic, Aleksandar, Gibbs, Richard, and Rogers, Jeffrey
- Subjects
NUCLEOTIDES ,GENETIC polymorphisms ,RHESUS monkeys ,GENETICS ,GENOMES - Abstract
Background: Rhesus macaques are the most widely utilized nonhuman primate model in biomedical research. Previous efforts have validated fewer than 900 single nucleotide polymorphisms (SNPs) in this species, which limits opportunities for genetic studies related to health and disease. Extensive information about SNPs and other genetic variation in rhesus macaques would facilitate valuable genetic analyses, as well as provide markers for genome-wide linkage analysis and the genetic management of captive breeding colonies. Results: We used the available rhesus macaque draft genome sequence, new sequence data from unrelated individuals and existing published sequence data to create a genome-wide SNP resource for Indian-origin rhesus monkeys. The original reference animal and two additional Indian-origin individuals were resequenced to low coverage using SOLiD™ sequencing. We then used three strategies to validate SNPs: comparison of potential SNPs found in the same individual using two different sequencing chemistries, and comparison of potential SNPs in different individuals identified with either the same or different sequencing chemistries. Our approach validated approximately 3 million SNPs distributed across the genome. Preliminary analysis of SNP annotations suggests that a substantial number of these macaque SNPs may have functional effects. More than 700 non-synonymous SNPs were scored by Polyphen-2 as either possibly or probably damaging to protein function and these variants now constitute potential models for studying functional genetic variation relevant to human physiology and disease. Conclusions: Resequencing of a small number of animals identified greater than 3 million SNPs. This provides a significant new information resource for rhesus macaques, an important research animal. The data also suggests that overall genetic variation is high in this species. 
We identified many potentially damaging non-synonymous coding SNPs, providing new opportunities to identify rhesus models for human disease. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
44. Pash 3.0: A versatile software package for read mapping and integrative analysis of genomic and epigenomic variation using massively parallel DNA sequencing.
- Author
-
Coarfa, Cristian, Fuli Yu, Miller, Christopher A., Zuozhou Chen, Harris, R. Alan, and Milosavljevic, Aleksandar
- Subjects
NUCLEOTIDE sequence ,COMPUTER software ,GENOMICS - Abstract
Background: Massively parallel sequencing readouts of epigenomic assays are enabling integrative genome-wide analyses of genomic and epigenomic variation. Pash 3.0 performs sequence comparison and read mapping and can be employed as a module within diverse configurable analysis pipelines, including ChIP-Seq and methylome mapping by whole-genome bisulfite sequencing. Results: Pash 3.0 generally matches the accuracy and speed of niche programs for fast mapping of short reads, and exceeds their performance on longer reads generated by a new generation of massively parallel sequencing technologies. By exploiting longer read lengths, Pash 3.0 maps reads onto the large fraction of genomic DNA that contains repetitive elements and polymorphic sites, including indel polymorphisms. Conclusions: We demonstrate the versatility of Pash 3.0 by analyzing the interaction between CpG methylation, CpG SNPs, and imprinting based on publicly available whole-genome shotgun bisulfite sequencing data. Pash 3.0 makes use of gapped k-mer alignment, a non-seed based comparison method, which is implemented using multipositional hash tables. This allows Pash 3.0 to run on diverse hardware platforms, including individual computers with standard RAM capacity, multi-core hardware architectures and large clusters. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
45. Highly multiplexed molecular inversion probe genotyping: Over 10,000 targeted SNPs genotyped in a single tube assay.
- Author
-
Hardenbol, Paul, Fuli Yu, Belmont, John, MacKenzie, Jennifer, Bruckner, Carsten, Brundage, Tiffany, Boudreau, Andrew, Chow, Steve, Eberle, Jim, Erbilgin, Ayca, Falkowski, Mat, Fitzgerald, Ron, Sy Ghose, Iartchouk, Oleg, Jain, Maneesh, Karlin-Neumann, George, Xiuhua Lu, Xin Miao, Moore, Bridget, and Moorhead, Martin
- Subjects
GENETICS ,GENE mapping ,GENOMES ,GENOMICS ,GENETIC polymorphisms ,GENETIC mutation - Abstract
Large-scale genetic studies are highly dependent on efficient and scalable multiplex SNP assays. In this study, we report the development of Molecular Inversion Probe technology with four-color, single array detection, applied to large-scale genotyping of up to 12,000 SNPs per reaction. While generating 38,429 SNP assays using this technology in a population of 30 trios from the Centre d'Etude Polymorphisme Humain family panel as part of the International HapMap project, we established SNP conversion rates of ∼90% with concordance rates >99.6% and completeness levels >98% for assays multiplexed up to 12,000plex levels. Furthermore, these individual metrics can be "traded off" and, by sacrificing a small fraction of the conversion rate, the accuracy can be increased to very high levels. No loss of performance is seen when scaling from 6,000plex to 12,000plex assays, strongly validating the ability of the technology to suppress cross-reactivity at high multiplex levels. The results of this study demonstrate the suitability of this technology for comprehensive association studies that use targeted SNPs in indirect linkage disequilibrium studies or that directly screen for causative mutations. [ABSTRACT FROM AUTHOR]
- Published
- 2005
- Full Text
- View/download PDF
46. A Genomewide Admixture Map for Latino Populations
- Author
-
Laura Riba, Alberto Villegas, Carlos A. Aguilar-Salinas, Gavin J. McDonald, Arti Tandon, Andres Ruiz-Linares, Alicja Waliszewska, Samuel Canizales-Quinteros, Gabriel Bedoya, Christopher A. Haiman, Cheryl A. Winkler, Francisco M. Salzano, Carla Gallo, Marcela K. Tello-Ruiz, Christine Schirmer, Nick Patterson, Fuli Yu, Julie Neubauer, David Cox, Maria Cátira Bortolini, Constanza Duque, Guido Mazzotti, Brian E. Henderson, Alkes L. Price, Teresa Tusié-Luna, Marta Menjivar, David Reich, and William Klitz
- Subjects
Linkage disequilibrium ,0302 clinical medicine ,Databases, Genetic ,Chromosomes, Human ,Genetics (clinical) ,African Continental Ancestry Group ,Genetics ,0303 health sciences ,education.field_of_study ,medicine.diagnostic_test ,Chromosome Mapping ,Hispanic or Latino ,Genetic Markers ,Population ,Genetic admixture ,Black People ,Locus (genetics) ,Biology ,Article ,White People ,03 medical and health sciences ,Gene mapping ,medicine ,Humans ,Computer Simulation ,Genetic Predisposition to Disease ,Genetic Testing ,education ,Alleles ,030304 developmental biology ,Genetic testing ,Genetic association ,Genome, Human ,Reproducibility of Results ,Genetics, Population ,Genetic marker ,Case-Control Studies ,Indians, North American ,030217 neurology & neurosurgery - Abstract
Admixture mapping is an economical and powerful approach for localizing disease genes in populations of recently mixed ancestry and has proven successful in African Americans. The method holds equal promise for Latinos, who typically inherit a mix of European, Native American, and African ancestry. However, admixture mapping in Latinos has not been practical because of the lack of a map of ancestry-informative markers validated in Native American and other populations. To address this, we screened multiple databases, containing millions of markers, to identify 4,186 markers that were putatively informative for determining the ancestry of chromosomal segments in Latino populations. We experimentally validated each of these markers in at least 232 new Latino, European, Native American, and African samples, and we selected a subset of 1,649 markers to form an admixture map. An advantage of our strategy is that we focused our map on markers distinguishing Native American from other ancestries and restricted it to markers with very similar frequencies in Europeans and Africans, which decreased the number of markers needed and minimized the possibility of false disease associations. We evaluated the effectiveness of our map for localizing disease genes in four Latino populations from both North and South America.
- Full Text
- View/download PDF
47. PacBio-LITS: a large-insert targeted sequencing method for characterization of human disease-associated chromosomal structural variations
- Author
-
Yi Han, James R. Lupski, Donna M. Muzny, Eric Boerwinkle, Harsha Doddapaneni, Fuli Yu, Christine R. Beck, Richard A. Gibbs, Min Wang, Qingchang Meng, Adam C. English, and Christian J. Buhay
- Subjects
Chromosome Aberrations ,Gene Rearrangement ,Genetics ,Cancer genome sequencing ,Whole genome sequencing ,Shotgun sequencing ,Methodology Article ,High-Throughput Nucleotide Sequencing ,Hybrid genome assembly ,Genomics ,Genome project ,Biology ,Deep sequencing ,Single molecule sequencing ,Workflow ,Targeted sequencing ,Humans ,Complex genomic rearrangement ,Genetic Association Studies ,Exome sequencing ,Gene Library ,Biotechnology - Abstract
Background: Generation of long (>5 Kb) DNA sequencing reads provides an approach for interrogation of complex regions in the human genome. Currently, large-insert whole genome sequencing (WGS) technologies from Pacific Biosciences (PacBio) enable analysis of chromosomal structural variations (SVs), but the cost to achieve the required sequence coverage across the entire human genome is high. Results: We developed a method (termed PacBio-LITS) that combines oligonucleotide-based DNA target-capture enrichment technologies with PacBio large-insert library preparation to facilitate SV studies at specific chromosomal regions. PacBio-LITS provides deep sequence coverage at the specified sites at substantially reduced cost compared with PacBio WGS. The efficacy of PacBio-LITS is illustrated by delineating the breakpoint junctions of low copy repeat (LCR)-associated complex structural rearrangements on chr17p11.2 in patients diagnosed with Potocki–Lupski syndrome (PTLS; MIM#610883). We successfully identified previously determined breakpoint junctions in three PTLS cases, and also were able to discover novel junctions in repetitive sequences, including LCR-mediated breakpoints. The new information has enabled us to propose mechanisms for formation of these structural variants. Conclusions: The new method leverages the cost efficiency of targeted capture-sequencing as well as the mappability and scaffolding capabilities of long sequencing reads generated by the PacBio platform. It is therefore suitable for studying complex SVs, especially those involving LCRs, inversions, and the generation of chimeric Alu elements at the breakpoints. Other genomic research applications, such as haplotype phasing and small insertion and deletion validation could also benefit from this technology. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1370-2) contains supplementary material, which is available to authorized users.
- Full Text
- View/download PDF
48. Characterization of single-nucleotide variation in Indian-origin rhesus macaques (Macaca mulatta)
- Author
-
Ned H. Kalin, Yanru Ren, Jeffrey Rogers, Donna M. Muzny, Muthuswamy Raveendran, Aleksandar Milosavljevic, Gloria L. Fawcett, Ronald A. Harris, David K. Chen, David Rio Deiros, David A. Wheeler, Fuli Yu, Steven E. Shelton, Jeffrey G. Reid, Richard A. Gibbs, and Kimberly C Worley
- Subjects
lcsh:QH426-470 ,lcsh:Biotechnology ,India ,Single-nucleotide polymorphism ,Computational biology ,Genome ,Macaque ,Polymorphism, Single Nucleotide ,03 medical and health sciences ,0302 clinical medicine ,Species Specificity ,Genetic linkage ,single nucleotide polymorphism ,lcsh:TP248.13-248.65 ,biology.animal ,Genetic variation ,common variants ,SOLiD™ ,Genetics ,SNP ,Animals ,030304 developmental biology ,Whole genome sequencing ,0303 health sciences ,biology ,Sequence Analysis, DNA ,biology.organism_classification ,Macaca mulatta ,lcsh:Genetics ,Rhesus macaque ,genetic variation ,030217 neurology & neurosurgery ,Research Article ,rhesus macaque ,Biotechnology - Abstract
Background: Rhesus macaques are the most widely utilized nonhuman primate model in biomedical research. Previous efforts have validated fewer than 900 single nucleotide polymorphisms (SNPs) in this species, which limits opportunities for genetic studies related to health and disease. Extensive information about SNPs and other genetic variation in rhesus macaques would facilitate valuable genetic analyses, as well as provide markers for genome-wide linkage analysis and the genetic management of captive breeding colonies. Results: We used the available rhesus macaque draft genome sequence, new sequence data from unrelated individuals and existing published sequence data to create a genome-wide SNP resource for Indian-origin rhesus monkeys. The original reference animal and two additional Indian-origin individuals were resequenced to low coverage using SOLiD™ sequencing. We then used three strategies to validate SNPs: comparison of potential SNPs found in the same individual using two different sequencing chemistries, and comparison of potential SNPs in different individuals identified with either the same or different sequencing chemistries. Our approach validated approximately 3 million SNPs distributed across the genome. Preliminary analysis of SNP annotations suggests that a substantial number of these macaque SNPs may have functional effects. More than 700 non-synonymous SNPs were scored by Polyphen-2 as either possibly or probably damaging to protein function and these variants now constitute potential models for studying functional genetic variation relevant to human physiology and disease. Conclusions: Resequencing of a small number of animals identified greater than 3 million SNPs. This provides a significant new information resource for rhesus macaques, an important research animal. The data also suggests that overall genetic variation is high in this species.
We identified many potentially damaging non-synonymous coding SNPs, providing new opportunities to identify rhesus models for human disease.
- Full Text
- View/download PDF
49. Translational signatures and mRNA levels are highly correlated in human stably expressed genes
- Author
-
Fuli Yu, Ana Paula de Souza, Xiaoming Liu, and Sergio R P Line
- Subjects
Untranslated region ,Genetics ,education.field_of_study ,DNA, Complementary ,Base Sequence ,Genome, Human ,Population ,Gene Expression ,Biology ,Evolution, Molecular ,RNA, Transfer ,Protein Biosynthesis ,Databases, Genetic ,Gene expression ,Protein biosynthesis ,Humans ,Human genome ,RNA, Messenger ,Copy-number variation ,DNA microarray ,education ,Gene ,Research Article ,Biotechnology - Abstract
Background: Gene expression is one of the most relevant biological processes of living cells. Due to the relatively small population sizes, it is predicted that human gene sequences are not strongly influenced by selection towards expression efficiency. One of the major problems in estimating to what extent gene characteristics can be selected to maximize expression efficiency is the wide variation that exists in RNA and protein levels among physiological states and different tissues. Analyses of datasets of stably expressed genes (i.e. with consistent expression between physiological states and tissues) would provide more accurate and reliable measurements of associations between variations of a specific gene characteristic and expression, and how distinct gene features work to optimize gene expression. Results: Using a dataset of human genes with consistent expression between physiological states we selected gene sequence signatures related to translation that can predict about 42% of mRNA variation. The prediction can be increased to 51% when selecting genes that are stably expressed in more than 1 tissue. These genes are enriched for translation and ribosome biosynthesis processes and have higher translation efficiency scores, smaller coding sequences and 3′ UTR sizes and lower folding energies when compared to other datasets. Additionally, the amino acid frequencies weighted by expression showed higher correlations with isoacceptor tRNA gene copy number, and smaller absolute correlation values with biosynthetic costs. Conclusion: Our results indicate that human gene sequence characteristics related to transcription and translation processes can co-evolve in an integrated manner in order to optimize gene expression.
- Full Text
- View/download PDF
50. Launching genomics into the cloud: deployment of Mercury, a next generation sequence analysis pipeline
- Author
-
Simon D. M. White, Jeffrey G. Reid, Matthew N. Bainbridge, Richard A. Gibbs, William J Salerno, Richard Daly, Christian J. Buhay, Donna M. Muzny, Andreas Sundquist, Eric Boerwinkle, Andrew Carroll, Fuli Yu, Narayanan Veeraraghavan, Mahmoud Dahdouli, Geoff Duyk, and Adam C. English
- Subjects
Annotation ,Distributed computing ,Cloud computing ,Biology ,Computational resource ,Biochemistry ,World Wide Web ,03 medical and health sciences ,Software ,Structural Biology ,Variant calling ,Humans ,Clinical sequencing ,Massively parallel ,Molecular Biology ,030304 developmental biology ,Internet ,0303 health sciences ,NGS data ,Genome ,business.industry ,Methodology Article ,Applied Mathematics ,030305 genetics & heredity ,High-Throughput Nucleotide Sequencing ,Genomics ,Computer Science Applications ,Workflow ,Software deployment ,Scalability ,The Internet ,business - Abstract
Background: Massively parallel DNA sequencing generates staggering amounts of data. Decreasing cost, increasing throughput, and improved annotation have expanded the diversity of genomics applications in research and clinical practice. This expanding scale creates analytical challenges: accommodating peak compute demand, coordinating secure access for multiple analysts, and sharing validated tools and results. Results: To address these challenges, we have developed the Mercury analysis pipeline and deployed it in local hardware and the Amazon Web Services cloud via the DNAnexus platform. Mercury is an automated, flexible, and extensible analysis workflow that provides accurate and reproducible genomic results at scales ranging from individuals to large cohorts. Conclusions: By taking advantage of cloud computing and with Mercury implemented on the DNAnexus platform, we have demonstrated a powerful combination of a robust and fully validated software pipeline and a scalable computational resource that, to date, we have applied to more than 10,000 whole genome and whole exome samples.
- Full Text
- View/download PDF