Comprehensive functional genomic resource and integrative model for the human brain
- Source :
- Science (New York, N.Y.), vol 362, iss 6420
- Publication Year :
- 2018
- Publisher :
- eScholarship, University of California, 2018.
Abstract
- INTRODUCTION Strong genetic associations have been found for a number of psychiatric disorders. However, understanding the underlying molecular mechanisms remains challenging.
RATIONALE To address this challenge, the PsychENCODE Consortium has developed a comprehensive online resource and integrative models for the functional genomics of the human brain.
RESULTS The base of the pyramidal resource consists of the datasets generated by PsychENCODE, including bulk transcriptome, chromatin, genotype, and Hi-C datasets and single-cell transcriptomic data from ~32,000 cells for major brain regions. We have merged these with data from Genotype-Tissue Expression (GTEx), ENCODE, Roadmap Epigenomics, and single-cell analyses. Via uniform processing, we created a harmonized resource, allowing us to survey functional genomics data on the brain over a sample size of 1866 individuals.
From this uniformly processed dataset, we created derived data products. These include lists of brain-expressed genes, coexpression modules, and single-cell expression profiles for many brain cell types; ~79,000 brain-active enhancers with associated Hi-C loops and topologically associating domains; and ~2.5 million expression quantitative-trait loci (QTLs) comprising ~238,000 linkage-disequilibrium–independent single-nucleotide polymorphisms, together with other types of QTLs associated with splice isoforms, cell fractions, and chromatin activity.
By using these, we found that >88% of the cross-population variation in brain gene expression can be accounted for by cell fraction changes. Furthermore, a number of disorders and aging are associated with changes in cell-type proportions. The derived data also enable comparison between the brain and other tissues. In particular, by using spectral analyses, we found that the brain has distinct expression and epigenetic patterns, including a greater extent of noncoding transcription than other tissues.
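
As a rough illustration of what "variation accounted for by cell fraction changes" can mean, the sketch below models the bulk expression of one gene as a fraction-weighted sum of cell-type signatures and reports the share of variance that this model explains. It is a minimal toy under stated assumptions, not the consortium's procedure: the function names `predict_bulk` and `variance_explained`, the two-cell-type setup, and every number are hypothetical and chosen only for readability.

```python
# Toy illustration only: all names and numbers are invented, not PsychENCODE data.

def predict_bulk(fractions, signature):
    """Bulk expression of one gene as a fraction-weighted sum of cell-type signatures."""
    return sum(f * s for f, s in zip(fractions, signature))


def variance_explained(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return 1.0 - ss_res / ss_tot


# Hypothetical expression of one gene in two cell types (e.g. neuron, glia).
signature = [10.0, 2.0]

# Hypothetical cell fractions for four individuals (each row sums to 1).
cell_fractions = [[0.6, 0.4], [0.5, 0.5], [0.7, 0.3], [0.4, 0.6]]

# Hypothetical measured bulk expression for the same four individuals.
observed_bulk = [6.9, 6.1, 7.5, 5.3]

predicted_bulk = [predict_bulk(f, signature) for f in cell_fractions]
r_squared = variance_explained(observed_bulk, predicted_bulk)
print(f"Share of variance explained by cell fractions: {r_squared:.3f}")
```

In this toy case almost all of the between-individual variation comes from the shifting cell mix, because the per-cell-type expression is held fixed and only the fractions differ.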
The top level of the resource consists of integrative networks for regulation and machine-learning models for disease prediction. The networks include a full gene regulatory network (GRN) for the brain, linking transcription factors, enhancers, and target genes from merging of the QTLs, generalized element-activity correlations, and Hi-C data. By using this network, we link disease genes to genome-wide association study (GWAS) variants for psychiatric disorders. For schizophrenia, we linked 321 genes to the 142 reported GWAS loci. We then embedded the regulatory network into a deep-learning model to predict psychiatric phenotypes from genotype and expression. Our model gives a ~6-fold improvement in prediction over additive polygenic risk scores. Moreover, it achieves a ~3-fold improvement over additive models, even when the gene expression data are imputed, highlighting the value of having just a small amount of transcriptome data for disease prediction. Lastly, it highlights key genes and pathways associated with disorder prediction, including immunological, synaptic, and metabolic pathways, recapitulating de novo results from more targeted analyses.
CONCLUSION Our resource and integrative analyses have uncovered genomic elements and networks in the brain, which in turn have provided insight into the molecular mechanisms underlying psychiatric disorders. Our deep-learning model improves disease risk prediction over traditional approaches and can be extended with additional data types (e.g., microRNA and neuroimaging).
A comprehensive functional genomic resource for the adult human brain. The resource forms a three-layer pyramid. The bottom layer includes sequencing datasets for traits, such as schizophrenia. The middle layer represents derived datasets, including functional genomic elements and QTLs. The top layer contains integrated models, which link genotypes to phenotypes.
DSPN, Deep Structured Phenotype Network; PC1 and PC2, principal components 1 and 2; ref, reference; alt, alternate; H3K27ac, histone H3 acetylation at lysine 27.
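
For readers unfamiliar with the baseline mentioned in the abstract, an additive polygenic risk score is conventionally a weighted sum of risk-allele counts, with per-variant effect sizes as the weights. The sketch below shows that general idea only; the function name `polygenic_risk_score`, the three variants, and their effect sizes are hypothetical, and it is not the consortium's scoring pipeline or the DSPN model.

```python
# Minimal sketch of an additive polygenic risk score (PRS).
# All values are invented for illustration; they are not from any real GWAS.

def polygenic_risk_score(dosages, effect_sizes):
    """Sum over variants of (risk-allele count) x (per-variant effect size)."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical per-variant effect sizes (e.g. log odds ratios).
effect_sizes = [0.12, -0.05, 0.30]

# Hypothetical genotype of one individual: copies of the risk allele (0, 1, or 2).
dosages = [2, 1, 0]

score = polygenic_risk_score(dosages, effect_sizes)
print(f"Additive PRS: {score:.2f}")
```

Because each variant contributes independently, a score of this form cannot capture interactions or intermediate molecular layers such as gene expression, which is the kind of structure the abstract's network-based model adds.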
- Subjects :
- 0301 basic medicine
Epigenomics
Enhancer Elements
General Science & Technology
1.1 Normal biological development and functioning
Quantitative Trait Loci
Gene regulatory network
Datasets as Topic
Genome-wide association study
Computational biology
PsychENCODE Consortium
Quantitative trait locus
Biology
Genome
Article
Epigenesis, Genetic
03 medical and health sciences
Deep Learning
Genetic
Underpinning research
Genetics
Humans
2.1 Biological and endogenous factors
Gene Regulatory Networks
Aetiology
Gene
Epigenesis
Regulation of gene expression
Multidisciplinary
Mental Disorders
Human Genome
Neurosciences
Brain
Brain Disorders
Enhancer Elements, Genetic
030104 developmental biology
Mental Health
Gene Expression Regulation
Schizophrenia
Single-Cell Analysis
Transcriptome
Genome-Wide Association Study
Biotechnology
- Database :
- OpenAIRE
- Journal :
- Science (New York, N.Y.), vol 362, iss 6420
- Accession number :
- edsair.doi.dedup.....fc771f4f7a09eb8b359d15c2a0d098cd