Comprehensive functional genomic resource and integrative model for the human brain

Authors :
Mette A. Peters
Prashant Emani
Kiran Girdhar
Gabriel E. Hoffman
Michael J. Gandal
Yan Jiang
Min Xu
Declan Clarke
Aparna Nathan
Shuang Liu
Schahram Akbarian
Jill Moore
Jonathan J. Park
Selim Kalayci
Chengfei Yan
Hyejung Won
Mark Gerstein
Shaoke Lou
Zhiping Weng
Holly Zhou
Eugenio Mattei
Daifeng Wang
Zeynep H. Gümüş
Tonya M. Brunetti
Gregory E. Crawford
Xu Shi
Daniel H. Geschwind
James A. Knowles
Kasidet Manakongtreecheep
Dominic Fitzgerald
Andrew E. Jaffe
Fabio C. P. Navarro
Nenad Sestan
Yucheng T. Yang
Kevin P. White
Jing Zhang
Jonathan Warrell
Suhn K. Rhie
Panos Roussos
Mengting Gu
Source :
Science (New York, N.Y.), vol 362, iss 6420
Publication Year :
2018
Publisher :
eScholarship, University of California, 2018.

Abstract

INTRODUCTION Strong genetic associations have been found for a number of psychiatric disorders. However, understanding the underlying molecular mechanisms remains challenging.

RATIONALE To address this challenge, the PsychENCODE Consortium has developed a comprehensive online resource and integrative models for the functional genomics of the human brain.

RESULTS The base of the pyramidal resource is the datasets generated by PsychENCODE, including bulk transcriptome, chromatin, genotype, and Hi-C datasets and single-cell transcriptomic data from ~32,000 cells for major brain regions. We have merged these with data from Genotype-Tissue Expression (GTEx), ENCODE, Roadmap Epigenomics, and single-cell analyses. Via uniform processing, we created a harmonized resource, allowing us to survey functional genomics data on the brain over a sample size of 1866 individuals.

From this uniformly processed dataset, we created derived data products. These include lists of brain-expressed genes, coexpression modules, and single-cell expression profiles for many brain cell types; ~79,000 brain-active enhancers with associated Hi-C loops and topologically associating domains; and ~2.5 million expression quantitative-trait loci (QTLs) comprising ~238,000 linkage-disequilibrium–independent single-nucleotide polymorphisms, as well as other types of QTLs associated with splice isoforms, cell fractions, and chromatin activity. By using these, we found that >88% of the cross-population variation in brain gene expression can be accounted for by cell fraction changes. Furthermore, a number of disorders and aging are associated with changes in cell-type proportions. The derived data also enable comparison between the brain and other tissues. In particular, by using spectral analyses, we found that the brain has distinct expression and epigenetic patterns, including a greater extent of noncoding transcription than other tissues.
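The statement that cell fraction changes account for most of the variation in bulk expression can be read through a linear mixing view of tissue: the bulk expression of a gene in one individual is approximated by the sum, over cell types, of that cell type's expression signature times its fraction in the sample. The following is a minimal sketch of that idea, not the consortium's pipeline. The function names, the toy matrices, and the pooled 1 - SS_res/SS_tot statistic are illustrative assumptions, and the fractions are taken as given rather than estimated.

```python
def reconstruct(signatures, fractions):
    """Predict bulk expression as signatures (genes x cell types) times
    fractions (cell types x individuals)."""
    n_genes = len(signatures)
    n_types = len(fractions)
    n_indiv = len(fractions[0])
    return [
        [sum(signatures[g][c] * fractions[c][i] for c in range(n_types))
         for i in range(n_indiv)]
        for g in range(n_genes)
    ]


def variance_explained(bulk, signatures, fractions):
    """Share of across-individual variance in bulk expression captured by
    the cell-fraction reconstruction (1 - residual SS / total SS),
    pooled over all genes."""
    predicted = reconstruct(signatures, fractions)
    ss_res = 0.0
    ss_tot = 0.0
    for observed_row, predicted_row in zip(bulk, predicted):
        gene_mean = sum(observed_row) / len(observed_row)
        for obs, pred in zip(observed_row, predicted_row):
            ss_res += (obs - pred) ** 2
            ss_tot += (obs - gene_mean) ** 2
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: 3 genes, 2 cell types, 4 individuals.
signatures = [[10.0, 1.0], [2.0, 8.0], [5.0, 5.0]]
fractions = [[0.7, 0.6, 0.4, 0.3], [0.3, 0.4, 0.6, 0.7]]
# Hypothetical residual expression not explained by cell composition.
noise = [[0.1, -0.1, 0.0, 0.1], [0.0, 0.2, -0.1, 0.0], [0.1, 0.0, 0.0, -0.1]]
bulk = [
    [p + e for p, e in zip(p_row, e_row)]
    for p_row, e_row in zip(reconstruct(signatures, fractions), noise)
]

print(round(variance_explained(bulk, signatures, fractions), 3))
```

In practice the fractions would themselves have to be estimated, for example by constrained regression of bulk profiles on single-cell signatures; that step is omitted here.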
The top level of the resource consists of integrative networks for regulation and machine-learning models for disease prediction. The networks include a full gene regulatory network (GRN) for the brain, linking transcription factors, enhancers, and target genes from merging of the QTLs, generalized element-activity correlations, and Hi-C data. By using this network, we link disease genes to genome-wide association study (GWAS) variants for psychiatric disorders. For schizophrenia, we linked 321 genes to the 142 reported GWAS loci. We then embedded the regulatory network into a deep-learning model to predict psychiatric phenotypes from genotype and expression. Our model gives a ~6-fold improvement in prediction over additive polygenic risk scores. Moreover, it achieves a ~3-fold improvement over additive models, even when the gene expression data are imputed, highlighting the value of having just a small amount of transcriptome data for disease prediction. Lastly, it highlights key genes and pathways associated with disorder prediction, including immunological, synaptic, and metabolic pathways, recapitulating de novo results from more targeted analyses.

CONCLUSION Our resource and integrative analyses have uncovered genomic elements and networks in the brain, which in turn have provided insight into the molecular mechanisms underlying psychiatric disorders. Our deep-learning model improves disease risk prediction over traditional approaches and can be extended with additional data types (e.g., microRNA and neuroimaging).

A comprehensive functional genomic resource for the adult human brain. The resource forms a three-layer pyramid. The bottom layer includes sequencing datasets for traits, such as schizophrenia. The middle layer represents derived datasets, including functional genomic elements and QTLs. The top layer contains integrated models, which link genotypes to phenotypes. DSPN, Deep Structured Phenotype Network; PC1 and PC2, principal components 1 and 2; ref, reference; alt, alternate; H3K27ac, histone H3 acetylation at lysine 27.
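The abstract compares its deep-learning model against additive polygenic risk scores. As a point of reference, an additive score in its generic textbook form is a weighted sum of allele dosages, one term per variant. The sketch below illustrates only that baseline under stated assumptions: the function name, effect sizes, and genotypes are hypothetical and are not taken from the paper.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Additive score: sum over variants of effect size times allele dosage
    (dosage is the count of risk alleles, 0, 1, or 2)."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(beta * dose for beta, dose in zip(effect_sizes, dosages))


# Hypothetical per-variant effect sizes (for example, log odds ratios).
effect_sizes = [0.12, -0.05, 0.30]

# Hypothetical genotypes for two individuals at the same three variants.
individuals = {
    "sample_a": [0, 1, 2],
    "sample_b": [2, 2, 0],
}

scores = {
    name: polygenic_risk_score(dosages, effect_sizes)
    for name, dosages in individuals.items()
}
print(scores)
```

Because each variant contributes independently, a score of this form cannot capture interactions among variants or intermediate layers such as gene expression, which is the kind of structure the abstract's network-based model is described as adding.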

Details

Database :
OpenAIRE
Journal :
Science (New York, N.Y.), vol 362, iss 6420
Accession number :
edsair.doi.dedup.....fc771f4f7a09eb8b359d15c2a0d098cd