864 results for "Gerstein, Mark B"
Search Results
2. REliable PIcking by Consensus (REPIC): a consensus methodology for harnessing multiple cryo-EM particle pickers
- Author
Cameron, Christopher J. F., Seager, Sebastian J. H., Sigworth, Fred J., Tagare, Hemant D., and Gerstein, Mark B.
- Published
- 2024
3. Improved prediction of ligand-protein binding affinities by meta-modeling
- Author
Lee, Ho-Joon, Emani, Prashant S., and Gerstein, Mark B.
- Subjects
Computer Science - Machine Learning, Quantitative Biology - Quantitative Methods
- Abstract
The accurate screening of candidate drug ligands against target proteins through computational approaches is of prime interest to drug development efforts. Such virtual screening depends in part on methods to predict the binding affinity between ligands and proteins. Many computational models for binding affinity prediction have been developed, but with varying results across targets. Given that ensembling or meta-modeling approaches have shown great promise in reducing model-specific biases, we develop a framework to integrate published force-field-based empirical docking and sequence-based deep learning models. In building this framework, we evaluate many combinations of individual base models, training databases, and several meta-modeling approaches. We show that many of our meta-models significantly improve affinity predictions over base models. Our best meta-models achieve comparable performance to state-of-the-art deep learning tools exclusively based on 3D structures, while allowing for improved database scalability and flexibility through the explicit inclusion of features such as physicochemical properties or molecular descriptors. We further demonstrate improved generalization capability by our models using a large-scale benchmark of affinity prediction as well as a virtual screening application benchmark. Overall, we demonstrate that diverse modeling approaches can be ensembled together to gain meaningful improvement in binding affinity prediction.
Comment: 54 pages, 6 main tables, 6 main figures, 8 supplementary figures, and supporting information. For 11 supplementary tables and code, see https://github.com/Lee1701/Lee2023a
- Published
- 2023
4. exRNA-eCLIP intersection analysis reveals a map of extracellular RNA binding proteins and associated RNAs across major human biofluids and carriers
- Author
LaPlante, Emily L, Stürchler, Alessandra, Fullem, Robert, Chen, David, Starner, Anne C, Esquivel, Emmanuel, Alsop, Eric, Jackson, Andrew R, Ghiran, Ionita, Pereira, Getulio, Rozowsky, Joel, Chang, Justin, Gerstein, Mark B, Alexander, Roger P, Roth, Matthew E, Franklin, Jeffrey L, Coffey, Robert J, Raffai, Robert L, Mansuy, Isabelle M, Stavrakis, Stavros, deMello, Andrew J, Laurent, Louise C, Wang, Yi-Ting, Tsai, Chia-Feng, Liu, Tao, Jones, Jennifer, Van Keuren-Jensen, Kendall, Van Nostrand, Eric, Mateescu, Bogdan, and Milosavljevic, Aleksandar
- Subjects
Biological Sciences, Bioinformatics and Computational Biology, Genetics, Human Genome, Biotechnology, Underpinning research, 1.1 Normal biological development and functioning, Generic health relevance, NIH ERCC, RNA binding proteins, RNA footprint correlation, cell-free RNAs, cell-free biomarkers, eCLIP, exRNA carriers, human biofluids, liquid biopsies, public resource
- Abstract
Although the role of RNA binding proteins (RBPs) in extracellular RNA (exRNA) biology is well established, their exRNA cargo and distribution across biofluids are largely unknown. To address this gap, we extend the exRNA Atlas resource by mapping exRNAs carried by extracellular RBPs (exRBPs). This map was developed through an integrative analysis of ENCODE enhanced crosslinking and immunoprecipitation (eCLIP) data (150 RBPs) and human exRNA profiles (6,930 samples). Computational analysis and experimental validation identified exRBPs in plasma, serum, saliva, urine, cerebrospinal fluid, and cell-culture-conditioned medium. exRBPs carry exRNA transcripts from small non-coding RNA biotypes, including microRNA (miRNA), piRNA, tRNA, small nuclear RNA (snRNA), small nucleolar RNA (snoRNA), Y RNA, and lncRNA, as well as protein-coding mRNA fragments. Computational deconvolution of exRBP RNA cargo reveals associations of exRBPs with extracellular vesicles, lipoproteins, and ribonucleoproteins across human biofluids. Overall, we mapped the distribution of exRBPs across human biofluids, presenting a resource for the community.
- Published
- 2023
5. Phase 2 of extracellular RNA communication consortium charts next-generation approaches for extracellular RNA research
- Author
Mateescu, Bogdan, Jones, Jennifer C, Alexander, Roger P, Alsop, Eric, An, Ji Yeong, Asghari, Mohammad, Boomgarden, Alex, Bouchareychas, Laura, Cayota, Alfonso, Chang, Hsueh-Chia, Charest, Al, Chiu, Daniel T, Coffey, Robert J, Das, Saumya, De Hoff, Peter, deMello, Andrew, D’Souza-Schorey, Crislyn, Elashoff, David, Eliato, Kiarash R, Franklin, Jeffrey L, Galas, David J, Gerstein, Mark B, Ghiran, Ionita H, Go, David B, Gould, Stephen, Grogan, Tristan R, Higginbotham, James N, Hladik, Florian, Huang, Tony Jun, Huo, Xiaoye, Hutchins, Elizabeth, Jeppesen, Dennis K, Jovanovic-Talisman, Tijana, Kim, Betty YS, Kim, Sung, Kim, Kyoung-Mee, Kim, Yong, Kitchen, Robert R, Knouse, Vaughan, LaPlante, Emily L, Lebrilla, Carlito B, Lee, L James, Lennon, Kathleen M, Li, Guoping, Li, Feng, Li, Tieyi, Liu, Tao, Liu, Zirui, Maddox, Adam L, McCarthy, Kyle, Meechoovet, Bessie, Maniya, Nalin, Meng, Yingchao, Milosavljevic, Aleksandar, Min, Byoung-Hoon, Morey, Amber, Ng, Martin, Nolan, John, De Oliveira, Getulio P, Paulaitis, Michael E, Phu, Tuan Anh, Raffai, Robert L, Reátegui, Eduardo, Roth, Matthew E, Routenberg, David A, Rozowsky, Joel, Rufo, Joseph, Senapati, Satyajyoti, Shachar, Sigal, Sharma, Himani, Sood, Anil K, Stavrakis, Stavros, Stürchler, Alessandra, Tewari, Muneesh, Tosar, Juan P, Tucker-Schwartz, Alexander K, Turchinovich, Andrey, Valkov, Nedyalka, Van Keuren-Jensen, Kendall, Vickers, Kasey C, Vojtech, Lucia, Vreeland, Wyatt N, Wang, Ceming, Wang, Kai, Wang, ZeYu, Welsh, Joshua A, Witwer, Kenneth W, Wong, David TW, Xia, Jianping, Xie, Ya-Hong, Yang, Kaichun, Zaborowski, Mikołaj P, Zhang, Chenguang, Zhang, Qin, Zivkovic, Angela M, and Laurent, Louise C
- Subjects
Biological Sciences, Biomedical and Clinical Sciences, Genetics, Biochemistry, Biological sciences, Cell biology, Molecular biology
- Abstract
The extracellular RNA communication consortium (ERCC) is an NIH-funded program aiming to promote the development of new technologies, resources, and knowledge about exRNAs and their carriers. After Phase 1 (2013-2018), Phase 2 of the program (ERCC2, 2019-2023) aims to fill critical gaps in knowledge and technology to enable rigorous and reproducible methods for separation and characterization of both bulk populations of exRNA carriers and single EVs. ERCC2 investigators are also developing new bioinformatic pipelines to promote data integration through the exRNA atlas database. ERCC2 has established several Working Groups (Resource Sharing, Reagent Development, Data Analysis and Coordination, Technology Development, nomenclature, and Scientific Outreach) to promote collaboration between ERCC2 members and the broader scientific community. We expect that ERCC2's current and future achievements will significantly improve our understanding of exRNA biology and the development of accurate and efficient exRNA-based diagnostic, prognostic, and theranostic biomarker assays.
- Published
- 2022
6. Author Correction: Perspectives on ENCODE
- Author
Snyder, Michael P, Gingeras, Thomas R, Moore, Jill E, Weng, Zhiping, Gerstein, Mark B, Ren, Bing, Hardison, Ross C, Stamatoyannopoulos, John A, Graveley, Brenton R, Feingold, Elise A, Pazin, Michael J, Pagan, Michael, Gilchrist, Daniel A, Hitz, Benjamin C, Cherry, J Michael, Bernstein, Bradley E, Mendenhall, Eric M, Zerbino, Daniel R, Frankish, Adam, Flicek, Paul, and Myers, Richard M
- Subjects
ENCODE Project Consortium, General Science & Technology
- Abstract
In this Article, the authors Rizi Ai (Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, USA) and Shantao Li (Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA) were mistakenly omitted from the ENCODE Project Consortium author list. The original Article has been corrected online.
- Published
- 2022
7. Author Correction: Expanded encyclopaedias of DNA elements in the human and mouse genomes
- Author
Moore, Jill E, Purcaro, Michael J, Pratt, Henry E, Epstein, Charles B, Shoresh, Noam, Adrian, Jessika, Kawli, Trupti, Davis, Carrie A, Dobin, Alexander, Kaul, Rajinder, Halow, Jessica, Van Nostrand, Eric L, Freese, Peter, Gorkin, David U, Shen, Yin, He, Yupeng, Mackiewicz, Mark, Pauli-Behn, Florencia, Williams, Brian A, Mortazavi, Ali, Keller, Cheryl A, Zhang, Xiao-Ou, Elhajjajy, Shaimae I, Huey, Jack, Dickel, Diane E, Snetkova, Valentina, Wei, Xintao, Wang, Xiaofeng, Rivera-Mulia, Juan Carlos, Rozowsky, Joel, Zhang, Jing, Chhetri, Surya B, Zhang, Jialing, Victorsen, Alec, White, Kevin P, Visel, Axel, Yeo, Gene W, Burge, Christopher B, Lécuyer, Eric, Gilbert, David M, Dekker, Job, Rinn, John, Mendenhall, Eric M, Ecker, Joseph R, Kellis, Manolis, Klein, Robert J, Noble, William S, Kundaje, Anshul, Guigó, Roderic, Farnham, Peggy J, Cherry, J Michael, Myers, Richard M, Ren, Bing, Graveley, Brenton R, Gerstein, Mark B, Pennacchio, Len A, Snyder, Michael P, Bernstein, Bradley E, Wold, Barbara, Hardison, Ross C, Gingeras, Thomas R, Stamatoyannopoulos, John A, and Weng, Zhiping
- Subjects
ENCODE Project Consortium, General Science & Technology
- Abstract
In the version of this article initially published, two members of the ENCODE Project Consortium were missing from the author list. Rizi Ai (Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, USA) and Shantao Li (Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA) are now included in the author list. These errors have been corrected in the online version of the article.
- Published
- 2022
8. Massively parallel reporter assay investigates shared genetic variants of eight psychiatric disorders
- Author
Lee, Sool, McAfee, Jessica C., Lee, Jiseok, Gomez, Alejandro, Ledford, Austin T., Clarke, Declan, Min, Hyunggyu, Gerstein, Mark B., Boyle, Alan P., Sullivan, Patrick F., Beltran, Adriana S., and Won, Hyejung
- Published
- 2024
9. Quantum Computing at the Frontiers of Biological Sciences
- Author
Emani, Prashant S., Warrell, Jonathan, Anticevic, Alan, Bekiranov, Stefan, Gandal, Michael, McConnell, Michael J., Sapiro, Guillermo, Aspuru-Guzik, Alán, Baker, Justin, Bastiani, Matteo, McClure, Patrick, Murray, John, Sotiropoulos, Stamatios N, Taylor, Jacob, Senthil, Geetha, Lehner, Thomas, Gerstein, Mark B., and Harrow, Aram W.
- Subjects
Quantum Physics, Quantitative Biology - Genomics, Quantitative Biology - Neurons and Cognition, Quantitative Biology - Quantitative Methods
- Abstract
The search for meaningful structure in biological data has relied on cutting-edge advances in computational technology and data science methods. However, challenges arise as we push the limits of scale and complexity in biological problems. Innovation in massively parallel, classical computing hardware and algorithms continues to address many of these challenges, but there is a need to simultaneously consider new paradigms to circumvent current barriers to processing speed. Accordingly, we articulate a view towards quantum computation and quantum information science, where algorithms have demonstrated potential polynomial and exponential computational speedups in certain applications, such as machine learning. The maturation of the field of quantum computing, in hardware and algorithm development, also coincides with the growth of several collaborative efforts to address questions across length and time scales, and scientific disciplines. We use this coincidence to explore the potential for quantum computing to aid in one such endeavor: the merging of insights from genetics, genomics, neuroimaging and behavioral phenotyping. By examining joint opportunities for computational innovation across fields, we highlight the need for a common language between biological data analysis and quantum computing. Ultimately, we consider current and future prospects for the employment of quantum computing algorithms in the biological sciences.
Comment: 22 pages, 3 figures, Perspective
- Published
- 2019
10. Author Correction: Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples.
- Author
Bailey, Matthew H, Meyerson, William U, Dursi, Lewis Jonathan, Wang, Liang-Bo, Dong, Guanlan, Liang, Wen-Wei, Weerasinghe, Amila, Li, Shantao, Li, Yize, Kelso, Sean, MC3 Working Group, PCAWG novel somatic mutation calling methods working group, Saksena, Gordon, Ellrott, Kyle, Wendl, Michael C, Wheeler, David A, Getz, Gad, Simpson, Jared T, Gerstein, Mark B, Ding, Li, and PCAWG Consortium
- Subjects
MC3 Working Group, PCAWG novel somatic mutation calling methods working group, PCAWG Consortium
- Abstract
Correction to this paper has been published: https://doi.org/10.1038/s41467-020-20128-w.
- Published
- 2020
11. Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples.
- Author
Bailey, Matthew H, Meyerson, William U, Dursi, Lewis Jonathan, Wang, Liang-Bo, Dong, Guanlan, Liang, Wen-Wei, Weerasinghe, Amila, Li, Shantao, Li, Yize, Kelso, Sean, MC3 Working Group, PCAWG novel somatic mutation calling methods working group, Saksena, Gordon, Ellrott, Kyle, Wendl, Michael C, Wheeler, David A, Getz, Gad, Simpson, Jared T, Gerstein, Mark B, Ding, Li, and PCAWG Consortium
- Subjects
MC3 Working Group, PCAWG novel somatic mutation calling methods working group, PCAWG Consortium, Humans, Neoplasms, DNA, Intergenic, Retrospective Studies, Base Composition, Mutation, Genome, Human, Exons, Databases, Genetic, Exome, Whole Genome Sequencing, Whole Exome Sequencing, Cancer, Biotechnology, Genetics, Human Genome
- Abstract
The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF
- Published
- 2020
12. Perspectives on ENCODE
- Author
Snyder, Michael P, Gingeras, Thomas R, Moore, Jill E, Weng, Zhiping, Gerstein, Mark B, Ren, Bing, Hardison, Ross C, Stamatoyannopoulos, John A, Graveley, Brenton R, Feingold, Elise A, Pazin, Michael J, Pagan, Michael, Gilchrist, Daniel A, Hitz, Benjamin C, Cherry, J Michael, Bernstein, Bradley E, Mendenhall, Eric M, Zerbino, Daniel R, Frankish, Adam, Flicek, Paul, and Myers, Richard M
- Subjects
Biological Sciences, Bioinformatics and Computational Biology, Genetics, Human Genome, Biotechnology, 1.1 Normal biological development and functioning, Animals, Binding Sites, Chromatin, DNA Methylation, Databases, Genetic, Gene Expression Regulation, Genome, Genome, Human, Genomics, Histones, Humans, Mice, Molecular Sequence Annotation, Quality Control, Regulatory Sequences, Nucleic Acid, Transcription Factors, ENCODE Project Consortium, General Science & Technology
- Abstract
The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may serve functional roles in regulating gene expression1. The project has been extended to model organisms, particularly the mouse. In the third phase of ENCODE, nearly a million and more than 300,000 cCRE annotations have been generated for human and mouse, respectively, and these have provided a valuable resource for the scientific community.
- Published
- 2020
13. Expanded encyclopaedias of DNA elements in the human and mouse genomes
- Author
Moore, Jill E, Purcaro, Michael J, Pratt, Henry E, Epstein, Charles B, Shoresh, Noam, Adrian, Jessika, Kawli, Trupti, Davis, Carrie A, Dobin, Alexander, Kaul, Rajinder, Halow, Jessica, Van Nostrand, Eric L, Freese, Peter, Gorkin, David U, Shen, Yin, He, Yupeng, Mackiewicz, Mark, Pauli-Behn, Florencia, Williams, Brian A, Mortazavi, Ali, Keller, Cheryl A, Zhang, Xiao-Ou, Elhajjajy, Shaimae I, Huey, Jack, Dickel, Diane E, Snetkova, Valentina, Wei, Xintao, Wang, Xiaofeng, Rivera-Mulia, Juan Carlos, Rozowsky, Joel, Zhang, Jing, Chhetri, Surya B, Zhang, Jialing, Victorsen, Alec, White, Kevin P, Visel, Axel, Yeo, Gene W, Burge, Christopher B, Lécuyer, Eric, Gilbert, David M, Dekker, Job, Rinn, John, Mendenhall, Eric M, Ecker, Joseph R, Kellis, Manolis, Klein, Robert J, Noble, William S, Kundaje, Anshul, Guigó, Roderic, Farnham, Peggy J, Cherry, J Michael, Myers, Richard M, Ren, Bing, Graveley, Brenton R, Gerstein, Mark B, Pennacchio, Len A, Snyder, Michael P, Bernstein, Bradley E, Wold, Barbara, Hardison, Ross C, Gingeras, Thomas R, Stamatoyannopoulos, John A, and Weng, Zhiping
- Subjects
Biological Sciences, Bioinformatics and Computational Biology, Genetics, Human Genome, 1.1 Normal biological development and functioning, Animals, Chromatin, DNA, DNA Footprinting, DNA Methylation, DNA Replication Timing, Databases, Genetic, Deoxyribonuclease I, Genome, Genome, Human, Genomics, Histones, Humans, Mice, Mice, Transgenic, Molecular Sequence Annotation, RNA-Binding Proteins, Registries, Regulatory Sequences, Nucleic Acid, Transcription, Genetic, Transposases, ENCODE Project Consortium, General Science & Technology
- Abstract
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE1 and Roadmap Epigenomics2 data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.
- Published
- 2020
14. Expanded encyclopaedias of DNA elements in the human and mouse genomes.
- Author
ENCODE Project Consortium, Moore, Jill E, Purcaro, Michael J, Pratt, Henry E, Epstein, Charles B, Shoresh, Noam, Adrian, Jessika, Kawli, Trupti, Davis, Carrie A, Dobin, Alexander, Kaul, Rajinder, Halow, Jessica, Van Nostrand, Eric L, Freese, Peter, Gorkin, David U, Shen, Yin, He, Yupeng, Mackiewicz, Mark, Pauli-Behn, Florencia, Williams, Brian A, Mortazavi, Ali, Keller, Cheryl A, Zhang, Xiao-Ou, Elhajjajy, Shaimae I, Huey, Jack, Dickel, Diane E, Snetkova, Valentina, Wei, Xintao, Wang, Xiaofeng, Rivera-Mulia, Juan Carlos, Rozowsky, Joel, Zhang, Jing, Chhetri, Surya B, Zhang, Jialing, Victorsen, Alec, White, Kevin P, Visel, Axel, Yeo, Gene W, Burge, Christopher B, Lécuyer, Eric, Gilbert, David M, Dekker, Job, Rinn, John, Mendenhall, Eric M, Ecker, Joseph R, Kellis, Manolis, Klein, Robert J, Noble, William S, Kundaje, Anshul, Guigó, Roderic, Farnham, Peggy J, Cherry, J Michael, Myers, Richard M, Ren, Bing, Graveley, Brenton R, Gerstein, Mark B, Pennacchio, Len A, Snyder, Michael P, Bernstein, Bradley E, Wold, Barbara, Hardison, Ross C, Gingeras, Thomas R, Stamatoyannopoulos, John A, and Weng, Zhiping
- Subjects
ENCODE Project Consortium, Chromatin, Animals, Mice, Transgenic, Humans, Mice, Deoxyribonuclease I, Transposases, RNA-Binding Proteins, Histones, DNA, Registries, DNA Footprinting, Genomics, DNA Methylation, DNA Replication Timing, Transcription, Genetic, Regulatory Sequences, Nucleic Acid, Genome, Genome, Human, Databases, Genetic, Molecular Sequence Annotation, Human Genome, HIV/AIDS, Vaccine Related, Biotechnology, Genetics, Immunization, Vaccine Related (AIDS), Prevention, 1.1 Normal biological development and functioning, Generic health relevance, General Science & Technology
- Abstract
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE1 and Roadmap Epigenomics2 data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.
- Published
- 2020
15. Perspectives on ENCODE.
- Author
ENCODE Project Consortium, Snyder, Michael P, Gingeras, Thomas R, Moore, Jill E, Weng, Zhiping, Gerstein, Mark B, Ren, Bing, Hardison, Ross C, Stamatoyannopoulos, John A, Graveley, Brenton R, Feingold, Elise A, Pazin, Michael J, Pagan, Michael, Gilchrist, Daniel A, Hitz, Benjamin C, Cherry, J Michael, Bernstein, Bradley E, Mendenhall, Eric M, Zerbino, Daniel R, Frankish, Adam, Flicek, Paul, and Myers, Richard M
- Subjects
ENCODE Project Consortium, Chromatin, Animals, Humans, Mice, Histones, Transcription Factors, Genomics, DNA Methylation, Gene Expression Regulation, Binding Sites, Regulatory Sequences, Nucleic Acid, Genome, Genome, Human, Quality Control, Databases, Genetic, Molecular Sequence Annotation, Human Genome, Vaccine Related, Biotechnology, Genetics, Immunization, Vaccine Related (AIDS), Prevention, 1.1 Normal biological development and functioning, Generic health relevance, General Science & Technology
- Abstract
The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may serve functional roles in regulating gene expression1. The project has been extended to model organisms, particularly the mouse. In the third phase of ENCODE, nearly a million and more than 300,000 cCRE annotations have been generated for human and mouse, respectively, and these have provided a valuable resource for the scientific community.
- Published
- 2020
16. The association between evening social media use and delayed sleep may be causal: Suggestive evidence from 120 million Reddit timestamps
- Author
Meyerson, William U., Fineberg, Sarah K., Andrade, Fernanda C., Corlett, Philip, Gerstein, Mark B., and Hoyle, Rick H.
- Published
- 2023
17. A Single-Cell Transcriptomic Atlas of Human Neocortical Development during Mid-gestation
- Author
Polioudakis, Damon, de la Torre-Ubieta, Luis, Langerman, Justin, Elkins, Andrew G, Shi, Xu, Stein, Jason L, Vuong, Celine K, Nichterwitz, Susanne, Gevorgian, Melinda, Opland, Carli K, Lu, Daning, Connell, William, Ruzzo, Elizabeth K, Lowe, Jennifer K, Hadzic, Tarik, Hinz, Flora I, Sabri, Shan, Lowry, William E, Gerstein, Mark B, Plath, Kathrin, and Geschwind, Daniel H
- Subjects
Biological Psychology, Biomedical and Clinical Sciences, Neurosciences, Psychology, Genetics, Brain Disorders, Mental Health, Stem Cell Research, Stem Cell Research - Nonembryonic - Human, Human Genome, 1.1 Normal biological development and functioning, 2.1 Biological and endogenous factors, Autism Spectrum Disorder, Cell Cycle, Cerebral Cortex, Databases, Genetic, Ependymoglial Cells, Epilepsy, Female, Gene Expression Profiling, Gene Expression Regulation, Developmental, Gene Regulatory Networks, Gestational Age, Humans, Intellectual Disability, Interneurons, Neocortex, Neural Stem Cells, Neurogenesis, Neurons, Pregnancy, Pregnancy Trimester, Second, RNA-Seq, Single-Cell Analysis, Telophase, autism, cortical development, differentiation, epilepsy, evolution, human, intellectual disability, neurogenesis, schizophrenia, subplate, Cognitive Sciences, Neurology & Neurosurgery, Biological psychology
- Abstract
We performed RNA sequencing on 40,000 cells to create a high-resolution single-cell gene expression atlas of developing human cortex, providing the first single-cell characterization of previously uncharacterized cell types, including human subplate neurons, comparisons with bulk tissue, and systematic analyses of technical factors. These data permit deconvolution of regulatory networks connecting regulatory elements and transcriptional drivers to single-cell gene expression programs, significantly extending our understanding of human neurogenesis, cortical evolution, and the cellular basis of neuropsychiatric disease. We tie cell-cycle progression with early cell fate decisions during neurogenesis, demonstrating that differentiation occurs on a transcriptomic continuum; rather than only expressing a few transcription factors that drive cell fates, differentiating cells express broad, mixed cell-type transcriptomes before telophase. By mapping neuropsychiatric disease genes to cell types, we implicate dysregulation of specific cell types in ASD, ID, and epilepsy. We developed CoDEx, an online portal to facilitate data access and browsing.
- Published
- 2019
18. Multi-platform discovery of haplotype-resolved structural variation in human genomes.
- Author
Chaisson, Mark JP, Sanders, Ashley D, Zhao, Xuefang, Malhotra, Ankit, Porubsky, David, Rausch, Tobias, Gardner, Eugene J, Rodriguez, Oscar L, Guo, Li, Collins, Ryan L, Fan, Xian, Wen, Jia, Handsaker, Robert E, Fairley, Susan, Kronenberg, Zev N, Kong, Xiangmeng, Hormozdiari, Fereydoun, Lee, Dillon, Wenger, Aaron M, Hastie, Alex R, Antaki, Danny, Anantharaman, Thomas, Audano, Peter A, Brand, Harrison, Cantsilieris, Stuart, Cao, Han, Cerveira, Eliza, Chen, Chong, Chen, Xintong, Chin, Chen-Shan, Chong, Zechen, Chuang, Nelson T, Lambert, Christine C, Church, Deanna M, Clarke, Laura, Farrell, Andrew, Flores, Joey, Galeev, Timur, Gorkin, David U, Gujral, Madhusudan, Guryev, Victor, Heaton, William Haynes, Korlach, Jonas, Kumar, Sushant, Kwon, Jee Young, Lam, Ernest T, Lee, Jong Eun, Lee, Joyce, Lee, Wan-Ping, Lee, Sau Peng, Li, Shantao, Marks, Patrick, Viaud-Martinez, Karine, Meiers, Sascha, Munson, Katherine M, Navarro, Fabio CP, Nelson, Bradley J, Nodzak, Conor, Noor, Amina, Kyriazopoulou-Panagiotopoulou, Sofia, Pang, Andy WC, Qiu, Yunjiang, Rosanio, Gabriel, Ryan, Mallory, Stütz, Adrian, Spierings, Diana CJ, Ward, Alistair, Welch, AnneMarie E, Xiao, Ming, Xu, Wei, Zhang, Chengsheng, Zhu, Qihui, Zheng-Bradley, Xiangqun, Lowy, Ernesto, Yakneen, Sergei, McCarroll, Steven, Jun, Goo, Ding, Li, Koh, Chong Lek, Ren, Bing, Flicek, Paul, Chen, Ken, Gerstein, Mark B, Kwok, Pui-Yan, Lansdorp, Peter M, Marth, Gabor T, Sebat, Jonathan, Shi, Xinghua, Bashir, Ali, Ye, Kai, Devine, Scott E, Talkowski, Michael E, Mills, Ryan E, Marschall, Tobias, Korbel, Jan O, Eichler, Evan E, and Lee, Charles
- Subjects
Humans, Chromosome Mapping, Genomics, Haplotypes, Genome, Human, Algorithms, Databases, Genetic, INDEL Mutation, Genomic Structural Variation, High-Throughput Nucleotide Sequencing, Whole Genome Sequencing, Genome, Human, Databases, Genetic
- Abstract
The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (
- Published
- 2019
19. exRNA Atlas Analysis Reveals Distinct Extracellular RNA Cargo Types and Their Carriers Present across Human Biofluids
- Author
Murillo, Oscar D, Thistlethwaite, William, Rozowsky, Joel, Subramanian, Sai Lakshmi, Lucero, Rocco, Shah, Neethu, Jackson, Andrew R, Srinivasan, Srimeenakshi, Chung, Allen, Laurent, Clara D, Kitchen, Robert R, Galeev, Timur, Warrell, Jonathan, Diao, James A, Welsh, Joshua A, Hanspers, Kristina, Riutta, Anders, Burgstaller-Muehlbacher, Sebastian, Shah, Ravi V, Yeri, Ashish, Jenkins, Lisa M, Ahsen, Mehmet E, Cordon-Cardo, Carlos, Dogra, Navneet, Gifford, Stacey M, Smith, Joshua T, Stolovitzky, Gustavo, Tewari, Ashutosh K, Wunsch, Benjamin H, Yadav, Kamlesh K, Danielson, Kirsty M, Filant, Justyna, Moeller, Courtney, Nejad, Parham, Paul, Anu, Simonson, Bridget, Wong, David K, Zhang, Xuan, Balaj, Leonora, Gandhi, Roopali, Sood, Anil K, Alexander, Roger P, Wang, Liang, Wu, Chunlei, Wong, David TW, Galas, David J, Van Keuren-Jensen, Kendall, Patel, Tushar, Jones, Jennifer C, Das, Saumya, Cheung, Kei-Hoi, Pico, Alexander R, Su, Andrew I, Raffai, Robert L, Laurent, Louise C, Roth, Matthew E, Gerstein, Mark B, and Milosavljevic, Aleksandar
- Subjects
Biological Sciences, Genetics, Human Genome, Biotechnology, Adult, Body Fluids, Cell Communication, Cell-Free Nucleic Acids, Circulating MicroRNA, Extracellular Vesicles, Female, Humans, Male, RNA, Reproducibility of Results, Sequence Analysis, RNA, Software, ERCC, deconvolution, exRNA, exosomes, extracellular RNA, extracellular vesicles, lipoproteins, ribonucleoproteins, Medical and Health Sciences, Developmental Biology, Biological sciences, Biomedical and clinical sciences
- Abstract
To develop a map of cell-cell communication mediated by extracellular RNA (exRNA), the NIH Extracellular RNA Communication Consortium created the exRNA Atlas resource (https://exrna-atlas.org). The Atlas version 4P1 hosts 5,309 exRNA-seq and exRNA qPCR profiles from 19 studies and a suite of analysis and visualization tools. To analyze variation between profiles, we apply computational deconvolution. The analysis leads to a model with six exRNA cargo types (CT1, CT2, CT3A, CT3B, CT3C, CT4), each detectable in multiple biofluids (serum, plasma, CSF, saliva, urine). Five of the cargo types associate with known vesicular and non-vesicular (lipoprotein and ribonucleoprotein) exRNA carriers. To validate utility of this model, we re-analyze an exercise response study by deconvolution to identify physiologically relevant response pathways that were not detected previously. To enable wide application of this model, as part of the exRNA Atlas resource, we provide tools for deconvolution and analysis of user-provided case-control studies.
- Published
- 2019
20. The Extracellular RNA Communication Consortium: Establishing Foundational Knowledge and Technologies for Extracellular RNA Research.
- Author
-
Das, Saumya, Extracellular RNA Communication Consortium, Ansel, K Mark, Bitzer, Markus, Breakefield, Xandra O, Charest, Alain, Galas, David J, Gerstein, Mark B, Gupta, Mihir, Milosavljevic, Aleksandar, McManus, Michael T, Patel, Tushar, Raffai, Robert L, Rozowsky, Joel, Roth, Matthew E, Saugstad, Julie A, Van Keuren-Jensen, Kendall, Weaver, Alissa M, and Laurent, Louise C
- Subjects
Extracellular RNA Communication Consortium ,Humans ,MicroRNAs ,RNA ,Knowledge Bases ,Biomarkers ,Extracellular Vesicles ,Cell-Free Nucleic Acids ,Genetics ,Developmental Biology ,Biological Sciences ,Medical and Health Sciences - Abstract
The Extracellular RNA Communication Consortium (ERCC) was launched to accelerate progress in the new field of extracellular RNA (exRNA) biology and to establish whether exRNAs and their carriers, including extracellular vesicles (EVs), can mediate intercellular communication and be utilized for clinical applications. Phase 1 of the ERCC focused on exRNA/EV biogenesis and function, discovery of exRNA biomarkers, development of exRNA/EV-based therapeutics, and construction of a robust set of reference exRNA profiles for a variety of biofluids. Here, we present progress by ERCC investigators in these areas, and we discuss collaborative projects directed at development of robust methods for EV/exRNA isolation and analysis and tools for sharing and computational analysis of exRNA profiling data.
- Published
- 2019
21. Dynamic quality control machinery that operates across compartmental borders mediates the degradation of mammalian nuclear membrane proteins
- Author
-
Tsai, Pei-Ling, Cameron, Christopher J.F., Forni, Maria Fernanda, Wasko, Renee R., Naughton, Brigitte S., Horsley, Valerie, Gerstein, Mark B., and Schlieker, Christian
- Published
- 2022
22. High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios
- Author
-
Eichler, Evan E., Korbel, Jan O., Lee, Charles, Marschall, Tobias, Devine, Scott E., Harvey, William T., Zhou, Weichen, Mills, Ryan E., Rausch, Tobias, Kumar, Sushant, Alkan, Can, Hormozdiari, Fereydoun, Chong, Zechen, Chen, Yu, Yang, Xiaofei, Lin, Jiadong, Gerstein, Mark B., Ye, Kai, Zhu, Qihui, Yilmaz, Feyza, Xiao, Chunlin, Byrska-Bishop, Marta, Evani, Uday S., Zhao, Xuefang, Basile, Anna O., Abel, Haley J., Regier, Allison A., Corvelo, André, Clarke, Wayne E., Musunuri, Rajeeva, Nagulapalli, Kshithija, Fairley, Susan, Runnels, Alexi, Winterkorn, Lara, Lowy, Ernesto, Flicek, Paul, Germer, Soren, Brand, Harrison, Hall, Ira M., Talkowski, Michael E., Narzisi, Giuseppe, and Zody, Michael C.
- Published
- 2022
23. Improved Prediction of Ligand–Protein Binding Affinities by Meta-modeling.
- Author
-
Lee, Ho-Joon, Emani, Prashant S., and Gerstein, Mark B.
- Published
- 2024
24. Functional genomics data: privacy risk assessment and technological mitigation
- Author
-
Gürsoy, Gamze, Li, Tianxiao, Liu, Susanna, Ni, Eric, Brannon, Charlotte M., and Gerstein, Mark B.
- Published
- 2022
25. Comprehensive functional genomic resource and integrative model for the human brain
- Author
-
Wang, Daifeng, Liu, Shuang, Warrell, Jonathan, Won, Hyejung, Shi, Xu, Navarro, Fabio CP, Clarke, Declan, Gu, Mengting, Emani, Prashant, Yang, Yucheng T, Xu, Min, Gandal, Michael J, Lou, Shaoke, Zhang, Jing, Park, Jonathan J, Yan, Chengfei, Rhie, Suhn Kyong, Manakongtreecheep, Kasidet, Zhou, Holly, Nathan, Aparna, Peters, Mette, Mattei, Eugenio, Fitzgerald, Dominic, Brunetti, Tonya, Moore, Jill, Jiang, Yan, Girdhar, Kiran, Hoffman, Gabriel E, Kalayci, Selim, Gümüş, Zeynep H, Crawford, Gregory E, Roussos, Panos, Akbarian, Schahram, Jaffe, Andrew E, White, Kevin P, Weng, Zhiping, Sestan, Nenad, Geschwind, Daniel H, Knowles, James A, Gerstein, Mark B, Ashley-Koch, Allison E, Garrett, Melanie E, Song, Lingyun, Safi, Alexias, Johnson, Graham D, Wray, Gregory A, Reddy, Timothy E, Goes, Fernando S, Zandi, Peter, Bryois, Julien, Price, Amanda J, Ivanov, Nikolay A, Collado-Torres, Leonardo, Hyde, Thomas M, Burke, Emily E, Kleinman, Joel E, Tao, Ran, Shin, Joo Heon, Kundakovic, Marija, Brown, Leanne, Kassim, Bibi S, Park, Royce B, Wiseman, Jennifer R, Zharovsky, Elizabeth, Jacobov, Rivka, Devillers, Olivia, Flatow, Elie, Lipska, Barbara K, Lewis, David A, Haroutunian, Vahram, Hahn, Chang-Gyu, Charney, Alexander W, Dracheva, Stella, Kozlenkov, Alexey, Belmont, Judson, DelValle, Diane, Francoeur, Nancy, Hadjimichael, Evi, Pinto, Dalila, van Bakel, Harm, Fullard, John F, Bendl, Jaroslav, Hauberg, Mads E, Mangravite, Lara M, Peters, Mette A, Chae, Yooree, Peng, Junmin, Niu, Mingming, Wang, Xusheng, Webster, Maree J, Beach, Thomas G, Chen, Chao, and Jiang, Yi
- Subjects
Human Genome ,Biotechnology ,Schizophrenia ,Mental Health ,Brain Disorders ,Genetics ,Neurosciences ,1.1 Normal biological development and functioning ,2.1 Biological and endogenous factors ,Underpinning research ,Aetiology ,Mental health ,Brain ,Datasets as Topic ,Deep Learning ,Enhancer Elements ,Genetic ,Epigenesis ,Genetic ,Epigenomics ,Gene Expression Regulation ,Gene Regulatory Networks ,Genome-Wide Association Study ,Humans ,Mental Disorders ,Quantitative Trait Loci ,Single-Cell Analysis ,Transcriptome ,PsychENCODE Consortium ,General Science & Technology - Abstract
Despite progress in defining genetic risk for psychiatric disorders, their molecular mechanisms remain elusive. Addressing this, the PsychENCODE Consortium has generated a comprehensive online resource for the adult brain across 1866 individuals. The PsychENCODE resource contains ~79,000 brain-active enhancers, sets of Hi-C linkages, and topologically associating domains; single-cell expression profiles for many cell types; expression quantitative-trait loci (QTLs); and further QTLs associated with chromatin, splicing, and cell-type proportions. Integration shows that varying cell-type proportions largely account for the cross-population variation in expression (with >88% reconstruction accuracy). It also allows building of a gene regulatory network, linking genome-wide association study variants to genes (e.g., 321 for schizophrenia). We embed this network into an interpretable deep-learning model, which improves disease prediction by ~6-fold versus polygenic risk scores and identifies key genes and pathways in psychiatric disorders.
- Published
- 2018
26. Integrative functional genomic analysis of human brain development and neuropsychiatric risks
- Author
-
Li, Mingfeng, Santpere, Gabriel, Imamura Kawasawa, Yuka, Evgrafov, Oleg V, Gulden, Forrest O, Pochareddy, Sirisha, Sunkin, Susan M, Li, Zhen, Shin, Yurae, Zhu, Ying, Sousa, André MM, Werling, Donna M, Kitchen, Robert R, Kang, Hyo Jung, Pletikos, Mihovil, Choi, Jinmyung, Muchnik, Sydney, Xu, Xuming, Wang, Daifeng, Lorente-Galdos, Belen, Liu, Shuang, Giusti-Rodríguez, Paola, Won, Hyejung, de Leeuw, Christiaan A, Pardiñas, Antonio F, Hu, Ming, Jin, Fulai, Li, Yun, Owen, Michael J, O’Donovan, Michael C, Walters, James TR, Posthuma, Danielle, Reimers, Mark A, Levitt, Pat, Weinberger, Daniel R, Hyde, Thomas M, Kleinman, Joel E, Geschwind, Daniel H, Hawrylycz, Michael J, State, Matthew W, Sanders, Stephan J, Sullivan, Patrick F, Gerstein, Mark B, Lein, Ed S, Knowles, James A, Sestan, Nenad, Willsey, A Jeremy, Oldre, Aaron, Szafer, Aaron, Camarena, Adrian, Cherskov, Adriana, Charney, Alexander W, Abyzov, Alexej, Kozlenkov, Alexey, Safi, Alexias, Jones, Allan R, Ashley-Koch, Allison E, Ebbert, Amanda, Price, Amanda J, Sekijima, Amanda, Kefi, Amira, Bernard, Amy, Amiri, Anahita, Sboner, Andrea, Clark, Andrew, Jaffe, Andrew E, Tebbenkamp, Andrew TN, Sodt, Andy J, Guillozet-Bongaarts, Angie L, Nairn, Angus C, Carey, Anita, Huttner, Anita, Chervenak, Ann, Szekely, Anna, Shieh, Annie W, Harmanci, Arif, Lipska, Barbara K, Carlyle, Becky C, Gregor, Ben W, Kassim, Bibi S, Sheppard, Brooke, Bichsel, Candace, Hahn, Chang-Gyu, Lee, Chang-Kyu, Chen, Chao, Kuan, Chihchau L, Dang, Chinh, Lau, Chris, Cuhaciyan, Christine, Armoskus, Christoper, Mason, Christopher E, Liu, Chunyu, Slaughterbeck, Cliff R, Bennet, Crissa, Pinto, Dalila, Polioudakis, Damon, Franjic, Daniel, Miller, Daniel J, Bertagnolli, Darren, and Lewis, David A
- Subjects
Human Genome ,Genetics ,Neurosciences ,Biotechnology ,Pediatric ,Mental Health ,Brain ,Epigenesis ,Genetic ,Epigenomics ,Gene Expression Regulation ,Developmental ,Gene Regulatory Networks ,Humans ,Mental Disorders ,Nervous System Diseases ,Neurogenesis ,Single-Cell Analysis ,Transcriptome ,BrainSpan Consortium ,PsychENCODE Consortium ,PsychENCODE Developmental Subgroup ,General Science & Technology - Abstract
To broaden our understanding of human neurodevelopment, we profiled transcriptomic and epigenomic landscapes across brain regions and/or cell types for the entire span of prenatal and postnatal development. Integrative analysis revealed temporal, regional, sex, and cell type-specific dynamics. We observed a global transcriptomic cup-shaped pattern, characterized by a late fetal transition associated with sharply decreased regional differences and changes in cellular composition and maturation, followed by a reversal in childhood-adolescence, and accompanied by epigenomic reorganizations. Analysis of gene coexpression modules revealed relationships with epigenomic regulation and neurodevelopmental processes. Genes with genetic associations to brain-based traits and neuropsychiatric disorders (including MEF2C, SATB2, SOX5, TCF4, and TSHZ3) converged in a small number of modules and distinct cell types, revealing insights into neurodevelopment and the genomic basis of neuropsychiatric risks.
- Published
- 2018
27. Mako: A Graph-based Pattern Growth Approach to Detect Complex Structural Variants
- Author
-
Gerstein, Mark B., Sanders, Ashley D., Zody, Michael C., Talkowski, Michael E., Mills, Ryan E., Korbel, Jan O., Marschall, Tobias, Ebert, Peter, Audano, Peter A., Rodriguez-Martin, Bernardo, Porubsky, David, Jan Bonder, Marc, Sulovari, Arvis, Ebler, Jana, Zhou, Weichen, Serra Mari, Rebecca, Yilmaz, Feyza, Zhao, Xuefang, Hsieh, PingHsun, Lee, Joyce, Kumar, Sushant, Rausch, Tobias, Chen, Yu, Chong, Zechen, Munson, Katherine M., Chaisson, Mark J.P., Chen, Junjie, Shi, Xinghua, Wenger, Aaron M., Harvey, William T., Hansenfeld, Patrick, Regier, Allison, Hall, Ira M., Flicek, Paul, Hastie, Alex R., Fairley, Susan, Lin, Jiadong, Yang, Xiaofei, Kosters, Walter, Xu, Tun, Jia, Yanyan, Wang, Songbo, Zhu, Qihui, Ryan, Mallory, Guo, Li, Zhang, Chengsheng, Lee, Charles, Devine, Scott E., Eichler, Evan E., and Ye, Kai
- Published
- 2022
28. Nodal modulator (NOMO) is required to sustain endoplasmic reticulum morphology
- Author
-
Amaya, Catherine, Cameron, Christopher J.F., Devarkar, Swapnil C., Seager, Sebastian J.H., Gerstein, Mark B., Xiong, Yong, and Schlieker, Christian
- Published
- 2021
29. Molecular and cellular reorganization of neural circuits in the human lineage
- Author
-
Sousa, André MM, Zhu, Ying, Raghanti, Mary Ann, Kitchen, Robert R, Onorati, Marco, Tebbenkamp, Andrew TN, Stutz, Bernardo, Meyer, Kyle A, Li, Mingfeng, Kawasawa, Yuka Imamura, Liu, Fuchen, Perez, Raquel Garcia, Mele, Marta, Carvalho, Tiago, Skarica, Mario, Gulden, Forrest O, Pletikos, Mihovil, Shibata, Akemi, Stephenson, Alexa R, Edler, Melissa K, Ely, John J, Elsworth, John D, Horvath, Tamas L, Hof, Patrick R, Hyde, Thomas M, Kleinman, Joel E, Weinberger, Daniel R, Reimers, Mark, Lifton, Richard P, Mane, Shrikant M, Noonan, James P, State, Matthew W, Lein, Ed S, Knowles, James A, Marques-Bonet, Tomas, Sherwood, Chet C, Gerstein, Mark B, and Sestan, Nenad
- Subjects
Biological Sciences ,Bioinformatics and Computational Biology ,Genetics ,Neurosciences ,Human Genome ,1.1 Normal biological development and functioning ,Neurological ,Animals ,Gene Expression Profiling ,Humans ,Interneurons ,Macaca ,Neocortex ,Neural Pathways ,Pan troglodytes ,Phylogeny ,Species Specificity ,Transcriptome ,General Science & Technology - Abstract
To better understand the molecular and cellular differences in brain organization between human and nonhuman primates, we performed transcriptome sequencing of 16 regions of adult human, chimpanzee, and macaque brains. Integration with human single-cell transcriptomic data revealed global, regional, and cell-type-specific species expression differences in genes representing distinct functional categories. We validated and further characterized the human specificity of genes enriched in distinct cell types through histological and functional analyses, including rare subpallial-derived interneurons expressing dopamine biosynthesis genes enriched in the human striatum and absent in the nonhuman African ape neocortex. Our integrated analysis of the generated data revealed diverse molecular and cellular features of the phylogenetic reorganization of the human brain across multiple levels, with relevance for brain function and disease.
- Published
- 2017
30. Establishing a Global Standard for Wearable Devices in Sport and Exercise Medicine: Perspectives from Academic and Industry Stakeholders
- Author
-
Ash, Garrett I., Stults-Kolehmainen, Matthew, Busa, Michael A., Gaffey, Allison E., Angeloudis, Konstantinos, Muniz-Pardos, Borja, Gregory, Robert, Huggins, Robert A., Redeker, Nancy S., Weinzimer, Stuart A., Grieco, Lauren A., Lyden, Kate, Megally, Esmeralda, Vogiatzis, Ioannis, Scher, LaurieAnn, Zhu, Xinxin, Baker, Julien S., Brandt, Cynthia, Businelle, Michael S., Fucito, Lisa M., Griggs, Stephanie, Jarrin, Robert, Mortazavi, Bobak J., Prioleau, Temiloluwa, Roberts, Walter, Spanakis, Elias K., Nally, Laura M., Debruyne, Andre, Bachl, Norbert, Pigozzi, Fabio, Halabchi, Farzin, Ramagole, Dimakatso A., Janse van Rensburg, Dina C., Wolfarth, Bernd, Fossati, Chiara, Rozenstoka, Sandra, Tanisawa, Kumpei, Börjesson, Mats, Casajus, José Antonio, Gonzalez-Aguero, Alex, Zelenkova, Irina, Swart, Jeroen, Gursoy, Gamze, Meyerson, William, Liu, Jason, Greenbaum, Dov, Pitsiladis, Yannis P., and Gerstein, Mark B.
- Published
- 2021
31. Impact and characterization of serial structural variations across humans and great apes.
- Author
-
Höps, Wolfram, Rausch, Tobias, Jendrusch, Michael, Human Genome Structural Variation Consortium (HGSVC), Ashraf, Hufsah, Audano, Peter A., Austine, Ola, Basile, Anna O., Beck, Christine R., Jan Bonder, Marc, Byrska-Bishop, Marta, Chaisson, Mark J. P., Chong, Zechen, Corvelo, André, Devine, Scott E., Ebert, Peter, Ebler, Jana, Eichler, Evan E., Gerstein, Mark B., and Hallast, Pille
- Subjects
GENE rearrangement ,HOMOLOGOUS recombination ,HOMINIDS ,GENOMES ,ALLELES - Abstract
Modern sequencing technology enables the systematic detection of complex structural variation (SV) across genomes. However, extensive DNA rearrangements arising through a series of mutations, a phenomenon we refer to as serial SV (sSV), remain underexplored, posing a challenge for SV discovery. Here, we present NAHRwhals (https://github.com/WHops/NAHRwhals), a method to infer repeat-mediated series of SVs in long-read genomic assemblies. Applying NAHRwhals to haplotype-resolved human genomes from 28 individuals reveals 37 sSV loci of various length and complexity. These sSVs explain otherwise cryptic variation in medically relevant regions such as the TPSAB1 gene, 8p23.1, 22q11 and Sotos syndrome regions. Comparisons with great ape assemblies indicate that most human sSVs formed recently, after the human-ape split, and involved non-repeat-mediated processes in addition to non-allelic homologous recombination. NAHRwhals reliably discovers and characterizes sSVs at scale and independent of species, uncovering their genomic abundance and suggesting broader implications for disease. Structural variants (SV) can accumulate in repeat-rich parts of the genome and transform them in unexpected ways. Here, with their new assembly-based genotyper (NAHRwhals), the authors verify multi-step SVs in 37 human loci and identify alleles at risk for copy-number variation.
- Published
- 2024
32. Dermal Adipocyte Lipolysis and Myofibroblast Conversion Are Required for Efficient Skin Repair
- Author
-
Shook, Brett A., Wasko, Renee R., Mano, Omer, Rutenberg-Schoenberg, Michael, Rudolph, Michael C., Zirak, Bahar, Rivera-Gonzalez, Guillermo C., López-Giráldez, Francesc, Zarini, Simona, Rezza, Amélie, Clark, Damon A., Rendl, Michael, Rosenblum, Michael D., Gerstein, Mark B., and Horsley, Valerie
- Published
- 2020
33. Passenger Mutations in More Than 2,500 Cancer Genomes: Overall Molecular Functional Impact and Consequences
- Author
-
Kumar, Sushant, Warrell, Jonathan, Li, Shantao, McGillivray, Patrick D., Meyerson, William, Salichos, Leonidas, Harmanci, Arif, Martinez-Fundichely, Alexander, Chan, Calvin W.Y., Nielsen, Morten Muhlig, Lochovsky, Lucas, Zhang, Yan, Li, Xiaotong, Lou, Shaoke, Pedersen, Jakob Skou, Herrmann, Carl, Getz, Gad, Khurana, Ekta, and Gerstein, Mark B.
- Published
- 2020
34. Extending gene ontology in the context of extracellular RNA and vesicle communication
- Author
-
Cheung, Kei-Hoi, Keerthikumar, Shivakumar, Roncaglia, Paola, Subramanian, Sai Lakshmi, Roth, Matthew E, Samuel, Monisha, Anand, Sushma, Gangoda, Lahiru, Gould, Stephen, Alexander, Roger, Galas, David, Gerstein, Mark B, Hill, Andrew F, Kitchen, Robert R, Lötvall, Jan, Patel, Tushar, Procaccini, Dena C, Quesenberry, Peter, Rozowsky, Joel, Raffai, Robert L, Shypitsyna, Aleksandra, Su, Andrew I, Théry, Clotilde, Vickers, Kasey, Wauben, Marca HM, Mathivanan, Suresh, Milosavljevic, Aleksandar, and Laurent, Louise C
- Subjects
Information and Computing Sciences ,Genetics ,Databases ,Genetic ,Extracellular Vesicles ,Gene Ontology ,Humans ,Molecular Sequence Annotation ,RNA ,Web Browser ,Ontology ,Extracellular RNA ,Extracellular vesicle ,Metadata ,Faceted search ,Atlas ,Other Biological Sciences ,Artificial Intelligence and Image Processing ,Information Systems ,Information and computing sciences - Abstract
Background: To address the lack of standard terminology to describe extracellular RNA (exRNA) data/metadata, we have launched an inter-community effort to extend the Gene Ontology (GO) with subcellular structure concepts relevant to the exRNA domain. By extending GO in this manner, the exRNA data/metadata will be more easily annotated and queried because it will be based on a shared set of terms and relationships relevant to extracellular research. Methods: By following a consensus-building process, we have worked with several academic societies/consortia, including ERCC, ISEV, and ASEMV, to identify and approve a set of exRNA and extracellular vesicle-related terms and relationships that have been incorporated into GO. In addition, we have initiated an ongoing process of extractions of gene product annotations associated with these terms from Vesiclepedia and ExoCarta, conversion of the extracted annotations to Gene Association File (GAF) format for batch submission to GO, and curation of the submitted annotations by the GO Consortium. As a use case, we have incorporated some of the GO terms into annotations of samples from the exRNA Atlas and implemented a faceted search interface based on such annotations. Results: We have added 7 new terms and modified 9 existing terms (along with their synonyms and relationships) to GO. Additionally, 18,695 unique coding gene products (mRNAs and proteins) and 963 unique non-coding gene products (ncRNAs) which are associated with the terms: "extracellular vesicle", "extracellular exosome", "apoptotic body", and "microvesicle" were extracted from ExoCarta and Vesiclepedia. These annotations are currently being processed for submission to GO. Conclusions: As an inter-community effort, we have made a substantial update to GO in the exRNA context. We have also demonstrated the utility of some of the new GO terms for sample annotation and metadata search.
- Published
- 2016
35. Leveraging protein dynamics to identify cancer mutational hotspots using 3D structures
- Author
-
Kumar, Sushant, Clarke, Declan, and Gerstein, Mark B.
- Published
- 2019
36. The PsychENCODE project
- Author
-
Akbarian, Schahram, Liu, Chunyu, Knowles, James A, Vaccarino, Flora M, Farnham, Peggy J, Crawford, Gregory E, Jaffe, Andrew E, Pinto, Dalila, Dracheva, Stella, Geschwind, Daniel H, Mill, Jonathan, Nairn, Angus C, Abyzov, Alexej, Pochareddy, Sirisha, Prabhakar, Shyam, Weissman, Sherman, Sullivan, Patrick F, State, Matthew W, Weng, Zhiping, Peters, Mette A, White, Kevin P, Gerstein, Mark B, Amiri, Anahita, Armoskus, Chris, Ashley-Koch, Allison E, Bae, Taejeong, Beckel-Mitchener, Andrea, Berman, Benjamin P, Coetzee, Gerhard A, Coppola, Gianfilippo, Francoeur, Nancy, Fromer, Menachem, Gao, Robert, Grennan, Kay, Herstein, Jennifer, Kavanagh, David H, Ivanov, Nikolay A, Jiang, Yan, Kitchen, Robert R, Kozlenkov, Alexey, Kundakovic, Marija, Li, Mingfeng, Li, Zhen, Liu, Shuang, Mangravite, Lara M, Mattei, Eugenio, Markenscoff-Papadimitriou, Eirene, Navarro, Fábio CP, North, Nicole, Omberg, Larsson, Panchision, David, Parikshak, Neelroop, Poschmann, Jeremie, Price, Amanda J, Purcaro, Michael, Reddy, Timothy E, Roussos, Panos, Schreiner, Shannon, Scuderi, Soraya, Sebra, Robert, Shibata, Mikihito, Shieh, Annie W, Skarica, Mario, Sun, Wenjie, Swarup, Vivek, Thomas, Amber, Tsuji, Junko, van Bakel, Harm, Wang, Daifeng, Wang, Yongjun, Wang, Kai, Werling, Donna M, Willsey, A Jeremy, Witt, Heather, Won, Hyejung, Wong, Chloe CY, Wray, Gregory A, Wu, Emily Y, Xu, Xuming, Yao, Lijing, Senthil, Geetha, Lehner, Thomas, Sklar, Pamela, and Sestan, Nenad
- Subjects
Biotechnology ,Human Genome ,Brain Disorders ,Genetics ,Mental Health ,Serious Mental Illness ,Neurosciences ,Schizophrenia ,Intellectual and Developmental Disabilities (IDD) ,Autism ,2.1 Biological and endogenous factors ,1.1 Normal biological development and functioning ,Underpinning research ,Aetiology ,Mental health ,Animals ,Brain ,Chromosome Mapping ,Epigenesis ,Genetic ,Genetic Code ,Humans ,Mental Disorders ,Transcriptome ,PsychENCODE Consortium ,Psychology ,Cognitive Sciences ,Neurology & Neurosurgery - Abstract
Recent research on disparate psychiatric disorders has implicated rare variants in genes involved in global gene regulation and chromatin modification, as well as many common variants located primarily in regulatory regions of the genome. Understanding precisely how these variants contribute to disease will require a deeper appreciation for the mechanisms of gene regulation in the developing and adult human brain. The PsychENCODE project aims to produce a public resource of multidimensional genomic data using tissue- and cell type–specific samples from approximately 1,000 phenotypically well-characterized, high-quality healthy and disease-affected human post-mortem brains, as well as functionally characterize disease-associated regulatory elements and variants in model systems. We are beginning with a focus on autism spectrum disorder, bipolar disorder and schizophrenia, and expect that this knowledge will apply to a wide variety of psychiatric disorders. This paper outlines the motivation and design of PsychENCODE.
- Published
- 2015
37. The Molecular Taxonomy of Primary Prostate Cancer
- Author
-
The Cancer Genome Atlas Research Network, Abeshouse, Adam, Ahn, Jaeil, Akbani, Rehan, Ally, Adrian, Amin, Samirkumar, Andry, Christopher D, Annala, Matti, Aprikian, Armen, Armenia, Joshua, Arora, Arshi, Auman, J Todd, Balasundaram, Miruna, Balu, Saianand, Barbieri, Christopher E, Bauer, Thomas, Benz, Christopher C, Bergeron, Alain, Beroukhim, Rameen, Berrios, Mario, Bivol, Adrian, Bodenheimer, Tom, Boice, Lori, Bootwalla, Moiz S, dos Reis, Rodolfo Borges, Boutros, Paul C, Bowen, Jay, Bowlby, Reanne, Boyd, Jeffrey, Bradley, Robert K, Breggia, Anne, Brimo, Fadi, Bristow, Christopher A, Brooks, Denise, Broom, Bradley M, Bryce, Alan H, Bubley, Glenn, Burks, Eric, Butterfield, Yaron SN, Button, Michael, Canes, David, Carlotti, Carlos G, Carlsen, Rebecca, Carmel, Michel, Carroll, Peter R, Carter, Scott L, Cartun, Richard, Carver, Brett S, Chan, June M, Chang, Matthew T, Chen, Yu, Cherniack, Andrew D, Chevalier, Simone, Chin, Lynda, Cho, Juok, Chu, Andy, Chuah, Eric, Chudamani, Sudha, Cibulskis, Kristian, Ciriello, Giovanni, Clarke, Amanda, Cooperberg, Matthew R, Corcoran, Niall M, Costello, Anthony J, Cowan, Janet, Crain, Daniel, Curley, Erin, David, Kerstin, Demchok, John A, Demichelis, Francesca, Dhalla, Noreen, Dhir, Rajiv, Doueik, Alexandre, Drake, Bettina, Dvinge, Heidi, Dyakova, Natalya, Felau, Ina, Ferguson, Martin L, Frazer, Scott, Freedland, Stephen, Fu, Yao, Gabriel, Stacey B, Gao, Jianjiong, Gardner, Johanna, Gastier-Foster, Julie M, Gehlenborg, Nils, Gerken, Mark, Gerstein, Mark B, Getz, Gad, Godwin, Andrew K, Gopalan, Anuradha, Graefen, Markus, Graim, Kiley, Gribbin, Thomas, Guin, Ranabir, Gupta, Manaswi, Hadjipanayis, Angela, Haider, Syed, Hamel, Lucie, and Hayes, D Neil
- Subjects
Biomedical and Clinical Sciences ,Clinical Sciences ,Oncology and Carcinogenesis ,Cancer ,Cancer Genomics ,Human Genome ,Aging ,Prostate Cancer ,Urologic Diseases ,Genetics ,2.1 Biological and endogenous factors ,Good Health and Well Being ,DNA Repair ,Epigenesis ,Genetic ,Gene Expression Regulation ,Neoplastic ,Gene Fusion ,Humans ,Male ,Mutation ,Neoplasm Metastasis ,Phosphatidylinositol 3-Kinases ,Prostatic Neoplasms ,Receptors ,Androgen ,Signal Transduction ,ras Proteins ,Cancer Genome Atlas Research Network ,Biological Sciences ,Medical and Health Sciences ,Developmental Biology ,Biological sciences ,Biomedical and clinical sciences - Abstract
There is substantial heterogeneity among primary prostate cancers, evident in the spectrum of molecular abnormalities and its variable clinical course. As part of The Cancer Genome Atlas (TCGA), we present a comprehensive molecular analysis of 333 primary prostate carcinomas. Our results revealed a molecular taxonomy in which 74% of these tumors fell into one of seven subtypes defined by specific gene fusions (ERG, ETV1/4, and FLI1) or mutations (SPOP, FOXA1, and IDH1). Epigenetic profiles showed substantial heterogeneity, including an IDH1 mutant subset with a methylator phenotype. Androgen receptor (AR) activity varied widely and in a subtype-specific manner, with SPOP and FOXA1 mutant tumors having the highest levels of AR-induced transcripts. 25% of the prostate cancers had a presumed actionable lesion in the PI3K or MAPK signaling pathways, and DNA repair genes were inactivated in 19%. Our analysis reveals molecular heterogeneity among primary prostate cancers, as well as potentially actionable molecular defects.
- Published
- 2015
38. An integrated map of structural variation in 2,504 human genomes.
- Author
-
Sudmant, Peter H, Rausch, Tobias, Gardner, Eugene J, Handsaker, Robert E, Abyzov, Alexej, Huddleston, John, Zhang, Yan, Ye, Kai, Jun, Goo, Fritz, Markus Hsi-Yang, Konkel, Miriam K, Malhotra, Ankit, Stütz, Adrian M, Shi, Xinghua, Casale, Francesco Paolo, Chen, Jieming, Hormozdiari, Fereydoun, Dayama, Gargi, Chen, Ken, Malig, Maika, Chaisson, Mark JP, Walter, Klaudia, Meiers, Sascha, Kashin, Seva, Garrison, Erik, Auton, Adam, Lam, Hugo YK, Mu, Xinmeng Jasmine, Alkan, Can, Antaki, Danny, Bae, Taejeong, Cerveira, Eliza, Chines, Peter, Chong, Zechen, Clarke, Laura, Dal, Elif, Ding, Li, Emery, Sarah, Fan, Xian, Gujral, Madhusudan, Kahveci, Fatma, Kidd, Jeffrey M, Kong, Yu, Lameijer, Eric-Wubbo, McCarthy, Shane, Flicek, Paul, Gibbs, Richard A, Marth, Gabor, Mason, Christopher E, Menelaou, Androniki, Muzny, Donna M, Nelson, Bradley J, Noor, Amina, Parrish, Nicholas F, Pendleton, Matthew, Quitadamo, Andrew, Raeder, Benjamin, Schadt, Eric E, Romanovitch, Mallory, Schlattl, Andreas, Sebra, Robert, Shabalin, Andrey A, Untergasser, Andreas, Walker, Jerilyn A, Wang, Min, Yu, Fuli, Zhang, Chengsheng, Zhang, Jing, Zheng-Bradley, Xiangqun, Zhou, Wanding, Zichner, Thomas, Sebat, Jonathan, Batzer, Mark A, McCarroll, Steven A, 1000 Genomes Project Consortium, Mills, Ryan E, Gerstein, Mark B, Bashir, Ali, Stegle, Oliver, Devine, Scott E, Lee, Charles, Eichler, Evan E, and Korbel, Jan O
- Subjects
Genomes Project Consortium ,Humans ,Genetic Predisposition to Disease ,Physical Chromosome Mapping ,Sequence Analysis ,DNA ,Genetics ,Medical ,Genetics ,Population ,Genomics ,Sequence Deletion ,Amino Acid Sequence ,Genotype ,Haplotypes ,Homozygote ,Polymorphism ,Single Nucleotide ,Quantitative Trait Loci ,Genome ,Human ,Molecular Sequence Data ,Genetic Variation ,Genome-Wide Association Study ,Mutation Rate ,Sequence Analysis ,DNA ,Genetics ,Medical ,Population ,Polymorphism ,Single Nucleotide ,Genome ,Human ,General Science & Technology - Abstract
Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest the dispensability of a variety of human genes. We demonstrate that structural variants are enriched on haplotypes identified by genome-wide association studies and exhibit enrichment for expression quantitative trait loci. Additionally, we uncover appreciable levels of structural variant complexity at different scales, including genic loci subject to clusters of repeated rearrangement and complex structural variants with multiple breakpoints likely to have formed through individual mutational events. Our catalogue will enhance future studies into structural variant demography, functional impact and disease association.
- Published
- 2015
39. Network propagation-based prioritization of long tail genes in 17 cancer types
- Author
-
Mohsen, Hussein, Gunasekharan, Vignesh, Qing, Tao, Seay, Montrell, Surovtseva, Yulia, Negahban, Sahand, Szallasi, Zoltan, Pusztai, Lajos, and Gerstein, Mark B.
- Published
- 2021
40. The Extracellular RNA Communication Consortium: Establishing Foundational Knowledge and Technologies for Extracellular RNA Research
- Author
-
Abdel-Mageed, Asim B., Adamidi, Catherine, Adelson, P. David, Akat, Kemal M., Alsop, Eric, Ansel, K. Mark, Arango, Jorge, Aronin, Neil, Avsaroglu, Seda Kilinc, Azizian, Azadeh, Balaj, Leonora, Ben-Dov, Iddo Z., Bertram, Karl, Bitzer, Markus, Blelloch, Robert, Bogardus, Kimberly A., Breakefield, Xandra Owens, Calin, George A., Carter, Bob S., Charest, Al, Chen, Clark C., Chitnis, Tanuja, Coffey, Robert J., Courtright-Lim, Amanda, Das, Saumya, Datta, Amrita, DeHoff, Peter, Diacovo, Thomas G., Erle, David J., Etheridge, Alton, Ferrer, Marc, Franklin, Jeffrey L., Freedman, Jane E., Galas, David J., Galeev, Timur, Gandhi, Roopali, Garcia, Aitor, Gerstein, Mark Bender, Ghai, Vikas, Ghiran, Ionita Calin, Giraldez, Maria D., Goga, Andrei, Gogakos, Tasos, Goilav, Beatrice, Gould, Stephen J., Guo, Peixuan, Gupta, Mihir, Hochberg, Fred, Huang, Bo, Huentelman, Matt, Hunter, Craig, Hutchins, Elizabeth, Jackson, Andrew R., Kalani, M. Yashar S., Kanlikilicer, Pinar, Karaszti, Reka Agnes, Van Keuren-Jensen, Kendall, Khvorova, Anastasia, Kim, Yong, Kim, Hogyoung, Kim, Taek Kyun, Kitchen, Robert, Kraig, Richard P., Krichevsky, Anna M., Kwong, Raymond Y., Laurent, Louise C., Lee, Minyoung, L’Etoile, Noelle, Levy, Shawn E., Li, Feng, Li, Jenny, Li, Xin, Lopez-Berestein, Gabriel, Lucero, Rocco, Mateescu, Bogdan, Matin, A.C., Max, Klaas E.A., McManus, Michael T., Mempel, Thorsten R., Meyer, Cindy, Milosavljevic, Aleksandar, Mondal, Debasis, Mukamal, Kenneth Jay, Murillo, Oscar D., Muthukumar, Thangamani, Nickerson, Deborah A., O’Donnell, Christopher J., Patel, Dinshaw J., Patel, Tushar, Patton, James G., Paul, Anu, Peskind, Elaine R., Phelps, Mitch A., Putterman, Chaim, Quesenberry, Peter J., Quinn, Joseph F., Raffai, Robert L., Ranabothu, Saritha, Rao, Shannon Jiang, Rodriguez-Aguayo, Cristian, Rosenzweig, Anthony, Roth, Matthew E., Rozowsky, Joel, Sabatine, Marc S., Sakhanenko, Nikita A., Saugstad, Julie Anne, Schmittgen, Thomas D., Shah, Neethu, Shah, Ravi, Shedden, Kerby, Shi, Jian, Sood, Anil K., Sopeyin, Anuoluwapo, Spengler, Ryan M., Spetzler, Robert, Srinivasan, Srimeenakshi, Subramanian, Sai Lakshmi, Suthanthiran, Manikkam, Tanriverdi, Kahraman, Teng, Yun, Tewari, Muneesh, Thistlethwaite, William, Tuschl, Thomas, Urbanowicz, Karolina Kaczor, Vickers, Kasey C., Voinnet, Olivier, Wang, Kai, Weaver, Alissa M., Wei, Zhiyun, Weiner, Howard L., Weiss, Zachary R., Williams, Zev, Wong, David T.W., Woodruff, Prescott G., Xiao, Xinshu, Yan, Irene K., Yeri, Ashish, Zhang, Bing, Zhang, Huang-Ge, Breakefield, Xandra O., Charest, Alain, Gerstein, Mark B., and Saugstad, Julie A.
- Published
- 2019
41. Integration of extracellular RNA profiling data using metadata, biomedical ontologies and Linked Data technologies.
- Author
-
Subramanian, Sai Lakshmi, Kitchen, Robert R, Alexander, Roger, Carter, Bob S, Cheung, Kei-Hoi, Laurent, Louise C, Pico, Alexander, Roberts, Lewis R, Roth, Matthew E, Rozowsky, Joel S, Su, Andrew I, Gerstein, Mark B, and Milosavljevic, Aleksandar
- Subjects
DMRR, ERC Consortium, exRNA, exRNA Atlas, exRNA Portal, Biochemistry and Cell Biology
- Abstract
The large diversity and volume of extracellular RNA (exRNA) data that will form the basis of the exRNA Atlas generated by the Extracellular RNA Communication Consortium pose a substantial data integration challenge. We here present the strategy that is being implemented by the exRNA Data Management and Resource Repository, which employs metadata, biomedical ontologies and Linked Data technologies, such as Resource Description Framework to integrate a diverse set of exRNA profiles into an exRNA Atlas and enable integrative exRNA analysis. We focus on the following three specific data integration tasks: (a) selection of samples from a virtual biorepository for exRNA profiling and for inclusion in the exRNA Atlas; (b) retrieval of a data slice from the exRNA Atlas for integrative analysis and (c) interpretation of exRNA analysis results in the context of pathways and networks. As exRNA profiling gains wide adoption in the research community, we anticipate that the strategies discussed here will increasingly be required to enable data reuse and to facilitate integrative analysis of exRNA data.
- Published
- 2015
42. Comparative analysis of the transcriptome across distant species.
- Author
-
Gerstein, Mark B, Rozowsky, Joel, Yan, Koon-Kiu, Wang, Daifeng, Cheng, Chao, Brown, James B, Davis, Carrie A, Hillier, LaDeana, Sisu, Cristina, Li, Jingyi Jessica, Pei, Baikang, Harmanci, Arif O, Duff, Michael O, Djebali, Sarah, Alexander, Roger P, Alver, Burak H, Auerbach, Raymond, Bell, Kimberly, Bickel, Peter J, Boeck, Max E, Boley, Nathan P, Booth, Benjamin W, Cherbas, Lucy, Cherbas, Peter, Di, Chao, Dobin, Alex, Drenkow, Jorg, Ewing, Brent, Fang, Gang, Fastuca, Megan, Feingold, Elise A, Frankish, Adam, Gao, Guanjun, Good, Peter J, Guigó, Roderic, Hammonds, Ann, Harrow, Jen, Hoskins, Roger A, Howald, Cédric, Hu, Long, Huang, Haiyan, Hubbard, Tim JP, Huynh, Chau, Jha, Sonali, Kasper, Dionna, Kato, Masaomi, Kaufman, Thomas C, Kitchen, Robert R, Ladewig, Erik, Lagarde, Julien, Lai, Eric, Leng, Jing, Lu, Zhi, MacCoss, Michael, May, Gemma, McWhirter, Rebecca, Merrihew, Gennifer, Miller, David M, Mortazavi, Ali, Murad, Rabi, Oliver, Brian, Olson, Sara, Park, Peter J, Pazin, Michael J, Perrimon, Norbert, Pervouchine, Dmitri, Reinke, Valerie, Reymond, Alexandre, Robinson, Garrett, Samsonova, Anastasia, Saunders, Gary I, Schlesinger, Felix, Sethi, Anurag, Slack, Frank J, Spencer, William C, Stoiber, Marcus H, Strasbourger, Pnina, Tanzer, Andrea, Thompson, Owen A, Wan, Kenneth H, Wang, Guilin, Wang, Huaien, Watkins, Kathie L, Wen, Jiayu, Wen, Kejia, Xue, Chenghai, Yang, Li, Yip, Kevin, Zaleski, Chris, Zhang, Yan, Zheng, Henry, Brenner, Steven E, Graveley, Brenton R, Celniker, Susan E, Gingeras, Thomas R, and Waterston, Robert
- Subjects
Chromatin, Animals, Humans, Drosophila melanogaster, Caenorhabditis elegans, Histones, RNA, Untranslated, Cluster Analysis, Gene Expression Profiling, Sequence Analysis, RNA, Gene Expression Regulation, Developmental, Larva, Pupa, Models, Genetic, Promoter Regions, Genetic, Molecular Sequence Annotation, Transcriptome, Genetics, Human Genome, Generic health relevance, General Science & Technology
- Abstract
The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.
- Published
- 2014
43. Transcriptional landscape of the prenatal human brain.
- Author
-
Miller, Jeremy A, Ding, Song-Lin, Sunkin, Susan M, Smith, Kimberly A, Ng, Lydia, Szafer, Aaron, Ebbert, Amanda, Riley, Zackery L, Royall, Joshua J, Aiona, Kaylynn, Arnold, James M, Bennet, Crissa, Bertagnolli, Darren, Brouner, Krissy, Butler, Stephanie, Caldejon, Shiella, Carey, Anita, Cuhaciyan, Christine, Dalley, Rachel A, Dee, Nick, Dolbeare, Tim A, Facer, Benjamin AC, Feng, David, Fliss, Tim P, Gee, Garrett, Goldy, Jeff, Gourley, Lindsey, Gregor, Benjamin W, Gu, Guangyu, Howard, Robert E, Jochim, Jayson M, Kuan, Chihchau L, Lau, Christopher, Lee, Chang-Kyu, Lee, Felix, Lemon, Tracy A, Lesnar, Phil, McMurray, Bergen, Mastan, Naveed, Mosqueda, Nerick, Naluai-Cecchini, Theresa, Ngo, Nhan-Kiet, Nyhus, Julie, Oldre, Aaron, Olson, Eric, Parente, Jody, Parker, Patrick D, Parry, Sheana E, Stevens, Allison, Pletikos, Mihovil, Reding, Melissa, Roll, Kate, Sandman, David, Sarreal, Melaine, Shapouri, Sheila, Shapovalova, Nadiya V, Shen, Elaine H, Sjoquist, Nathan, Slaughterbeck, Clifford R, Smith, Michael, Sodt, Andy J, Williams, Derric, Zöllei, Lilla, Fischl, Bruce, Gerstein, Mark B, Geschwind, Daniel H, Glass, Ian A, Hawrylycz, Michael J, Hevner, Robert F, Huang, Hao, Jones, Allan R, Knowles, James A, Levitt, Pat, Phillips, John W, Sestan, Nenad, Wohnoutka, Paul, Dang, Chinh, Bernard, Amy, Hohmann, John G, and Lein, Ed S
- Subjects
Brain, Neocortex, Fetus, Animals, Humans, Mice, Anatomy, Artistic, Species Specificity, Gene Expression Regulation, Developmental, Conserved Sequence, Gene Regulatory Networks, Atlases as Topic, Transcriptome, General Science & Technology
- Abstract
The anatomical and functional architecture of the human brain is mainly determined by prenatal transcriptional processes. We describe an anatomically comprehensive atlas of the mid-gestational human brain, including de novo reference atlases, in situ hybridization, ultra-high-resolution magnetic resonance imaging (MRI) and microarray analysis on highly discrete laser-microdissected brain regions. In developing cerebral cortex, transcriptional differences are found between different proliferative and post-mitotic layers, wherein laminar signatures reflect cellular composition and developmental processes. Cytoarchitectural differences between human and mouse have molecular correlates, including species differences in gene expression in subplate, although surprisingly we find minimal differences between the inner and outer subventricular zones even though the outer zone is expanded in humans. Both germinal and post-mitotic cortical layers exhibit fronto-temporal gradients, with particular enrichment in the frontal lobe. Finally, many neurodevelopmental disorder and human-evolution-related genes show patterned expression, potentially underlying unique features of human cortical formation. These data provide a rich, freely-accessible resource for understanding human brain development.
- Published
- 2014
44. Leveraging a large language model to predict protein phase transition: A physical, multiscale, and interpretable approach.
- Author
-
Frank, Mor, Ni, Pengyu, Jensen, Matthew, and Gerstein, Mark B.
- Subjects
LANGUAGE models, PROTEIN structure prediction, PHASE transitions, ALZHEIMER'S disease, PHASE separation
- Abstract
Protein phase transitions (PPTs) from the soluble state to a dense liquid phase (forming droplets via liquid-liquid phase separation) or to solid aggregates (such as amyloids) play key roles in pathological processes associated with age-related diseases such as Alzheimer's disease. Several computational frameworks are capable of separately predicting the formation of droplets or amyloid aggregates based on protein sequences, yet none have tackled the prediction of both within a unified framework. Recently, large language models (LLMs) have exhibited great success in protein structure prediction; however, they have not yet been used for PPTs. Here, we fine-tune an LLM for predicting PPTs and demonstrate its usage in evaluating how sequence variants affect PPTs, an operation useful for protein design. In addition, we show its superior performance compared to suitable classical benchmarks. Due to the "black-box" nature of the LLM, we also employ a classical random forest model along with biophysical features to facilitate interpretation. Finally, focusing on Alzheimer's disease-related proteins, we demonstrate that greater aggregation is associated with reduced gene expression in Alzheimer's disease, suggesting a natural defense mechanism.
- Published
- 2024
45. MolLM: a unified language model for integrating biomedical text with 2D and 3D molecular representations.
- Author
-
Tang, Xiangru, Tran, Andrew, Tan, Jeffrey, and Gerstein, Mark B
- Subjects
UNIFIED modeling language, TRANSFORMER models, MOLECULAR structure, TASK performance, MOLECULES, DEEP learning
- Abstract
Motivation: The current paradigm of deep learning models for the joint representation of molecules and text primarily relies on 1D or 2D molecular formats, neglecting significant 3D structural information that offers valuable physical insight. This narrow focus inhibits the models' versatility and adaptability across a wide range of modalities. Conversely, the limited research focusing on explicit 3D representation tends to overlook textual data within the biomedical domain. Results: We present a unified pre-trained language model, MolLM, that concurrently captures 2D and 3D molecular information alongside biomedical text. MolLM consists of a text Transformer encoder and a molecular Transformer encoder, designed to encode both 2D and 3D molecular structures. To support MolLM's self-supervised pre-training, we constructed 160K molecule-text pairings. Employing contrastive learning as a supervisory signal for learning, MolLM demonstrates robust molecular representation capabilities across four downstream tasks, including cross-modal molecule and text matching, property prediction, captioning, and text-prompted molecular editing. Through ablation, we demonstrate that the inclusion of explicit 3D representations improves performance in these downstream tasks. Availability and implementation: Our code, data, pre-trained model weights, and examples of using our model are all available at https://github.com/gersteinlab/MolLM. In particular, we provide Jupyter Notebooks offering step-by-step guidance on how to use MolLM to extract embeddings for both molecules and text.
- Published
- 2024
46. BioCoder: a benchmark for bioinformatics code generation with large language models.
- Author
-
Tang, Xiangru, Qian, Bill, Gao, Rick, Chen, Jiakang, Chen, Xinyun, and Gerstein, Mark B
- Subjects
LANGUAGE models, GENERATIVE pre-trained transformers, MODELS & modelmaking, BIOINFORMATICS, STEVEDORES
- Abstract
Summary: Pretrained large language models (LLMs) have significantly improved code generation. As these models scale up, there is an increasing need for the output to handle more intricate tasks and to be appropriately specialized to particular domains. Here, we target bioinformatics due to the amount of domain knowledge, algorithms, and data operations this discipline requires. We present BioCoder, a benchmark developed to evaluate LLMs in generating bioinformatics-specific code. BioCoder spans much of the field, covering cross-file dependencies, class declarations, and global variables. It incorporates 1026 Python functions and 1243 Java methods extracted from GitHub, along with 253 examples from the Rosalind Project, all pertaining to bioinformatics. Using topic modeling, we show that the overall coverage of the included code is representative of the full spectrum of bioinformatics calculations. BioCoder incorporates a fuzz-testing framework for evaluation. We have applied it to evaluate various models including InCoder, CodeGen, CodeGen2, SantaCoder, StarCoder, StarCoder+, InstructCodeT5+, GPT-3.5, and GPT-4. Furthermore, we fine-tuned one model (StarCoder), demonstrating that our training dataset can enhance the performance on our testing benchmark (by >15% in terms of Pass@K under certain prompt configurations and always >3%). The results highlight two key aspects of successful models: (i) Successful models accommodate a long prompt (>2600 tokens) with full context, including functional dependencies. (ii) They contain domain-specific knowledge of bioinformatics, beyond just general coding capability. This is evident from the performance gain of GPT-3.5/4 compared to the smaller models on our benchmark (50% versus up to 25%). Availability and implementation: All datasets, benchmark, Docker images, and scripts required for testing are available at: https://github.com/gersteinlab/biocoder and https://biocoder-benchmark.github.io/.
- Published
- 2024
47. Assessing and mitigating privacy risks of sparse, noisy genotypes by local alignment to haplotype databases
- Author
-
Emani, Prashant S., Geradi, Maya N., Gürsoy, Gamze, Grasty, Monica R., Miranker, Andrew, and Gerstein, Mark B.
- Published
- 2023
48. Comprehensive functional genomic resource and integrative model for the human brain
- Author
-
PsychENCODE Consortium, Wang, Daifeng, Liu, Shuang, Warrell, Jonathan, Won, Hyejung, Shi, Xu, Navarro, Fabio C. P., Clarke, Declan, Gu, Mengting, Emani, Prashant, Yang, Yucheng T., Xu, Min, Gandal, Michael J., Lou, Shaoke, Zhang, Jing, Park, Jonathan J., Yan, Chengfei, Rhie, Suhn Kyong, Manakongtreecheep, Kasidet, Zhou, Holly, Nathan, Aparna, Peters, Mette, Mattei, Eugenio, Fitzgerald, Dominic, Brunetti, Tonya, Moore, Jill, Jiang, Yan, Girdhar, Kiran, Hoffman, Gabriel E., Kalayci, Selim, Gümüş, Zeynep H., Crawford, Gregory E., Roussos, Panos, Akbarian, Schahram, Jaffe, Andrew E., White, Kevin P., Weng, Zhiping, Sestan, Nenad, Geschwind, Daniel H., Knowles, James A., and Gerstein, Mark B.
- Published
- 2018
49. Integrative functional genomic analysis of human brain development and neuropsychiatric risks
- Author
-
BrainSpan Consortium, PsychENCODE Consortium, PsychENCODE Developmental Subgroup, Li, Mingfeng, Santpere, Gabriel, Kawasawa, Yuka Imamura, Evgrafov, Oleg V., Gulden, Forrest O., Pochareddy, Sirisha, Sunkin, Susan M., Li, Zhen, Shin, Yurae, Zhu, Ying, Sousa, André M. M., Werling, Donna M., Kitchen, Robert R., Kang, Hyo Jung, Pletikos, Mihovil, Choi, Jinmyung, Muchnik, Sydney, Xu, Xuming, Wang, Daifeng, Lorente-Galdos, Belen, Liu, Shuang, Giusti-Rodríguez, Paola, Won, Hyejung, de Leeuw, Christiaan A., Pardiñas, Antonio F., Hu, Ming, Jin, Fulai, Li, Yun, Owen, Michael J., O’Donovan, Michael C., Walters, James T. R., Posthuma, Danielle, Levitt, Pat, Weinberger, Daniel R., Hyde, Thomas M., Kleinman, Joel E., Geschwind, Daniel H., Hawrylycz, Michael J., State, Matthew W., Sanders, Stephan J., Sullivan, Patrick F., Gerstein, Mark B., Lein, Ed S., Knowles, James A., and Sestan, Nenad
- Published
- 2018
50. Copy Number Variants and Segmental Duplications Show Different Formation Signatures
- Author
-
Kim, Philip M., Korbel, Jan O., Chen, Xueying, and Gerstein, Mark B.
- Subjects
Quantitative Biology - Genomics, Quantitative Biology - Quantitative Methods
- Abstract
In addition to variation in terms of single nucleotide polymorphisms (SNPs), whole regions ranging from several kilobases up to a megabase in length differ in copy number among individuals. These differences are referred to as Copy Number Variants (CNVs) and extensive mapping of these is underway. Recent studies have highlighted their great prevalence in the human genome. Segmental Duplications (SDs) are long (>1kb) stretches of duplicated DNA with high sequence identity. First, we analyze the co-localization of SDs and find that SDs are significantly co-localized with each other, resulting in a power-law distribution, which suggests a preferential attachment mechanism, i.e. existing SDs are likely to be involved in creating new ones nearby. Second, we look at the relationship of CNVs/SDs with various types of repeats. We find that the previously recognized association of SDs with Alu elements is significantly stronger for older SDs and is sharply decreasing for younger ones. While it might be expected that the patterns should be similar for SDs and CNVs, we find, surprisingly, no association of CNVs with Alu elements. This trend is consistent with the decreasing correlation between Alu elements and younger SDs: the activity of Alu elements has been decreasing, and by now they seem no longer active. Furthermore, we find a striking association of SDs with processed pseudogenes, suggesting that they may also have mediated SD formation. Moreover, we find a strong association with microsatellites for both SDs and CNVs, which suggests a role for satellites in the formation of both. Comment: 13 pages
- Published
- 2007