33 results on '"Gerstein, Mark B"'
Search Results
2. The cell-type underpinnings of the human functional cortical connectome
- Author
-
Anderson, Kevin M., Chopra, Sidhant, Dhamala, Elvisha, Emani, Prashant S., Gerstein, Mark B., Margulies, Daniel S., and Holmes, Avram J.
- Abstract
The functional properties of the human brain arise, in part, from the vast assortment of cell types that pattern the cerebral cortex. The cortical sheet can be broadly divided into distinct networks, which are embedded into processing streams, or gradients, that extend from unimodal systems through higher-order association territories. Here using microarray data from the Allen Human Brain Atlas and single-nucleus RNA-sequencing data from multiple cortical territories, we demonstrate that cell-type distributions are spatially coupled to the functional organization of cortex, as estimated through functional magnetic resonance imaging. Differentially enriched cells follow the spatial topography of both functional gradients and associated large-scale networks. Distinct cellular fingerprints were evident across networks, and a classifier trained on postmortem cell-type distributions was able to predict the functional network allegiance of cortical tissue samples. These data indicate that the in vivo organization of the cortical sheet is reflected in the spatial variability of its cellular composition.
- Published
- 2025
- Full Text
- View/download PDF
3. An integrative TAD catalog in lymphoblastoid cell lines discloses the functional impact of deletions and insertions in human genomes
- Author
-
Li, Chong, Bonder, Marc Jan, Syed, Sabriya, Jensen, Matthew, Gerstein, Mark B., Zody, Michael C., Chaisson, Mark J.P., Talkowski, Michael E., Marschall, Tobias, Korbel, Jan O., Eichler, Evan E., Lee, Charles, and Shi, Xinghua
- Abstract
The human genome is packaged within a three-dimensional (3D) nucleus and organized into structural units known as compartments, topologically associating domains (TADs), and loops. TAD boundaries, separating adjacent TADs, have been found to be well conserved across mammalian species and more evolutionarily constrained than TADs themselves. Recent studies show that structural variants (SVs) can modify 3D genomes through the disruption of TADs, which play an essential role in insulating genes from outside regulatory elements’ aberrant regulation. However, how SV affects the 3D genome structure and their association among different aspects of gene regulation and candidate cis-regulatory elements (cCREs) have rarely been studied systematically. Here, we assess the impact of SVs intersecting with TAD boundaries by developing an integrative Hi-C analysis pipeline, which enables the generation of an in-depth catalog of TADs and TAD boundaries in human lymphoblastoid cell lines (LCLs) to fill the gap of limited resources. Our catalog contains 18,865 TADs, including 4596 sub-TADs, with 185 SVs (TAD–SVs) that alter chromatin architecture. By leveraging the ENCODE registry of cCREs in humans, we determine that 34 of 185 TAD–SVs intersect with cCREs and observe significant enrichment of TAD–SVs within cCREs. This study provides a database of TADs and TAD–SVs in the human genome that will facilitate future investigations of the impact of SVs on chromatin structure and gene regulation in health and disease.
- Published
- 2024
- Full Text
- View/download PDF
4. Assessing and mitigating privacy risks of sparse, noisy genotypes by local alignment to haplotype databases
- Author
-
Emani, Prashant S., Geradi, Maya N., Gürsoy, Gamze, Grasty, Monica R., Miranker, Andrew, and Gerstein, Mark B.
- Abstract
Single nucleotide polymorphisms (SNPs) from omics data create a reidentification risk for individuals and their relatives. Although the ability of thousands of SNPs (especially rare ones) to identify individuals has been repeatedly shown, the availability of small sets of noisy genotypes, from environmental DNA samples or functional genomics data, motivated us to quantify their informativeness. We present a computational tool suite, termed Privacy Leakage by Inference across Genotypic HMM Trajectories (PLIGHT), using population-genetics-based hidden Markov models (HMMs) of recombination and mutation to find piecewise alignment of small, noisy SNP sets to reference haplotype databases. We explore cases in which query individuals are either known to be in the database, or not, and consider several genotype queries, including those from environmental sample swabs from known individuals and from simulated “mosaics” (two-individual composites). Using PLIGHT on a database with ∼5000 haplotypes, we find for common, noise-free SNPs that only ten are sufficient to identify individuals, ∼20 can identify both components in two-individual mosaics, and 20–30 can identify first-order relatives. Using noisy environmental-sample-derived SNPs, PLIGHT identifies individuals in a database using ∼30 SNPs. Even when the individuals are not in the database, local genotype matches allow for some phenotypic information leakage based on coarse-grained SNP imputation. Finally, by quantifying privacy leakage from sparse SNP sets, PLIGHT helps determine the value of selectively sanitizing released SNPs without explicit assumptions about population membership or allele frequency. To make this practical, we provide a sanitization tool to remove the most identifying SNPs from genomic data.
- Published
- 2023
- Full Text
- View/download PDF
5. Functional genomics data: privacy risk assessment and technological mitigation
- Author
-
Gürsoy, Gamze, Li, Tianxiao, Liu, Susanna, Ni, Eric, Brannon, Charlotte M., and Gerstein, Mark B.
- Abstract
The generation of functional genomics data by next-generation sequencing has increased greatly in the past decade. Broad sharing of these data is essential for research advancement but poses notable privacy challenges, some of which are analogous to those that occur when sharing genetic variant data. However, there are also unique privacy challenges that arise from cryptic information leakage during the processing and summarization of functional genomics data from raw reads to derived quantities, such as gene expression values. Here, we review these challenges and present potential solutions for mitigating privacy risks while allowing broad data dissemination and analysis.
- Published
- 2022
- Full Text
- View/download PDF
6. Mako: A Graph-Based Pattern Growth Approach to Detect Complex Structural Variants
- Author
-
Lin, Jiadong, Yang, Xiaofei, Kosters, Walter, Xu, Tun, Jia, Yanyan, Wang, Songbo, Zhu, Qihui, Ryan, Mallory, Guo, Li, Gerstein, Mark B., Sanders, Ashley D., Zody, Michael C., Talkowski, Michael E., Mills, Ryan E., Korbel, Jan O., Marschall, Tobias, Ebert, Peter, Audano, Peter A., Rodriguez-Martin, Bernardo, Porubsky, David, Jan Bonder, Marc, Sulovari, Arvis, Ebler, Jana, Zhou, Weichen, Serra Mari, Rebecca, Yilmaz, Feyza, Zhao, Xuefang, Hsieh, PingHsun, Lee, Joyce, Kumar, Sushant, Rausch, Tobias, Chen, Yu, Chong, Zechen, Munson, Katherine M., Chaisson, Mark J.P., Chen, Junjie, Shi, Xinghua, Wenger, Aaron M., Harvey, William T., Hansenfeld, Patrick, Regier, Allison, Hall, Ira M., Flicek, Paul, Hastie, Alex R., Fairely, Susan, Zhang, Chengsheng, Lee, Charles, Devine, Scott E., Eichler, Evan E., and Ye, Kai
- Abstract
Complex structural variants (CSVs) are genomic alterations that have more than two breakpoints and are considered as the simultaneous occurrence of simple structural variants. However, detecting the compounded mutational signals of CSVs is challenging through a commonly used model-match strategy. As a result, there has been limited progress for CSV discovery compared with simple structural variants. Here, we systematically analyzed the multi-breakpoint connection feature of CSVs, and proposed Mako, utilizing a bottom-up guided model-free strategy, to detect CSVs from paired-end short-read sequencing. Specifically, we implemented a graph-based pattern growth approach, where the graph depicts potential breakpoint connections, and pattern growth enables CSV detection without pre-defined models. Comprehensive evaluations on both simulated and real datasets revealed that Mako outperformed other algorithms. Notably, validation rates of CSVs on real data based on experimental and computational validations as well as manual inspections are around 70%, where the medians of experimental and computational breakpoint shift are 13 bp and 26 bp, respectively. Moreover, the Mako CSV subgraph effectively characterized the breakpoint connections of a CSV event and uncovered a total of 15 CSV types, including two novel types of adjacent segment swap and tandem dispersed duplication. Further analysis of these CSVs also revealed the impact of sequence homology on the formation of CSVs. Mako is publicly available at https://github.com/xjtu-omics/Mako.
- Published
- 2022
- Full Text
- View/download PDF
7. Quantum computing at the frontiers of biological sciences
- Author
-
Emani, Prashant S., Warrell, Jonathan, Anticevic, Alan, Bekiranov, Stefan, Gandal, Michael, McConnell, Michael J., Sapiro, Guillermo, Aspuru-Guzik, Alán, Baker, Justin T., Bastiani, Matteo, Murray, John D., Sotiropoulos, Stamatios N., Taylor, Jacob, Senthil, Geetha, Lehner, Thomas, Gerstein, Mark B., and Harrow, Aram W.
- Abstract
Computing plays a critical role in the biological sciences but faces increasing challenges of scale and complexity. Quantum computing, a computational paradigm exploiting the unique properties of quantum mechanical analogs of classical bits, seeks to address many of these challenges. We discuss the potential for quantum computing to aid in the merging of insights across different areas of biological sciences.
- Published
- 2021
- Full Text
- View/download PDF
8. Expanded encyclopaedias of DNA elements in the human and mouse genomes
- Author
-
Moore, Jill E., Purcaro, Michael J., Pratt, Henry E., Epstein, Charles B., Shoresh, Noam, Adrian, Jessika, Kawli, Trupti, Davis, Carrie A., Dobin, Alexander, Kaul, Rajinder, Halow, Jessica, Van Nostrand, Eric L., Freese, Peter, Gorkin, David U., Shen, Yin, He, Yupeng, Mackiewicz, Mark, Pauli-Behn, Florencia, Williams, Brian A., Mortazavi, Ali, Keller, Cheryl A., Zhang, Xiao-Ou, Elhajjajy, Shaimae I., Huey, Jack, Dickel, Diane E., Snetkova, Valentina, Wei, Xintao, Wang, Xiaofeng, Rivera-Mulia, Juan Carlos, Rozowsky, Joel, Zhang, Jing, Chhetri, Surya B., Zhang, Jialing, Victorsen, Alec, White, Kevin P., Visel, Axel, Yeo, Gene W., Burge, Christopher B., Lécuyer, Eric, Gilbert, David M., Dekker, Job, Rinn, John, Mendenhall, Eric M., Ecker, Joseph R., Kellis, Manolis, Klein, Robert J., Noble, William S., Kundaje, Anshul, Guigó, Roderic, Farnham, Peggy J., Cherry, J. Michael, Myers, Richard M., Ren, Bing, Graveley, Brenton R., Gerstein, Mark B., Pennacchio, Len A., Snyder, Michael P., Bernstein, Bradley E., Wold, Barbara, Hardison, Ross C., Gingeras, Thomas R., Stamatoyannopoulos, John A., and Weng, Zhiping
- Abstract
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE and Roadmap Epigenomics data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9 and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.
- Published
- 2020
- Full Text
- View/download PDF
9. Perspectives on ENCODE
- Author
-
Snyder, Michael P., Gingeras, Thomas R., Moore, Jill E., Weng, Zhiping, Gerstein, Mark B., Ren, Bing, Hardison, Ross C., Stamatoyannopoulos, John A., Graveley, Brenton R., Feingold, Elise A., Pazin, Michael J., Pagan, Michael, Gilchrist, Daniel A., Hitz, Benjamin C., Cherry, J. Michael, Bernstein, Bradley E., Mendenhall, Eric M., Zerbino, Daniel R., Frankish, Adam, Flicek, Paul, and Myers, Richard M.
- Abstract
The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may serve functional roles in regulating gene expression. The project has been extended to model organisms, particularly the mouse. In the third phase of ENCODE, nearly a million and more than 300,000 cCRE annotations have been generated for human and mouse, respectively, and these have provided a valuable resource for the scientific community.
- Published
- 2020
- Full Text
- View/download PDF
10. Latent evolutionary signatures: a general framework for analysing music and cultural evolution
- Author
-
Warrell, Jonathan, Salichos, Leonidas, Gancz, Michael, and Gerstein, Mark B.
- Abstract
Cultural processes of change bear many resemblances to biological evolution. The underlying units of non-biological evolution have, however, remained elusive, especially in the domain of music. Here, we introduce a general framework to jointly identify underlying units and their associated evolutionary processes. We model musical styles and principles of organization in dimensions such as harmony and form as following an evolutionary process. Furthermore, we propose that such processes can be identified by extracting latent evolutionary signatures from musical corpora, analogously to identifying mutational signatures in genomics. These signatures provide a latent embedding for each song or musical piece. We develop a deep generative architecture for our model, which can be viewed as a type of variational autoencoder with an evolutionary prior constraining the latent space; specifically, the embeddings for each song are tied together via an energy-based prior, which encourages songs close in evolutionary space to share similar representations. As illustration, we analyse songs from the McGill Billboard dataset. We find frequent chord transitions and formal repetition schemes and identify latent evolutionary signatures related to these features. Finally, we show that the latent evolutionary representations learned by our model outperform non-evolutionary representations in such tasks as period and genre prediction.
- Published
- 2024
- Full Text
- View/download PDF
11. Isoform-Level Interpretation of High-Throughput Proteomics Data Enabled by Deep Integration with RNA-seq.
- Author
-
Carlyle, Becky C., Kitchen, Robert R., Zhang, Jing, Wilson, Rashaun S., Lam, Tukiet T., Rozowsky, Joel S., Williams, Kenneth R., Sestan, Nenad, Gerstein, Mark B., and Nairn, Angus C.
- Published
- 2018
- Full Text
- View/download PDF
12. Isoform-Level Interpretation of High-Throughput Proteomics Data Enabled by Deep Integration with RNA-seq
- Author
-
Carlyle, Becky C., Kitchen, Robert R., Zhang, Jing, Wilson, Rashaun S., Lam, Tukiet T., Rozowsky, Joel S., Williams, Kenneth R., Sestan, Nenad, Gerstein, Mark B., and Nairn, Angus C.
- Abstract
Cellular control of gene expression is a complex process that is subject to multiple levels of regulation, but ultimately it is the protein produced that determines the biosynthetic state of the cell. One way that a cell can regulate the protein output from each gene is by expressing alternate isoforms with distinct amino acid sequences. These isoforms may exhibit differences in localization and binding interactions that can have profound functional implications. High-throughput liquid chromatography tandem mass spectrometry proteomics (LC–MS/MS) relies on enzymatic digestion and has lower coverage and sensitivity than transcriptomic profiling methods such as RNA-seq. Digestion results in predictable fragmentation of a protein, which can limit the generation of peptides capable of distinguishing between isoforms. Here we exploit transcript-level expression from RNA-seq to set prior likelihoods and enable protein isoform abundances to be directly estimated from LC–MS/MS, an approach derived from the principle that most genes appear to be expressed as a single dominant isoform in a given cell type or tissue. Through this deep integration of RNA-seq and LC–MS/MS data from the same sample, we show that a principal isoform can be identified in >80% of gene products in homogeneous HEK293 cell culture and >70% of proteins detected in complex human brain tissue. We demonstrate that the incorporation of translatome data from ribosome profiling further refines this process. Defining isoforms in experiments with matched RNA-seq/translatome and proteomic data increases the functional relevance of such data sets and will further broaden our understanding of multilevel control of gene expression.
- Published
- 2018
- Full Text
- View/download PDF
13. Dynamic RNA–protein interactions underlie the zebrafish maternal-to-zygotic transition
- Author
-
Despic, Vladimir, Dejung, Mario, Gu, Mengting, Krishnan, Jayanth, Zhang, Jing, Herzel, Lydia, Straube, Korinna, Gerstein, Mark B., Butter, Falk, and Neugebauer, Karla M.
- Abstract
During the maternal-to-zygotic transition (MZT), transcriptionally silent embryos rely on post-transcriptional regulation of maternal mRNAs until zygotic genome activation (ZGA). RNA-binding proteins (RBPs) are important regulators of post-transcriptional RNA processing events, yet their identities and functions during developmental transitions in vertebrates remain largely unexplored. Using mRNA interactome capture, we identified 227 RBPs in zebrafish embryos before and during ZGA, hereby named the zebrafish MZT mRNA-bound proteome. This protein constellation consists of many conserved RBPs, some of which are potential stage-specific mRNA interactors that likely reflect the dynamics of RNA–protein interactions during MZT. The enrichment of numerous splicing factors like hnRNP proteins before ZGA was surprising, because maternal mRNAs were found to be fully spliced. To address potentially unique roles of these RBPs in embryogenesis, we focused on Hnrnpa1. iCLIP and subsequent mRNA reporter assays revealed a function for Hnrnpa1 in the regulation of poly(A) tail length and translation of maternal mRNAs through sequence-specific association with 3′ UTRs before ZGA. Comparison of iCLIP data from two developmental stages revealed that Hnrnpa1 dissociates from maternal mRNAs at ZGA and instead regulates the nuclear processing of pri-mir-430 transcripts, which we validated experimentally. The shift from cytoplasmic to nuclear RNA targets was accompanied by a dramatic translocation of Hnrnpa1 and other pre-mRNA splicing factors to the nucleus in a transcription-dependent manner. Thus, our study identifies global changes in RNA–protein interactions during vertebrate MZT and shows that Hnrnpa1 RNA-binding activities are spatially and temporally coordinated to regulate RNA metabolism during early development.
- Published
- 2017
- Full Text
- View/download PDF
14. The PsychENCODE project
- Author
-
Akbarian, Schahram, Liu, Chunyu, Knowles, James A, Vaccarino, Flora M, Farnham, Peggy J, Crawford, Gregory E, Jaffe, Andrew E, Pinto, Dalila, Dracheva, Stella, Geschwind, Daniel H, Mill, Jonathan, Nairn, Angus C, Abyzov, Alexej, Pochareddy, Sirisha, Prabhakar, Shyam, Weissman, Sherman, Sullivan, Patrick F, State, Matthew W, Weng, Zhiping, Peters, Mette A, White, Kevin P, Gerstein, Mark B, Amiri, Anahita, Armoskus, Chris, Ashley-Koch, Allison E, Bae, Taejeong, Beckel-Mitchener, Andrea, Berman, Benjamin P, Coetzee, Gerhard A, Coppola, Gianfilippo, Francoeur, Nancy, Fromer, Menachem, Gao, Robert, Grennan, Kay, Herstein, Jennifer, Kavanagh, David H, Ivanov, Nikolay A, Jiang, Yan, Kitchen, Robert R, Kozlenkov, Alexey, Kundakovic, Marija, Li, Mingfeng, Li, Zhen, Liu, Shuang, Mangravite, Lara M, Mattei, Eugenio, Markenscoff-Papadimitriou, Eirene, Navarro, Fábio C P, North, Nicole, Omberg, Larsson, Panchision, David, Parikshak, Neelroop, Poschmann, Jeremie, Price, Amanda J, Purcaro, Michael, Reddy, Timothy E, Roussos, Panos, Schreiner, Shannon, Scuderi, Soraya, Sebra, Robert, Shibata, Mikihito, Shieh, Annie W, Skarica, Mario, Sun, Wenjie, Swarup, Vivek, Thomas, Amber, Tsuji, Junko, van Bakel, Harm, Wang, Daifeng, Wang, Yongjun, Wang, Kai, Werling, Donna M, Willsey, A Jeremy, Witt, Heather, Won, Hyejung, Wong, Chloe C Y, Wray, Gregory A, Wu, Emily Y, Xu, Xuming, Yao, Lijing, Senthil, Geetha, Lehner, Thomas, Sklar, Pamela, and Sestan, Nenad
- Published
- 2015
- Full Text
- View/download PDF
15. Dynamic quality control machinery that operates across compartmental borders mediates the degradation of mammalian nuclear membrane proteins
- Author
-
Tsai, Pei-Ling, Cameron, Christopher J.F., Forni, Maria Fernanda, Wasko, Renee R., Naughton, Brigitte S., Horsley, Valerie, Gerstein, Mark B., and Schlieker, Christian
- Abstract
Many human diseases are caused by mutations in nuclear envelope (NE) proteins. How protein homeostasis and disease etiology are interconnected at the NE is poorly understood. Specifically, the identity of local ubiquitin ligases that facilitate ubiquitin-proteasome-dependent NE protein turnover is presently unknown. Here, we employ a short-lived, Lamin B receptor disease variant as a model substrate in a genetic screen to uncover key elements of NE protein turnover. We identify the ubiquitin-conjugating enzymes (E2s) Ube2G2 and Ube2D3, the membrane-resident ubiquitin ligases (E3s) RNF5 and HRD1, and the poorly understood protein TMEM33. RNF5, but not HRD1, requires TMEM33 both for efficient biosynthesis and function. Once synthesized, RNF5 responds dynamically to increased substrate levels at the NE by departing from the endoplasmic reticulum, where HRD1 remains confined. Thus, mammalian protein quality control machinery partitions between distinct cellular compartments to address locally changing substrate loads, establishing a robust cellular quality control system.
- Published
- 2022
- Full Text
- View/download PDF
16. Author Correction: Expanded encyclopaedias of DNA elements in the human and mouse genomes
- Author
-
Moore, Jill E., Purcaro, Michael J., Pratt, Henry E., Epstein, Charles B., Shoresh, Noam, Adrian, Jessika, Kawli, Trupti, Davis, Carrie A., Dobin, Alexander, Kaul, Rajinder, Halow, Jessica, Van Nostrand, Eric L., Freese, Peter, Gorkin, David U., Shen, Yin, He, Yupeng, Mackiewicz, Mark, Pauli-Behn, Florencia, Williams, Brian A., Mortazavi, Ali, Keller, Cheryl A., Zhang, Xiao-Ou, Elhajjajy, Shaimae I., Huey, Jack, Dickel, Diane E., Snetkova, Valentina, Wei, Xintao, Wang, Xiaofeng, Rivera-Mulia, Juan Carlos, Rozowsky, Joel, Zhang, Jing, Chhetri, Surya B., Zhang, Jialing, Victorsen, Alec, White, Kevin P., Visel, Axel, Yeo, Gene W., Burge, Christopher B., Lécuyer, Eric, Gilbert, David M., Dekker, Job, Rinn, John, Mendenhall, Eric M., Ecker, Joseph R., Kellis, Manolis, Klein, Robert J., Noble, William S., Kundaje, Anshul, Guigó, Roderic, Farnham, Peggy J., Cherry, J. Michael, Myers, Richard M., Ren, Bing, Graveley, Brenton R., Gerstein, Mark B., Pennacchio, Len A., Snyder, Michael P., Bernstein, Bradley E., Wold, Barbara, Hardison, Ross C., Gingeras, Thomas R., Stamatoyannopoulos, John A., and Weng, Zhiping
- Published
- 2022
- Full Text
- View/download PDF
17. Author Correction: Perspectives on ENCODE
- Author
-
Snyder, Michael P., Gingeras, Thomas R., Moore, Jill E., Weng, Zhiping, Gerstein, Mark B., Ren, Bing, Hardison, Ross C., Stamatoyannopoulos, John A., Graveley, Brenton R., Feingold, Elise A., Pazin, Michael J., Pagan, Michael, Gilchrist, Daniel A., Hitz, Benjamin C., Cherry, J. Michael, Bernstein, Bradley E., Mendenhall, Eric M., Zerbino, Daniel R., Frankish, Adam, Flicek, Paul, and Myers, Richard M.
- Published
- 2022
- Full Text
- View/download PDF
18. Author Correction: Functional genomics data: privacy risk assessment and technological mitigation
- Author
-
Gürsoy, Gamze, Li, Tianxiao, Liu, Susanna, Ni, Eric, Brannon, Charlotte M., and Gerstein, Mark B.
- Published
- 2022
- Full Text
- View/download PDF
19. A myelopoiesis-associated regulatory intergenic noncoding RNA transcript within the human HOXA cluster
- Author
-
Zhang, Xueqing, Lian, Zheng, Padden, Carolyn, Gerstein, Mark B., Rozowsky, Joel, Snyder, Michael, Gingeras, Thomas R., Kapranov, Philipp, Weissman, Sherman M., and Newburger, Peter E.
- Abstract
We have identified an intergenic transcriptional activity that is located between the human HOXA1 and HOXA2 genes, shows myeloid-specific expression, and is up-regulated during granulocytic differentiation. The novel gene, termed HOTAIRM1 (HOX antisense intergenic RNA myeloid 1), is transcribed antisense to the HOXA genes and originates from the same CpG island that embeds the start site of HOXA1. The transcript appears to be a noncoding RNA containing no long open-reading frame; sucrose gradient analysis shows no association with polyribosomal fractions. HOTAIRM1 is the most prominent intergenic transcript expressed and up-regulated during induced granulocytic differentiation of NB4 promyelocytic leukemia and normal human hematopoietic cells; its expression is specific to the myeloid lineage. Its induction during retinoic acid (RA)–driven granulocytic differentiation is through RA receptor and may depend on the expression of myeloid cell development factors targeted by RA signaling. Knockdown of HOTAIRM1 quantitatively blunted RA-induced expression of HOXA1 and HOXA4 during the myeloid differentiation of NB4 cells, and selectively attenuated induction of transcripts for the myeloid differentiation genes CD11b and CD18, but did not noticeably impact the more distal HOXA genes. These findings suggest that HOTAIRM1 plays a role in the myelopoiesis through modulation of gene expression in the HOXA cluster.
- Published
- 2009
- Full Text
- View/download PDF
20. A myelopoiesis-associated regulatory intergenic noncoding RNA transcript within the human HOXA cluster
- Author
-
Zhang, Xueqing, Lian, Zheng, Padden, Carolyn, Gerstein, Mark B., Rozowsky, Joel, Snyder, Michael, Gingeras, Thomas R., Kapranov, Philipp, Weissman, Sherman M., and Newburger, Peter E.
- Abstract
We have identified an intergenic transcriptional activity that is located between the human HOXA1 and HOXA2 genes, shows myeloid-specific expression, and is up-regulated during granulocytic differentiation. The novel gene, termed HOTAIRM1 (HOX antisense intergenic RNA myeloid 1), is transcribed antisense to the HOXA genes and originates from the same CpG island that embeds the start site of HOXA1. The transcript appears to be a noncoding RNA containing no long open-reading frame; sucrose gradient analysis shows no association with polyribosomal fractions. HOTAIRM1 is the most prominent intergenic transcript expressed and up-regulated during induced granulocytic differentiation of NB4 promyelocytic leukemia and normal human hematopoietic cells; its expression is specific to the myeloid lineage. Its induction during retinoic acid (RA)–driven granulocytic differentiation is through RA receptor and may depend on the expression of myeloid cell development factors targeted by RA signaling. Knockdown of HOTAIRM1 quantitatively blunted RA-induced expression of HOXA1 and HOXA4 during the myeloid differentiation of NB4 cells, and selectively attenuated induction of transcripts for the myeloid differentiation genes CD11b and CD18, but did not noticeably impact the more distal HOXA genes. These findings suggest that HOTAIRM1 plays a role in the myelopoiesis through modulation of gene expression in the HOXA cluster.
- Published
- 2009
- Full Text
- View/download PDF
21. Targeting the Human Cancer Pathway Protein Interaction Network by Structural Genomics
- Author
-
Huang, Yuanpeng Janet, Hang, Dehua, Lu, Long Jason, Tong, Liang, Gerstein, Mark B., and Montelione, Gaetano T.
- Abstract
Structural genomics provides an important approach for characterizing and understanding systems biology. As a step toward better integrating protein three-dimensional (3D) structural information in cancer systems biology, we have constructed a Human Cancer Pathway Protein Interaction Network (HCPIN) by analysis of several classical cancer-associated signaling pathways and their physical protein-protein interactions. Many well known cancer-associated proteins play central roles as “hubs” or “bottlenecks” in the HCPIN. At least half of HCPIN proteins are either directly associated with or interact with multiple signaling pathways. Although some 45% of residues in these proteins are in sequence segments that meet criteria sufficient for approximate homology modeling (Basic Local Alignment Search Tool (BLAST) E-value <10⁻⁶), only ∼20% of residues in these proteins are structurally covered using high accuracy homology modeling criteria (i.e., BLAST E-value <10⁻⁶ and at least 80% sequence identity) or by actual experimental structures. The HCPIN Website provides a comprehensive description of this biomedically important multipathway network together with experimental and homology models of HCPIN proteins useful for cancer biology research. To complement and enrich cancer systems biology, the Northeast Structural Genomics Consortium is targeting >1000 human proteins and protein domains from the HCPIN for sample production and 3D structure determination. The long range goal of this effort is to provide a comprehensive 3D structure-function database for human cancer-associated proteins and protein complexes in the context of their interaction networks. The network-based target selection (BioNet) approach described here is an example of a general strategy for targeting co-functioning proteins by structural genomics projects.
- Published
- 2008
- Full Text
- View/download PDF
22. Network propagation-based prioritization of long tail genes in 17 cancer types
- Author
-
Mohsen, Hussein, Gunasekharan, Vignesh, Qing, Tao, Seay, Montrell, Surovtseva, Yulia, Negahban, Sahand, Szallasi, Zoltan, Pusztai, Lajos, and Gerstein, Mark B.
- Abstract
Background: The diversity of genomic alterations in cancer poses challenges to fully understanding the etiologies of the disease. Recent interest in infrequent mutations, in genes that reside in the “long tail” of the mutational distribution, uncovered new genes with significant implications in cancer development. The study of cancer-relevant genes often requires integrative approaches pooling together multiple types of biological data. Network propagation methods demonstrate high efficacy in achieving this integration. Yet, the majority of these methods focus their assessment on detecting known cancer genes or identifying altered subnetworks. In this paper, we introduce a network propagation approach that entirely focuses on prioritizing long tail genes with potential functional impact on cancer development. Results: We identify sets of often overlooked, rarely to moderately mutated genes whose biological interactions significantly propel their mutation-frequency-based rank upwards during propagation in 17 cancer types. We call these sets “upward mobility genes” and hypothesize that their significant rank improvement indicates functional importance. We report new cancer-pathway associations based on upward mobility genes that are not previously identified using driver genes alone, validate their role in cancer cell survival in vitro using extensive genome-wide RNAi and CRISPR data repositories, and further conduct in vitro functional screenings resulting in the validation of 18 previously unreported genes. Conclusion: Our analysis extends the spectrum of cancer-relevant genes and identifies novel potential therapeutic targets.
- Published
- 2021
- Full Text
- View/download PDF
23. Mako: A Graph-based Pattern Growth Approach to Detect Complex Structural Variants
- Author
-
Lin, Jiadong, Yang, Xiaofei, Kosters, Walter, Xu, Tun, Jia, Yanyan, Wang, Songbo, Zhu, Qihui, Ryan, Mallory, Guo, Li, Zhang, Chengsheng, Gerstein, Mark B., Sanders, Ashley D., Zody, Micheal C., Talkowski, Michael E., Mills, Ryan E., Korbel, Jan O., Marschall, Tobias, Ebert, Peter, Audano, Peter A., Rodriguez-Martin, Bernardo, Porubsky, David, Jan Bonder, Marc, Sulovari, Arvis, Ebler, Jana, Zhou, Weichen, Serra Mari, Rebecca, Yilmaz, Feyza, Zhao, Xuefang, Hsieh, PingHsun, Lee, Joyce, Kumar, Sushant, Rausch, Tobias, Chen, Yu, Chong, Zechen, Munson, Katherine M., Chaisson, Mark J.P., Chen, Junjie, Shi, Xinghua, Wenger, Aaron M., Harvey, William T., Hansenfeld, Patrick, Regier, Allison, Hall, Ira M., Flicek, Paul, Hastie, Alex R., Fairely, Susan, Lee, Charles, Devine, Scott E., Eichler, Evan E., and Ye, Kai
- Abstract
Complex structural variants (CSVs) are genomic alterations that have more than two breakpoints and are considered as the simultaneous occurrence of simple structural variants. However, detecting the compounded mutational signals of CSVs is challenging through a commonly used model-match strategy. As a result, there has been limited progress for CSV discovery compared with simple structural variants. We systematically analyzed the multi-breakpoint connection feature of CSVs, and proposed Mako, utilizing a bottom-up guided model-free strategy, to detect CSVs from paired-end short-read sequencing. Specifically, we implemented a graph-based pattern growth approach, where the graph depicts potential breakpoint connections, and pattern growth enables CSV detection without pre-defined models. Comprehensive evaluations on both simulated and real datasets revealed that Mako outperformed other algorithms. Notably, validation rates of CSV on real data based on experimental and computational validations as well as manual inspections are around 70%, where the medians of experimental and computational breakpoint shift are 13 bp and 26 bp, respectively. Moreover, the Mako CSV subgraph effectively characterized the breakpoint connections of a CSV event and uncovered a total of 15 CSV types, including two novel types of adjacent segments swap and tandem dispersed duplication. Further analysis of these CSVs also revealed the impact of sequence homology in the formation of CSVs. Mako is publicly available at https://github.com/xjtu-omics/Mako.
- Published
- 2021
- Full Text
- View/download PDF
24. SVFX: a machine learning framework to quantify the pathogenicity of structural variants
- Author
-
Kumar, Sushant, Harmanci, Arif, Vytheeswaran, Jagath, and Gerstein, Mark B.
- Abstract
There is a lack of approaches for identifying pathogenic genomic structural variants (SVs) although they play a crucial role in many diseases. We present a mechanism-agnostic machine learning-based workflow, called SVFX, to assign pathogenicity scores to somatic and germline SVs. In particular, we generate somatic and germline training models, which include genomic, epigenomic, and conservation-based features, for SV call sets in diseased and healthy individuals. We then apply SVFX to SVs in cancer and other diseases; SVFX achieves high accuracy in identifying pathogenic SVs. Predicted pathogenic SVs in cancer cohorts are enriched among known cancer genes and many cancer-related pathways.
- Published
- 2020
- Full Text
- View/download PDF
25. Dermal Adipocyte Lipolysis and Myofibroblast Conversion Are Required for Efficient Skin Repair
- Author
-
Shook, Brett A., Wasko, Renee R., Mano, Omer, Rutenberg-Schoenberg, Michael, Rudolph, Michael C., Zirak, Bahar, Rivera-Gonzalez, Guillermo C., López-Giráldez, Francesc, Zarini, Simona, Rezza, Amélie, Clark, Damon A., Rendl, Michael, Rosenblum, Michael D., Gerstein, Mark B., and Horsley, Valerie
- Abstract
Mature adipocytes store fatty acids and are a common component of tissue stroma. Adipocyte function in regulating bone marrow, skin, muscle, and mammary gland biology is emerging, but the role of adipocyte-derived lipids in tissue homeostasis and repair is poorly understood. Here, we identify an essential role for adipocyte lipolysis in regulating inflammation and repair after injury in skin. Genetic mouse studies revealed that dermal adipocytes are necessary to initiate inflammation after injury and promote subsequent repair. We find through histological, ultrastructural, lipidomic, and genetic experiments in mice that adipocytes adjacent to skin injury initiate lipid release necessary for macrophage inflammation. Tamoxifen-inducible genetic lineage tracing of mature adipocytes and single-cell RNA sequencing revealed that dermal adipocytes alter their fate and generate ECM-producing myofibroblasts within wounds. Thus, adipocytes regulate multiple aspects of repair and may be therapeutic for inflammatory diseases and defective wound healing associated with aging and diabetes.
- Published
- 2020
- Full Text
- View/download PDF
26. Gene names can confound most-searched listings
- Author
-
Gerstein, Mark B. and Navarro, Fabio C. P.
- Published
- 2018
- Full Text
- View/download PDF
27. [15] Extrapolating Traditional DNA Microarray Statistics to Tiling and Protein Microarray Technologies.
- Author
-
Royce, Thomas E., Rozowsky, Joel S., Luscombe, Nicholas M., Emanuelsson, Olof, Haiyuan Yu, Xiaowei Zhu, Snyder, Michael, and Gerstein, Mark B.
- Abstract
An abstract of the article "Extrapolating Traditional DNA Microarray Statistics to Tiling and Protein Microarray Technologies," by Thomas E. Royce and colleagues is presented.
- Published
- 2006
- Full Text
- View/download PDF
28. Identification of a Disease-Defining Gene Fusion in Epithelioid Hemangioendothelioma
- Author
-
Tanas, Munir R., Sboner, Andrea, Oliveira, Andre M., Erickson-Johnson, Michele R., Hespelt, Jessica, Hanwright, Philip J., Flanagan, John, Luo, Yuling, Fenwick, Kerry, Natrajan, Rachael, Mitsopoulos, Costas, Zvelebil, Marketa, Hoch, Benjamin L., Weiss, Sharon W., Debiec-Rychter, Maria, Sciot, Raf, West, Rob B., Lazar, Alexander J., Ashworth, Alan, Reis-Filho, Jorge S., Lord, Christopher J., Gerstein, Mark B., Rubin, Mark A., and Rubin, Brian P.
- Abstract
A newly identified gene fusion defines the vascular cancer epithelioid hemangioendothelioma and encodes a chimeric transcription factor.
- Published
- 2011
- Full Text
- View/download PDF
29. Small-World and Random Networks in Contact Maps of Protein Channels
- Author
-
Kotulska, Malgorzata and Gerstein, Mark B.
- Published
- 2011
- Full Text
- View/download PDF
30. Rewiring of Transcriptional Regulatory Networks: Hierarchy, Rather Than Connectivity, Better Reflects the Importance of Regulators
- Author
-
Bhardwaj, Nitin, Kim, Philip M., and Gerstein, Mark B.
- Abstract
Transcriptional regulatory networks appear to have an organization similar to a corporation, with top-level managers having the most influence.
- Published
- 2010
- Full Text
- View/download PDF
31. FusionSeq: a modular framework for finding gene fusions by analyzing paired-end RNA-sequencing data
- Author
-
Sboner, Andrea, Habegger, Lukas, Pflueger, Dorothee, Terry, Stephane, Chen, David Z, Rozowsky, Joel S, Tewari, Ashutosh K, Kitabayashi, Naoki, Moss, Benjamin J, Chee, Mark S, Demichelis, Francesca, Rubin, Mark A, and Gerstein, Mark B
- Abstract
We have developed FusionSeq to identify fusion transcripts from paired-end RNA-sequencing. FusionSeq includes filters to remove spurious candidate fusions with artifacts, such as misalignment or random pairing of transcript fragments, and it ranks candidates according to several statistics. It also has a module to identify exact sequences at breakpoint junctions. FusionSeq detected known and novel fusions in a specially sequenced calibration data set, including eight cancers with and without known rearrangements.
- Published
- 2010
- Full Text
- View/download PDF
32. Deciphering Protein Kinase Specificity Through Large-Scale Analysis of Yeast Phosphorylation Site Motifs
- Author
-
Mok, Janine, Kim, Philip M., Lam, Hugo Y. K., Piccirillo, Stacy, Zhou, Xiuqiong, Jeschke, Grace R., Sheridan, Douglas L., Parker, Sirlester A., Desai, Ved, Jwa, Miri, Cameroni, Elisabetta, Niu, Hengyao, Good, Matthew, Remenyi, Attila, Ma, Jia-Lin Nianhan, Sheu, Yi-Jun, Sassi, Holly E., Sopko, Richelle, Chan, Clarence S. M., De Virgilio, Claudio, Hollingsworth, Nancy M., Lim, Wendell A., Stern, David F., Stillman, Bruce, Andrews, Brenda J., Gerstein, Mark B., Snyder, Michael, and Turk, Benjamin E.
- Abstract
A high-throughput peptide array approach reveals insight into kinase substrates and specificity.
- Published
- 2010
- Full Text
- View/download PDF
33. Understanding Modularity in Molecular Networks Requires Dynamics
- Author
-
Alexander, Roger P., Kim, Philip M., Emonet, Thierry, and Gerstein, Mark B.
- Abstract
Relating structure and dynamics of molecular networks remains very challenging.
- Published
- 2009
- Full Text
- View/download PDF