82 results on '"Yongjun Zhao"'
Search Results
2. A Scalable Strand-Specific Protocol Enabling Full-Length Total RNA Sequencing From Single Cells
- Author
-
Simon Haile, Richard D. Corbett, Veronique G. LeBlanc, Lisa Wei, Stephen Pleasance, Steve Bilobram, Ka Ming Nip, Kirstin Brown, Eva Trinh, Jillian Smith, Diane L. Trinh, Miruna Bala, Eric Chuah, Robin J. N. Coope, Richard A. Moore, Andrew J. Mungall, Karen L. Mungall, Yongjun Zhao, Martin Hirst, Samuel Aparicio, Inanc Birol, Steven J. M. Jones, and Marco A. Marra
- Subjects
full-length ,total RNA ,single-cell ,RNAseq ,cellenONE ,Genetics ,QH426-470 - Abstract
RNA sequencing (RNAseq) has been widely used to generate bulk gene expression measurements collected from pools of cells. Only relatively recently have single-cell RNAseq (scRNAseq) methods provided opportunities for gene expression analyses at the single-cell level, allowing researchers to study heterogeneous mixtures of cells at unprecedented resolution. Tumors tend to be composed of heterogeneous cellular mixtures and are frequently the subjects of such analyses. Extensive method developments have led to several protocols for scRNAseq but, owing to the small amounts of RNA in single cells, technical constraints have required compromises. For example, the majority of scRNAseq methods are limited to sequencing only the 3′ or 5′ termini of transcripts. Other protocols that facilitate full-length transcript profiling tend to capture only polyadenylated mRNAs and are generally limited to processing only 96 cells at a time. Here, we address these limitations and present a novel protocol that allows for the high-throughput sequencing of full-length, total RNA at single-cell resolution. We demonstrate that our method produced strand-specific sequencing data for both polyadenylated and non-polyadenylated transcripts, enabled the profiling of transcript regions beyond only transcript termini, and yielded data rich enough to allow identification of cell types from heterogeneous biological samples.
- Published
- 2021
3. Increasing quality, throughput and speed of sample preparation for strand-specific messenger RNA sequencing
- Author
-
Simon Haile, Richard D. Corbett, Tina MacLeod, Steve Bilobram, Duane Smailus, Philip Tsao, Heather Kirk, Helen McDonald, Pawan Pandoh, Miruna Bala, Martin Hirst, Diane Miller, Richard A. Moore, Andrew J. Mungall, Jacquie Schein, Robin J. Coope, Yussanne Ma, Yongjun Zhao, Rob A. Holt, Steven J. Jones, and Marco A. Marra
- Subjects
Ampure XP magnetic beads ,Next-generation sequencing ,Library construction ,Strand-specific ,dUTP ,Uracil DNA N-Glycosylase ,Biotechnology ,TP248.13-248.65 ,Genetics ,QH426-470 - Abstract
Background: RNA-Sequencing (RNA-seq) is now commonly used to reveal quantitative spatiotemporal snapshots of the transcriptome, the structures of transcripts (splice variants and fusions) and landscapes of expressed mutations. However, standard approaches for library construction typically require relatively high amounts of input RNA, are labor intensive, and are time consuming. Methods: Here, we report the outcome of a systematic effort to optimize and streamline steps in strand-specific RNA-seq library construction. Results: This work has resulted in the identification of an optimized messenger RNA isolation protocol, a potent reverse transcriptase for cDNA synthesis, and an efficient chemistry and a simplified formulation of library construction reagents. We also present an optimization of bead-based purification and size selection designed to maximize the recovery of cDNA fragments. Conclusions: These developments have allowed us to assemble a rapid high throughput pipeline that produces high quality data from amounts of total RNA as low as 25 ng. While the focus of this study is on RNA-seq sample preparation, some of these developments are also relevant to other next-generation sequencing library types.
- Published
- 2017
4. Spruce giga-genomes: structurally similar yet distinctive with differentially expanding gene families and rapidly evolving genes
- Author
-
Kristina K. Gagalova, René L. Warren, Lauren Coombe, Johnathan Wong, Ka Ming Nip, Macaire Man Saint Yuen, Justin G. A. Whitehill, Jose M. Celedon, Carol Ritland, Greg A. Taylor, Dean Cheng, Patrick Plettner, S. Austin Hammond, Hamid Mohamadi, Yongjun Zhao, Richard A. Moore, Andrew J. Mungall, Brian Boyle, Jérôme Laroche, Joan Cottrell, John J. Mackay, Manuel Lamothe, Sébastien Gérardi, Nathalie Isabel, Nathalie Pavy, Steven J. M. Jones, Joerg Bohlmann, Jean Bousquet, and Inanc Birol
- Subjects
Expressed Sequence Tags ,Tracheophyta ,Multigene Family ,Genetics ,Cell Biology ,Plant Science ,Picea ,Genome, Plant ,Phylogeny - Abstract
Spruces (Picea spp.) are coniferous trees widespread in boreal and mountainous forests of the northern hemisphere, with large economic significance and enormous contributions to global carbon sequestration. Spruces harbor very large genomes with high repetitiveness, hampering their comparative analysis. Here, we present and compare the genomes of four different North American spruces: the genome assemblies for Engelmann spruce (Picea engelmannii) and Sitka spruce (Picea sitchensis) together with improved and more contiguous genome assemblies for white spruce (Picea glauca) and for a naturally occurring introgress of these three species known as interior spruce (P. engelmannii × glauca × sitchensis). The genomes were structurally similar, and a large part of scaffolds could be anchored to a genetic map. The composition of the interior spruce genome indicated asymmetric contributions from the three ancestral genomes. Phylogenetic analysis of the nuclear and organelle genomes revealed a topology indicative of ancient reticulation. Different patterns of expansion of gene families among genomes were observed and related with presumed diversifying ecological adaptations. We identified rapidly evolving genes that harbored high rates of non-synonymous polymorphisms relative to synonymous ones, indicative of positive selection and its hitchhiking effects. These gene sets were mostly distinct between the genomes of ecologically contrasted species, and signatures of convergent balancing selection were detected. Stress and stimulus response was identified as the most frequent function assigned to expanding gene families and rapidly evolving genes. These two aspects of genomic evolution were complementary in their contribution to divergent evolution of presumed adaptive nature. 
These more contiguous spruce giga-genome sequences should strengthen our understanding of conifer genome structure and evolution, as their comparison offers clues into the genetic basis of adaptation and ecology of conifers at the genomic level. They will also provide tools to better monitor natural genetic diversity and improve the management of conifer forests. The genomes of four closely related North American spruces indicate that their high similarity at the morphological level is paralleled by the high conservation of their physical genome structure. Yet, the evidence of divergent evolution is apparent in their rapidly evolving genomes, supported by differential expansion of key gene families and large sets of genes under positive selection, largely in relation to stimulus and environmental stress response.
- Published
- 2022
5. Complete Mitochondrial Genome of a Gymnosperm, Sitka Spruce (Picea sitchensis), Indicates a Complex Physical Structure
- Author
-
Stephen Pleasance, Steven J.M. Jones, Yongjun Zhao, Inanc Birol, Tina MacLeod, Heather Kirk, Shaun D. Jackman, Jean Bousquet, René L. Warren, Lauren Coombe, Eva Trinh, Joerg Bohlmann, Pawan Pandoh, and Robin J.N. Coope
- Subjects
AcademicSubjects/SCI01140 ,0106 biological sciences ,Mitochondrial DNA ,gymnosperms ,organelle ,Sitka spruce ,Sequence assembly ,Biology ,01 natural sciences ,Genome ,ABySS ,03 medical and health sciences ,Gymnosperm ,Genetics ,Picea ,Ecology, Evolution, Behavior and Systematics ,030304 developmental biology ,0303 health sciences ,Molecular Structure ,Contig ,fungi ,AcademicSubjects/SCI01130 ,sequencing ,biology.organism_classification ,Genome Report ,Multipartite ,Evolutionary biology ,Minion ,Genome, Mitochondrial ,genome assembly ,Nanopore sequencing ,Genome, Plant ,010606 plant biology & botany - Abstract
Plant mitochondrial genomes vary widely in size. Although many plant mitochondrial genomes have been sequenced and assembled, the vast majority are of angiosperms, and few are of gymnosperms. Most plant mitochondrial genomes are smaller than a megabase, with a few notable exceptions. We have sequenced and assembled the complete 5.5-Mb mitochondrial genome of Sitka spruce (Picea sitchensis), to date, one of the largest mitochondrial genomes of a gymnosperm. We sequenced the whole genome using Oxford Nanopore MinION, and then identified contigs of mitochondrial origin assembled from these long reads based on sequence homology to the white spruce mitochondrial genome. The assembly graph shows a multipartite genome structure, composed of one smaller 168-kb circular segment of DNA, and a larger 5.4-Mb single component with a branching structure. The assembly graph gives insight into a putative complex physical genome structure, and its branching points may represent active sites of recombination.
- Published
- 2020
6. Complete Chloroplast Genome Sequence of a Black Spruce (Picea mariana) from Eastern Canada
- Author
-
Pawan Pandoh, Ashley M. Thomson, Jean Bousquet, Heather Kirk, Andrew J. Mungall, Lauren Coombe, Richard A. Moore, Carol Ritland, Joerg Bohlmann, René L. Warren, Diana Lin, Inanc Birol, Theodora Lo, Steven J.M. Jones, and Yongjun Zhao
- Subjects
Whole genome sequencing ,0303 health sciences ,Future studies ,030302 biochemistry & molecular biology ,Taiga ,Evolutionary change ,Biology ,Black spruce ,Chloroplast ,03 medical and health sciences ,Immunology and Microbiology (miscellaneous) ,Botany ,Genetics ,Adaptation ,Molecular Biology ,030304 developmental biology ,Sequence (medicine) - Abstract
Here, we present the chloroplast genome sequence of black spruce (Picea mariana), a conifer widely distributed throughout North American boreal forests. This complete and annotated chloroplast sequence is 123,961 bp long and will contribute to future studies on the genetic basis of evolutionary change in spruce and adaptation in conifers.
- Published
- 2020
7. Complete Chloroplast Genome Sequence of an Engelmann Spruce (Picea engelmannii, Genotype Se404-851) from Western Canada
- Author
-
Lauren Coombe, S. Austin Hammond, Shaun D. Jackman, Carol Ritland, Heather Kirk, Trevor Doerksen, Barry Jaquish, Helen McDonald, Inanc Birol, Jean Bousquet, Richard A. Moore, Steven J.M. Jones, Andrew J. Mungall, Yongjun Zhao, Diana Lin, Kristina K. Gagalova, René L. Warren, Joerg Bohlmann, Pawan Pandoh, Canada's Michael Smith Genome Sciences Centre (CMSGSC), BC Cancer Agency (BCCRC), University of British Columbia (UBC), B.C. Ministry of Forest and Range, Kalamalka Forestry Center, British Columbia Ministry of Forests, Centre Hospitalier Régional Universitaire [Montpellier] (CHRU Montpellier), and Faculté de médecine de l'Université Laval [Québec] (ULaval)
- Subjects
Whole genome sequencing ,0303 health sciences ,biology ,Genomic research ,Engelmann spruce (Picea engelmannii) ,15. Life on land ,biology.organism_classification ,Genome ,Chloroplast ,Chloroplast genome sequence ,[SDV.GEN.GPL]Life Sciences [q-bio]/Genetics/Plants genetics ,03 medical and health sciences ,0302 clinical medicine ,Immunology and Microbiology (miscellaneous) ,Chloroplast DNA ,Picea engelmannii ,Evolutionary biology ,030220 oncology & carcinogenesis ,Genotype ,Genotype Se404-851 ,[SDE]Environmental Sciences ,Genetics ,Molecular Biology ,030304 developmental biology ,Sequence (medicine) - Abstract
Engelmann spruce (Picea engelmannii) is a conifer found primarily on the west coast of North America. Here, we present the complete chloroplast genome sequence of Picea engelmannii genotype Se404-851. This chloroplast sequence will benefit future conifer genomic research and contribute resources to further species conservation efforts.
- Published
- 2019
8. Complete Chloroplast Genome Sequence of a White Spruce (Picea glauca, Genotype WS77111) from Eastern Canada
- Author
-
Carol Ritland, Barry Jaquish, Yongjun Zhao, Jean Bousquet, Shaun D. Jackman, Andrew J. Mungall, Heather Kirk, Joerg Bohlmann, Diana Lin, Kristina K. Gagalova, Lauren Coombe, S. Austin Hammond, Nathalie Isabel, Pawan Pandoh, Steven J.M. Jones, Richard A. Moore, René L. Warren, and Inanc Birol
- Subjects
Whole genome sequencing ,0303 health sciences ,Phylogenetic tree ,030302 biochemistry & molecular biology ,Taiga ,fungi ,Genome Sequences ,Biology ,Genome ,White (mutation) ,03 medical and health sciences ,Immunology and Microbiology (miscellaneous) ,Genus ,Phylogenetics ,Botany ,Genotype ,Genetics ,Molecular Biology ,030304 developmental biology - Abstract
Here, we present the complete chloroplast genome sequence of white spruce (Picea glauca, genotype WS77111), a coniferous tree widespread in the boreal forests of North America. This sequence contributes to genomic and phylogenetic analyses of the Picea genus that are part of ongoing research to understand their adaptation to environmental stress.
- Published
- 2019
9. The Genome of the Steller Sea Lion (Eumetopias jubatus)
- Author
-
Pawan Pandoh, Shaun D. Jackman, Eric Chuah, Tina MacLeod, Kane Tse, Harwood H. Kwan, Steven J.M. Jones, Dean Cheng, Yongjun Zhao, Gregory A. Taylor, Inanc Birol, Heather Kirk, Martin Haulena, Sreeja Leelakumari, Richard D. Moore, Marco A. Marra, David A. S. Rosen, Luka Culibrk, Rebecca Carlsen, Ryan Tan, and Andrew J. Mungall
- Subjects
0301 basic medicine ,lcsh:QH426-470 ,Microfluidics ,Eumetopias jubatus ,Biology ,Genome ,Article ,DNA sequencing ,Nanopores ,03 medical and health sciences ,0302 clinical medicine ,microfluidic partitioning ,Genetics ,Animals ,nanopore ,Sea lion ,Gene ,genome ,Genetics (clinical) ,Steller sea lion ,Whole genome sequencing ,Genomic Library ,Whole Genome Sequencing ,Contig ,Accession number (bioinformatics) ,marine animal ,biology.organism_classification ,Sea Lions ,lcsh:Genetics ,030104 developmental biology ,Evolutionary biology ,030217 neurology & neurosurgery - Abstract
The Steller sea lion is the largest member of the Otariidae family and is found in the coastal waters of the northern Pacific Rim. Here, we present the Steller sea lion genome, determined through DNA sequencing approaches that utilized microfluidic partitioning library construction, as well as nanopore technologies. These methods constructed a highly contiguous assembly with a scaffold N50 length of over 14 megabases, a contig N50 length of over 242 kilobases and a total length of 2.404 gigabases. As a measure of completeness, 95.1% of 4104 highly conserved mammalian genes were found to be complete within the assembly. Further annotation identified 19,668 protein coding genes. The assembled genome sequence and underlying sequence data can be found at the National Center for Biotechnology Information (NCBI) under the BioProject accession number PRJNA475770.
- Published
- 2019
10. The Genome of the Beluga Whale (Delphinapterus leucas)
- Author
-
Kristina M. Miller, Armelle Troussard, Steven Bilobram, Amy M. Chan, Inanc Birol, Yussanne Ma, René L. Warren, Pawan Pandoh, Kane Tse, Caleb Choo, Irene Li, Adrian Ally, Daniel Paulino, Gideon J. Mordecai, Curtis A. Suttle, Steven J.M. Jones, Andrew J. Mungall, Heather Kirk, Richard A. Moore, Yongjun Zhao, Noreen Dhalla, Martin Haulena, S. Austin Hammond, Samantha J. Jones, Dorothy Cheung, Gregory A. Taylor, Karen Mungall, Simon K. Chan, Marco A. Marra, Angela D. Schulze, Angela K Y Tam, and Robin Coope
- Subjects
0301 basic medicine ,genome ,genome assembly ,beluga whale ,Delphinapterus leucas ,Cetacea ,lcsh:QH426-470 ,Sequence assembly ,Genome ,DNA sequencing ,Article ,Transcriptome ,03 medical and health sciences ,0302 clinical medicine ,Genetics ,Genetics (clinical) ,Leucas ,biology ,Accession number (bioinformatics) ,biology.organism_classification ,lcsh:Genetics ,030104 developmental biology ,Evolutionary biology ,Beluga Whale ,030217 neurology & neurosurgery - Abstract
The beluga whale is a cetacean that inhabits arctic and subarctic regions, and is the only living member of the genus Delphinapterus. The genome of the beluga whale was determined using DNA sequencing approaches that employed both microfluidic partitioning library and non-partitioned library construction. The former allowed for the construction of a highly contiguous assembly with a scaffold N50 length of over 19 Mbp and total reconstruction of 2.32 Gbp. To aid our understanding of the functional elements, transcriptome data was also derived from brain, duodenum, heart, lung, spleen, and liver tissue. Assembled sequence and all of the underlying sequence data are available at the National Center for Biotechnology Information (NCBI) under the Bioproject accession number PRJNA360851A.
- Published
- 2019
11. Amyloid β and tau are involved in sleep disorder in Alzheimer's disease by orexin A and adenosine A(1) receptor
- Author
-
Fumin Wang, Zhenhua Liu, Yongjun Zhao, Xiaoting Wang, and Minglu Tang
- Subjects
0301 basic medicine ,Male ,Sleep Wake Disorders ,medicine.medical_specialty ,Sleep, REM ,Mice, Transgenic ,tau Proteins ,03 medical and health sciences ,Orexin-A ,Adenosine A1 receptor ,0302 clinical medicine ,Downregulation and upregulation ,Alzheimer Disease ,Memory ,Internal medicine ,Cell Line, Tumor ,mental disorders ,Genetics ,medicine ,Animals ,Humans ,RNA, Messenger ,Phosphorylation ,RNA, Small Interfering ,Wakefulness ,Receptor ,Gene knockdown ,Sleep disorder ,Orexins ,Amyloid beta-Peptides ,Behavior, Animal ,Chemistry ,Receptor, Adenosine A1 ,Brain ,General Medicine ,medicine.disease ,Adenosine ,Peptide Fragments ,Orexin ,Up-Regulation ,Disease Models, Animal ,030104 developmental biology ,Endocrinology ,030217 neurology & neurosurgery ,medicine.drug - Abstract
Sleep disorder is confirmed as a core component of Alzheimer's disease (AD), while the accumulation of amyloid β (Aβ) in brain tissue is an important pathological feature of AD. However, how Aβ affects AD‑associated sleep disorder is not yet well understood. In the present study, experiments on animal and cell models were performed to detect the association between sleep disorder and Aβ. It was observed that Aβ25‑35 administration significantly decreased non‑rapid eye movement sleep, while it increased wakefulness in mice. In addition, reverse transcription‑quantitative polymerase chain reaction and western blot analysis revealed that the expression levels of tau, p‑tau, orexin A and adenosine A1 receptor (A1R) were markedly upregulated in the brain tissue of AD mice compared with that in samples obtained from control mice. Furthermore, the in vitro study revealed that the expression levels of tau, p‑tau, orexin A and adenosine A1R were also significantly increased in human neuroblastoma SH‑SY5Y cells treated with Aβ25‑35 as compared with the control cells. In addition, the tau inhibitor TRx 0237 significantly reversed the promoting effects of Aβ25‑35 on tau, p‑tau, orexin A and adenosine A1R expression levels, and adenosine A1R or orexin A knockdown also inhibited tau and p‑tau expression levels mediated by Aβ25‑35 in AD. These results indicate that Aβ and tau may be considered as novel biomarkers of sleep disorder in AD pathology, and that they function by regulating the expression levels of orexin A and adenosine A1R.
- Published
- 2018
12. The Genome of the Northern Sea Otter (Enhydra lutris kenyoni)
- Author
-
Yongjun Zhao, Richard A. Moore, Angela K Y Tam, Martin Haulena, Simon K. Chan, René L. Warren, Yussanne Ma, S. Austin Hammond, Marco A. Marra, Gregory A. Taylor, Noreen Dhalla, Inanc Birol, Armelle Troussard, Robin Coope, Heather Kirk, Samantha J. Jones, Andrew J. Mungall, Steven Bilobram, Steven J.M. Jones, Adrian Ally, Daniel Paulino, Karen Mungall, Pawan Pandoh, and Caleb Choo
- Subjects
0106 biological sciences ,0301 basic medicine ,lcsh:QH426-470 ,Mustelidae ,Sequence assembly ,010603 evolutionary biology ,01 natural sciences ,Genome ,Article ,Otter ,DNA sequencing ,03 medical and health sciences ,genome ,genome assembly ,northern sea otter ,Enhydra lutris kenyoni ,biology.animal ,parasitic diseases ,Genetics ,Genetics (clinical) ,Whole genome sequencing ,biology ,Enhydra lutris ,Accession number (bioinformatics) ,biology.organism_classification ,lcsh:Genetics ,030104 developmental biology ,Evolutionary biology - Abstract
The northern sea otter inhabits coastal waters of the northern Pacific Ocean and is the largest member of the Mustelidae family. DNA sequencing methods that utilize microfluidic partitioned and non-partitioned library construction were used to establish the sea otter genome. The final assembly provided 2.426 Gbp of highly contiguous assembled genomic sequences with a scaffold N50 length of over 38 Mbp. We generated transcriptome data derived from a lymphoma to aid in the determination of functional elements. The assembled genome sequence and underlying sequence data are available at the National Center for Biotechnology Information (NCBI) under the BioProject accession number PRJNA388419.
- Published
- 2017
13. The North American bullfrog draft genome provides insight into hormonal regulation of long noncoding RNA
- Author
-
Erdi Kucuk, Sara Ohora, Branden V. Walle, Yongjun Zhao, Heather Kirk, René L. Warren, Robert A. Holt, Pawan Pandoh, Caren C. Helbing, Stephen Pleasance, Inanc Birol, Ewan A. Gibb, Martin Jones, Benjamin P. Vandervalk, Andrew J. Mungall, Jessica M. Round, Hamza Khan, S. Austin Hammond, Richard A. Moore, Nik Veldhoen, and Robin Coope
- Subjects
0301 basic medicine ,Amphibian ,Male ,Thyroid Hormones ,Science ,Population ,General Physics and Astronomy ,Genome ,General Biochemistry, Genetics and Molecular Biology ,Article ,03 medical and health sciences ,Bullfrog ,biology.animal ,Animals ,education ,lcsh:Science ,Gene ,Phylogeny ,Genetics ,True frogs ,education.field_of_study ,Multidisciplinary ,Rana catesbeiana ,biology ,Lithobates ,Computational Biology ,Molecular Sequence Annotation ,General Chemistry ,biology.organism_classification ,030104 developmental biology ,Threatened species ,Genome, Mitochondrial ,North America ,lcsh:Q ,RNA, Long Noncoding - Abstract
Frogs play important ecological roles, and several species are important model organisms for scientific research. The globally distributed Ranidae (true frogs) are the largest frog family, and have substantial evolutionary distance from the model laboratory Xenopus frog species. Unfortunately, there are currently no genomic resources for the former, important group of amphibians. More widely applicable amphibian genomic data is urgently needed as more than two-thirds of known species are currently threatened or are undergoing population declines. We report a 5.8 Gbp (NG50 = 69 kbp) genome assembly of a representative North American bullfrog (Rana [Lithobates] catesbeiana). The genome contains over 22,000 predicted protein-coding genes and 6,223 candidate long noncoding RNAs (lncRNAs). RNA-Seq experiments show thyroid hormone causes widespread transcriptional change among protein-coding and putative lncRNA genes. This initial bullfrog draft genome will serve as a key resource with broad utility including amphibian research, developmental biology, and environmental research.
The globally-distributed Ranidae (true frogs) are the largest frog family. Here, Hammond et al. present a draft genome of the North American bullfrog, Rana (Lithobates) catesbeiana, as a foundation for future understanding of true frog genetics as amphibian species face difficult environmental challenges.
- Published
- 2017
14. Automated high throughput nucleic acid purification from formalin-fixed paraffin-embedded tissue samples for next generation sequence analysis
- Author
-
Robin Coope, Robert A. Holt, Pawan Pandoh, Andrew J. Mungall, Simon Haile, Marco A. Marra, Martin Jones, Martin Hirst, David W. Scott, Steve Bilobram, Philip Tsao, Christian Steidl, Helen McDonald, Richard Corbett, Yongjun Zhao, Yussanne Ma, Duane E Smailus, Tina MacLeod, Richard A. Moore, Miruna Bala, Diane Miller, Heather Kirk, Steven J.M. Jones, and Denise Brooks
- Subjects
0301 basic medicine ,cDNA libraries ,Tissue Fixation ,Computer science ,Molecular biology ,lcsh:Medicine ,medicine.disease_cause ,Biochemistry ,chemistry.chemical_compound ,Automation ,Database and Informatics Methods ,0302 clinical medicine ,Sequencing techniques ,DNA library construction ,DNA libraries ,lcsh:Science ,Throughput (business) ,Multidisciplinary ,Paraffin Embedding ,Nucleic acid methods ,High-Throughput Nucleotide Sequencing ,RNA sequencing ,Complementary DNA ,Genomic Library Construction ,3. Good health ,Nucleic acids ,Ribosomal RNA ,030220 oncology & carcinogenesis ,RNA extraction ,Sequence Analysis ,Research Article ,Cell biology ,Cellular structures and organelles ,Formalin fixed paraffin embedded ,Forms of DNA ,Bioinformatics ,Sequence alignment ,Computational biology ,DNA construction ,DNA sequencing ,03 medical and health sciences ,Extraction techniques ,Formaldehyde ,microRNA ,medicine ,Genetics ,Non-coding RNA ,Biology and life sciences ,cDNA library ,lcsh:R ,RNA ,DNA ,Gene regulation ,Research and analysis methods ,MicroRNAs ,030104 developmental biology ,Molecular biology techniques ,chemistry ,Nucleic acid ,lcsh:Q ,Gene expression ,Carcinogenesis ,Ribosomes ,Sequence Alignment - Abstract
Curation and storage of formalin-fixed, paraffin-embedded (FFPE) samples are standard procedures in hospital pathology laboratories around the world. Many thousands of such samples exist and could be used for next generation sequencing analysis. Retrospective analyses of such samples are important for identifying molecular correlates of carcinogenesis, treatment history and disease outcomes. Two major hurdles in using FFPE material for sequencing are the damaged nature of the nucleic acids and the labor-intensive nature of nucleic acid purification. These limitations and a number of other issues that span multiple steps from nucleic acid purification to library construction are addressed here. We optimized and automated a 96-well magnetic bead-based extraction protocol that can be scaled to large cohorts and is compatible with automation. Using sets of 32 and 91 individual FFPE samples respectively, we generated libraries from 100 ng of total RNA and DNA starting amounts with 95-100% success rate. The use of the resulting RNA in micro-RNA sequencing was also demonstrated. In addition to offering the potential of scalability and rapid throughput, the yield obtained with lower input requirements makes these methods applicable to clinical samples where tissue abundance is limiting.
- Published
- 2017
15. Clonal Analysis via Barcoding Reveals Diverse Growth and Differentiation of Transplanted Mouse and Human Mammary Stem Cells
- Author
-
Melanie D. Kardel, Michelle Moksa, Peter Eirew, Samuel Aparicio, Tomo Osako, Nagarajan Kannan, Kane Tse, Pawan Pandoh, Maisam Makarem, Long V. Nguyen, Annaick Carles, William Kennedy, Alice M.S. Cheung, R. Keith Humphries, Martin Hirst, Connie J. Eaves, Thomas Zeng, and Yongjun Zhao
- Subjects
Lineage (genetic) ,Cell Culture Techniques ,Biology ,Clonal analysis ,Mice ,Basal (phylogenetics) ,Mammary Glands, Animal ,Genetics ,Animals ,Humans ,Regeneration ,Cell Lineage ,Mammary Glands, Human ,Cell Proliferation ,Cell Size ,Stem Cells ,High-Throughput Nucleotide Sequencing ,Cell Differentiation ,Epithelial Cells ,Cell Biology ,Clone Cells ,Cell biology ,Immunology ,Molecular Medicine ,Female ,Stem cell ,Stem Cell Transplantation - Abstract
Summary: Cellular barcoding offers a powerful approach to characterize the growth and differentiation activity of large numbers of cotransplanted stem cells. Here, we describe a lentiviral genomic-barcoding and analysis strategy and its use to compare the clonal outputs of transplants of purified mouse and human basal mammary epithelial cells. We found that both sources of transplanted cells produced many bilineage mammary epithelial clones in primary recipients, although primary clones containing only one detectable mammary lineage were also common. Interestingly, regardless of the species of origin, many clones evident in secondary recipients were not detected in the primary hosts, and others changed from appearing luminal-restricted to appearing bilineage. This barcoding methodology has thus revealed conservation between mice and humans of a previously unknown diversity in the growth and differentiation activities of their basal mammary epithelial cells stimulated to grow in transplanted hosts.
- Published
- 2014
16. Evaluation of protocols for rRNA depletion-based RNA sequencing of nanogram inputs of mammalian total RNA
- Author
-
Marco A. Marra, Tina MacLeod, Richard A. Moore, Richard Corbett, Pawan Pandoh, Yongjun Zhao, Ryan D. Morin, Bruno M. Grande, Miruna Bala, Heather Kirk, Steve Bilobram, Andrew J. Mungall, Steven J.M. Jones, Robin J.N. Coope, Karen Mungall, Helen McDonald, and Simon Haile
- Subjects
cDNA libraries ,Tissue Fixation ,Hydrolases ,Molecular biology ,Biochemistry ,Database and Informatics Methods ,Sequencing techniques ,0302 clinical medicine ,DNA libraries ,Energy-Producing Organelles ,Mammals ,0303 health sciences ,Multidisciplinary ,Messenger RNA ,High-Throughput Nucleotide Sequencing ,RNA sequencing ,Complementary DNA ,Enzymes ,Mitochondria ,Nucleic acids ,Ribosomal RNA ,RNA splicing ,Medicine ,Sequence Analysis ,Research Article ,Cell biology ,Cellular structures and organelles ,Nucleases ,Forms of DNA ,Bioinformatics ,Sequence analysis ,Science ,Sequence alignment ,Computational biology ,Bioenergetics ,Biology ,03 medical and health sciences ,Ribonucleases ,Extraction techniques ,DNA-binding proteins ,Genetics ,Animals ,Humans ,RNA, Messenger ,Nucleic acid structure ,Non-coding RNA ,030304 developmental biology ,Biology and life sciences ,Base Sequence ,Sequence Analysis, RNA ,cDNA library ,Gene Expression Profiling ,Proteins ,RNA ,DNA ,RNA extraction ,Research and analysis methods ,Molecular biology techniques ,RNA, Ribosomal ,Enzymology ,Transcriptome ,Ribosomes ,Sequence Alignment ,030217 neurology & neurosurgery - Abstract
Next generation RNA-sequencing (RNA-seq) is a flexible approach that can be applied to a range of applications including global quantification of transcript expression, the characterization of RNA structure such as splicing patterns and profiling of expressed mutations. Many RNA-seq protocols require up to microgram levels of total RNA input amounts to generate high quality data, and thus remain impractical for the limited starting material amounts typically obtained from rare cell populations, such as those from early developmental stages or from laser micro-dissected clinical samples. Here, we present an assessment of the contemporary ribosomal RNA depletion-based protocols, and identify those that are suitable for inputs as low as 1-10 ng of intact total RNA and 100-500 ng of partially degraded RNA from formalin-fixed paraffin-embedded tissues.
- Published
- 2019
17. The Genome of the North American Brown Bear or Grizzly: Ursus arctos ssp. horribilis
- Author
-
Dean Cheng, Lauren Coombe, Marco A. Marra, Pawan Pandoh, Shaun D. Jackman, Gregory A. Taylor, Andrew J. Mungall, Eric Chuah, Maria Franke, Heather Kirk, Kane Tse, Justin Chu, Christopher J. Dutton, Rebecca Carlsen, Steven J.M. Jones, Inanc Birol, Richard A. Moore, and Yongjun Zhao
- Subjects
0301 basic medicine ,lcsh:QH426-470 ,Population ,Ursus arctos ssp. horribilis ,030105 genetics & heredity ,Genome ,Article ,03 medical and health sciences ,Comparable size ,microfluidic partitioning ,Genetics ,nanopore ,Ursus ,education ,genome ,Gene ,Genetics (clinical) ,Protein coding ,Whole genome sequencing ,education.field_of_study ,biology ,grizzly bear ,biology.organism_classification ,lcsh:Genetics ,030104 developmental biology ,Evolutionary biology ,Ursus arctos ssp. Horribilis - Abstract
The grizzly bear (Ursus arctos ssp. horribilis) represents the largest population of brown bears in North America. Its genome was sequenced using a microfluidic partitioning library construction technique, and these data were supplemented with sequencing from a nanopore-based long read platform. The final assembly was 2.33 Gb with a scaffold N50 of 36.7 Mb, and the genome is of comparable size to that of its close relative the polar bear (2.30 Gb). An analysis using 4104 highly conserved mammalian genes indicated that 96.1% were found to be complete within the assembly. An automated annotation of the genome identified 19,848 protein coding genes. Our study shows that the combination of the two sequencing modalities that we used is sufficient for the construction of highly contiguous reference quality mammalian genomes. The assembled genome sequence and the supporting raw sequence reads are available from the NCBI (National Center for Biotechnology Information) under the bioproject identifier PRJNA493656, and the assembly described in this paper is version QXTK01000000.
- Published
- 2018
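Note on the scaffold N50 quoted in the abstract above: N50 is conventionally the length L such that scaffolds of length L or longer together account for at least half of the total assembly. The sketch below illustrates that definition only. The function name `n50` and the toy scaffold lengths are assumptions made for illustration and are not data from the grizzly bear assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    # Walk from the longest scaffold downwards, accumulating length.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running sum to half of the
        # assembly size defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (total length 300, so half is 150).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # 80 + 70 = 150 reaches the halfway point, so prints 70
```

A higher N50 for a similar total assembly size generally indicates a more contiguous assembly, which is why the statistic is reported alongside the 2.33 Gb total.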
18. Sources of erroneous sequences and artifact chimeric reads in next generation sequencing of genomic DNA from formalin-fixed paraffin-embedded samples
- Author
-
Karen Novik, Richard Corbett, Simon Haile, Morgan H. Bye, Miruna Bala, Yussanne Ma, Steven J.M. Jones, Diane Miller, Eva Trinh, Pawan Pandoh, Helen McDonald, Tina MacLeod, Richard A. Moore, Heather Kirk, Yongjun Zhao, Andrew J. Mungall, Robin J.N. Coope, Steve Bilobram, Robert A. Holt, and Marco A. Marra
- Subjects
Hot Temperature ,Sequence analysis ,Genomics ,Artifact (software development) ,Computational biology ,Biology ,Genome ,DNA sequencing ,Fixatives ,03 medical and health sciences ,0302 clinical medicine ,Formaldehyde ,Genetics ,Animals ,Genomic library ,030304 developmental biology ,Genomic Library ,0303 health sciences ,Paraffin Embedding ,High-Throughput Nucleotide Sequencing ,Sequence Analysis, DNA ,Mice, Inbred C57BL ,genomic DNA ,Methods Online ,Artifacts ,030217 neurology & neurosurgery ,Reference genome - Abstract
Tissues used in pathology laboratories are typically stored in the form of formalin-fixed, paraffin-embedded (FFPE) samples. One important consideration in repurposing FFPE material for next generation sequencing (NGS) analysis is the sequencing artifacts that can arise from the significant damage to nucleic acids due to treatment with formalin, storage at room temperature and extraction. One such class of artifacts consists of chimeric reads that appear to be derived from non-contiguous portions of the genome. Here, we show that a major proportion of such chimeric reads align to both the ‘Watson’ and ‘Crick’ strands of the reference genome. We refer to these as strand-split artifact reads (SSARs). This study provides a conceptual framework for the mechanistic basis of the genesis of SSARs and other chimeric artifacts along with supporting experimental evidence, which have led to approaches to reduce the levels of such artifacts. We demonstrate that one of these approaches, involving S1 nuclease-mediated removal of single-stranded fragments and overhangs, also reduces sequence bias, base error rates, and false positive detection of copy number and single nucleotide variants. Finally, we describe an analytical approach for quantifying SSARs from NGS data.
- Published
- 2018
19. The shaping and functional consequences of the microRNA landscape in breast cancer
- Author
-
Carlos Caldas, Andrew R. Green, Elena Provenzano, Suet-Feung Chin, Eric A. Miska, Yongjun Zhao, Heidi Dvinge, Anna Git, Ian O. Ellis, Andrea Sottoriva, Gulisa Turashvili, Martin Hirst, Christina Curtis, Stefan Gräf, Javier Armisen, Samuel Aparicio, and Mali Salmon-Divon
- Subjects
DNA Copy Number Variations ,Population ,Breast Neoplasms ,Kaplan-Meier Estimate ,Computational biology ,Biology ,Genome ,Breast cancer ,microRNA ,medicine ,Humans ,RNA, Messenger ,RNA, Neoplasm ,education ,Proportional Hazards Models ,Regulation of gene expression ,Genetics ,education.field_of_study ,Multidisciplinary ,Genome, Human ,Gene Expression Profiling ,Cancer ,Prognosis ,medicine.disease ,Human genetics ,Gene Expression Regulation, Neoplastic ,Gene expression profiling ,MicroRNAs ,Female ,Algorithms ,Follow-Up Studies - Abstract
MicroRNAs (miRNAs) show differential expression across breast cancer subtypes, and have both oncogenic and tumour-suppressive roles. Here we report the miRNA expression profiles of 1,302 breast tumours with matching detailed clinical annotation, long-term follow-up and genomic and messenger RNA expression data. This provides a comprehensive overview of the quantity, distribution and variation of the miRNA population and provides information on the extent to which genomic, transcriptional and post-transcriptional events contribute to miRNA expression architecture, suggesting an important role for post-transcriptional regulation. The key clinical parameters and cellular pathways related to the miRNA landscape are characterized, revealing context-dependent interactions, for example with regards to cell adhesion and Wnt signalling. Notably, only prognostic miRNA signatures derived from breast tumours devoid of somatic copy-number aberrations (CNA-devoid) are consistently prognostic across several other subtypes and can be validated in external cohorts. We then use a data-driven approach to seek the effects of miRNAs associated with differential co-expression of mRNAs, and find that miRNAs act as modulators of mRNA-mRNA interactions rather than as on-off molecular switches. We demonstrate such an important modulatory role for miRNAs in the biology of CNA-devoid breast cancers, a common subtype in which the immune response is prominent. These findings represent a new framework for studying the biology of miRNAs in human breast cancer.
- Published
- 2013
20. The genetic landscape of high-risk neuroblastoma
- Author
-
Jenny Q. Qian, Nina Thiessen, Daniela S. Gerhard, Yvonne Moyer, Shahab Asgharzadeh, Inanc Birol, Jun S. Wei, Baljit Kamoh, Marco A. Marra, Gad Getz, Javed Khan, Adam Kiezun, Stacey Gabriel, Angela Tam, Jaime M. Guidry Auvil, Wendy B. London, Lee Lichenstein, Scott L. Carter, Chip Stewart, Jaegil Kim, Malcolm A. Smith, Readman Chiu, Kristina A. Cole, Maura Diamond, Richard Sposto, Aaron McKenna, Martin Hirst, Matthew Meyerson, Allan Lo, Julie M. Gastier-Foster, Martin Krzywinski, Alireza Hadj Khodabakshi, Michael S. Lawrence, Andrew Wood, Steven J.M. Jones, Richard Corbett, Daniel Auclair, Michael D. Hogarty, Trevor J. Pugh, Carrie Sougnez, Lingyun Ji, Shaun D. Jackman, Richard A. Moore, Kristian Cibulskis, Robert C. Seeger, Yongjun Zhao, Megan Hanna, Edward F. Attiyeh, Sharon J. Diskin, Adrian Ally, Yael P. Mosse, Erica Shefler, Andrey Sivachenko, Olena Morozova, John M. Maris, Chandra Sekhar Pedamallu, Alex H. Ramos, Karen Mungall, Thomas C. Badgett, Eric S. Lander, Massachusetts Institute of Technology. Department of Biology, and Lander, Eric S.
- Subjects
Neuroblastoma RAS viral oncogene homolog ,Biology ,medicine.disease_cause ,Polymorphism, Single Nucleotide ,Neuroblastoma ,03 medical and health sciences ,0302 clinical medicine ,Germline mutation ,Cell Line, Tumor ,Genetics ,medicine ,Humans ,Exome ,Genetic Predisposition to Disease ,Mutation frequency ,ATRX ,030304 developmental biology ,0303 health sciences ,Mutation ,Genome, Human ,Sequence Analysis, DNA ,medicine.disease ,3. Good health ,PTPN11 ,030220 oncology & carcinogenesis ,Cancer research ,Transcriptome - Abstract
Neuroblastoma is a malignancy of the developing sympathetic nervous system that often presents with widespread metastatic disease, resulting in survival rates of less than 50%. To determine the spectrum of somatic mutation in high-risk neuroblastoma, we studied 240 affected individuals (cases) using a combination of whole-exome, genome and transcriptome sequencing as part of the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) initiative. Here we report a low median exonic mutation frequency of 0.60 per Mb (0.48 nonsilent) and notably few recurrently mutated genes in these tumors. Genes with significant somatic mutation frequencies included ALK (9.2% of cases), PTPN11 (2.9%), ATRX (2.5%, and an additional 7.1% had focal deletions), MYCN (1.7%, causing a recurrent p.Pro44Leu alteration) and NRAS (0.83%). Rare, potentially pathogenic germline variants were significantly enriched in ALK, CHEK2, PINK1 and BARD1. The relative paucity of recurrent somatic mutations in neuroblastoma challenges current therapeutic strategies that rely on frequently altered oncogenic drivers. Funding: National Human Genome Research Institute (U.S.) (Grant U54HG003067), National Cancer Institute (U.S.) (Contract HHSN261200800001E)
- Published
- 2013
21. Poly-gene fusion transcripts and chromothripsis in prostate cancer
- Author
-
Martin E. Gleave, Robert Shukin, Brian McConeghy, Alexander W. Wyatt, Colin Collins, S. Cenk Sahinalp, Andrew McPherson, Yongjun Zhao, Steven J.M. Jones, Fan Mo, Chunxiao Wu, Yuzhuo Wang, Anna Lapuk, Marco A. Marra, Dong Lin, and Stanislav Volik
- Subjects
Chromosome Aberrations ,Male ,Genetics ,Cancer Research ,Mutation ,Chromothripsis ,Prostatic Neoplasms ,Cancer ,Chromoplexy ,Biology ,medicine.disease ,medicine.disease_cause ,Genome ,Fusion gene ,Fusion transcript ,Cell Line, Tumor ,medicine ,Humans ,Gene Fusion ,Gene - Abstract
Complex genome rearrangements are frequently observed in cancer but their impact on tumor molecular biology is largely unknown. Recent studies have identified a new phenomenon involving the simultaneous generation of tens to hundreds of genomic rearrangements, called chromothripsis. To understand the molecular consequences of these events, we sequenced the genomes and transcriptomes of two prostate tumors exhibiting evidence of chromothripsis. We identified several complex fusion transcripts, each containing sequence from three different genes, originating from different parts of the genome. One such poly-gene fusion transcript appeared to be expressed from a chain of small genomic fragments. Furthermore, we detected poly-gene fusion transcripts in the prostate cancer cell line LNCaP, suggesting they may represent a common phenomenon. Finally in one tumor with chromothripsis, we identified multiple mutations in the p53 signaling pathway, expanding on recent work associating aberrant DNA damage response mechanisms with chromothripsis. Overall, our data show that chromothripsis can manifest as massively rearranged transcriptomes. The implication that multigenic changes can give rise to poly-gene fusion transcripts is potentially of great significance to cancer genetics. © 2012 Wiley Periodicals, Inc.
- Published
- 2012
22. Genetic Alterations Activating Kinase and Cytokine Receptor Signaling in High-Risk Acute Lymphoblastic Leukemia
- Author
-
William E. Evans, Jared Becksfort, I-Ming Chen, Chunhua Yan, Lei Wei, Charles G. Mullighan, Cheryl L. Willman, Mignon L. Loh, James R. Downing, Michelle L. Churchman, Karen Mungall, Marco A. Marra, Ross L. Levine, Xiaoping Su, Neil P. Shah, Jinghui Zhang, Richard C. Harvey, Yongjun Zhao, Ching-Hon Pui, Shann-Ching Chen, Shannon L. Maude, William L. Carroll, Kane Tse, Eric Larsen, Steven J.M. Jones, Stephen P. Hunger, Debbie Payne-Turner, Jing Ma, David T. Teachey, Xiang Chen, Stephan A. Grupp, Richard Finney, Guillermo Garcia-Manero, Maria Kleppe, Inanc Birol, Sima Jeha, Ryan D. Morin, Gregory H. Reaman, Malcolm A. Smith, Richard A. Moore, Kenneth E Buetow, Michael N. Edmonson, Corynn Kasap, Ying Hu, Meenakshi Devidas, Daniela S. Gerhard, Kathryn G. Roberts, Steven W. Paugh, and Martin Hirst
- Subjects
Cancer Research ,Oncogene Proteins, Fusion ,Messenger ,DNA Mutational Analysis ,Cell Transformation ,medicine.disease_cause ,Mice ,0302 clinical medicine ,Recurrence ,Risk Factors ,hemic and lymphatic diseases ,Receptors ,2.1 Biological and endogenous factors ,Philadelphia Chromosome ,Phosphorylation ,Aetiology ,Sequence Deletion ,Cancer ,Oncogene Proteins ,Leukemic ,Gene Rearrangement ,Pediatric ,0303 health sciences ,Mutation ,ABL ,Gene Expression Regulation, Leukemic ,Hematology ,Precursor Cell Lymphoblastic Leukemia-Lymphoma ,Protein-Tyrosine Kinases ,Platelet-Derived Growth Factor beta ,3. Good health ,Leukemia ,Cell Transformation, Neoplastic ,Oncology ,030220 oncology & carcinogenesis ,Signal transduction ,Tyrosine kinase ,Receptor ,Biotechnology ,Signal Transduction ,Pediatric Research Initiative ,Childhood Leukemia ,Pediatric Cancer ,Molecular Sequence Data ,Oncology and Carcinogenesis ,PDGFRB ,Biology ,Philadelphia chromosome ,Article ,Receptor, Platelet-Derived Growth Factor beta ,03 medical and health sciences ,Rare Diseases ,Genetics ,medicine ,Animals ,Humans ,Genetic Predisposition to Disease ,RNA, Messenger ,Oncology & Carcinogenesis ,Receptors, Cytokine ,Fusion ,Cytokine ,Protein Kinase Inhibitors ,030304 developmental biology ,Neoplastic ,Base Sequence ,Human Genome ,Neurosciences ,Cell Biology ,medicine.disease ,Molecular biology ,Erythropoietin receptor ,Enzyme Activation ,Orphan Drug ,Gene Expression Regulation ,Trans-Activators ,Cancer research ,RNA - Abstract
Summary: Genomic profiling has identified a subtype of high-risk B-progenitor acute lymphoblastic leukemia (B-ALL) with alteration of IKZF1, a gene expression profile similar to BCR-ABL1-positive ALL and poor outcome (Ph-like ALL). The genetic alterations that activate kinase signaling in Ph-like ALL are poorly understood. We performed transcriptome and whole genome sequencing on 15 cases of Ph-like ALL and identified rearrangements involving ABL1, JAK2, PDGFRB, CRLF2, and EPOR, activating mutations of IL7R and FLT3, and deletion of SH2B3, which encodes the JAK2-negative regulator LNK. Importantly, several of these alterations induce transformation that is attenuated with tyrosine kinase inhibitors, suggesting the treatment outcome of these patients may be improved with targeted therapy.
- Published
- 2012
23. Identification and characterization of Hoxa9 binding sites in hematopoietic cells
- Author
-
Jay L. Hess, Kajal V. Sitwala, Joel Bronstein, Gordon Robertson, Monisha Dandekar, Yongsheng Huang, Martin Hirst, Steven J.M. Jones, James W. MacDonald, Thomas Zeng, Timothee Cezard, Daniel S. Sanders, Cailin Collins, Misha Bilenky, Nina Thiessen, Alfred O. Hero, and Yongjun Zhao
- Subjects
Epigenomics ,Chromatin Immunoprecipitation ,Hematopoiesis and Stem Cells ,Blotting, Western ,Immunology ,Bone Marrow Cells ,Enhancer RNAs ,Biology ,Real-Time Polymerase Chain Reaction ,Biochemistry ,Mice ,Biomarkers, Tumor ,Animals ,MYB ,RNA, Messenger ,Myeloid Ecotropic Viral Integration Site 1 Protein ,Hox gene ,Enhancer ,Transcription factor ,Oligonucleotide Array Sequence Analysis ,Homeodomain Proteins ,Genetics ,Binding Sites ,Leukemia ,Gene Expression Regulation, Leukemic ,Reverse Transcriptase Polymerase Chain Reaction ,Gene Expression Profiling ,Acetylation ,Cell Biology ,Hematology ,Hematopoiesis ,Neoplasm Proteins ,Mice, Inbred C57BL ,Enhancer Elements, Genetic ,Histone ,biology.protein ,Female ,Chromatin immunoprecipitation ,Transcription Factors - Abstract
The clustered homeobox proteins play crucial roles in development, hematopoiesis, and leukemia, yet the targets they regulate and their mechanisms of action are poorly understood. Here, we identified the binding sites for Hoxa9 and the Hox cofactor Meis1 on a genome-wide level and profiled their associated epigenetic modifications and transcriptional targets. Hoxa9 and the Hox cofactor Meis1 cobind at hundreds of highly evolutionarily conserved sites, most of which are distant from transcription start sites. These sites show high levels of histone H3K4 monomethylation and CBP/P300 binding characteristic of enhancers. Furthermore, a subset of these sites shows enhancer activity in transient transfection assays. Many Hoxa9 and Meis1 binding sites are also bound by PU.1 and other lineage-restricted transcription factors previously implicated in establishment of myeloid enhancers. Conditional Hoxa9 activation is associated with CBP/P300 recruitment, histone acetylation, and transcriptional activation of a network of proto-oncogenes, including Erg, Flt3, Lmo2, Myb, and Sox4. Collectively, this work suggests that Hoxa9 regulates transcription by interacting with enhancers of genes important for hematopoiesis and leukemia.
- Published
- 2012
24. The genomic and transcriptomic landscape of anaplastic thyroid cancer: implications for therapy
- Author
-
Sam M. Wiseman, Martin Hirst, Marco A. Marra, Katayoon Kasaian, Steven J.M. Jones, Blair Walker, Richard A. Moore, Jacqueline E. Schein, Andrew J. Mungall, and Yongjun Zhao
- Subjects
Male ,Cancer Research ,endocrine system diseases ,SS18-SLC5A11 fusion ,Bioinformatics ,Thyroid Carcinoma, Anaplastic ,Proto-Oncogene Mas ,Papillary thyroid cancer ,Epigenesis, Genetic ,0302 clinical medicine ,Molecular Targeted Therapy ,Thyroid cancer ,0303 health sciences ,Thyroid ,Anaplastic thyroid carcinoma ,cell line ,Genomics ,Middle Aged ,3. Good health ,Gene Expression Regulation, Neoplastic ,medicine.anatomical_structure ,Oncology ,mTOR signaling pathway ,Thyroid Cancer, Papillary ,030220 oncology & carcinogenesis ,whole genome and transcriptome sequencing ,Research Article ,Biology ,Thyroid carcinoma ,03 medical and health sciences ,Genetic Heterogeneity ,Cell Line, Tumor ,Genetics ,medicine ,Carcinoma ,Humans ,Epigenetics ,Thyroid Neoplasms ,Anaplastic thyroid cancer ,EP300 ,030304 developmental biology ,Gene Expression Profiling ,MKRN1-BRAF fusion ,Genetic Variation ,Sequence Analysis, DNA ,medicine.disease ,Carcinoma, Papillary ,epigenetic alterations ,FGFR2-OGDH fusion ,Mutation ,therapy targets - Abstract
Background: Anaplastic thyroid carcinoma is the most undifferentiated form of thyroid cancer and one of the deadliest of all adult solid malignancies. Here we report the first genomic and transcriptomic profile of anaplastic thyroid cancer including those of several unique cell lines and outline novel potential drivers of malignancy and targets of therapy. Methods: We describe whole genomic and transcriptomic profiles of 1 primary anaplastic thyroid tumor and 3 authenticated cell lines. Those profiles augmented by the transcriptomes of 4 additional and unique cell lines were compared to 58 pairs of papillary thyroid carcinoma and matched normal tissue transcriptomes from The Cancer Genome Atlas study. Results: The most prevalent mutations were those of TP53 and BRAF; repeated alterations of the epigenetic machinery such as frame-shift deletions of HDAC10 and EP300, loss of SMARCA2 and fusions of MECP2, BCL11A and SS18 were observed. Sequence data displayed aneuploidy and large regions of copy loss and gain in all genomes. Common regions of gain were however evident encompassing chromosomes 5p and 20q. We found novel anaplastic gene fusions including MKRN1-BRAF, FGFR2-OGDH and SS18-SLC5A11, all expressed in-frame fusions involving a known proto-oncogene. Comparison of the anaplastic thyroid cancer expression datasets with the papillary thyroid cancer and normal thyroid tissue transcriptomes suggested several known drug targets such as FGFRs, VEGFRs, KIT and RET to have lower expression levels in anaplastic specimens compared with both papillary thyroid cancers and normal tissues, confirming the observed lack of response to therapies targeting these pathways. Further integrative data analysis identified the mTOR signaling pathway as a potential therapeutic target in this disease. Conclusions: Anaplastic thyroid carcinoma possessed heterogeneous and unique profiles revealing the significance of detailed molecular profiling of individual tumors and the treatment of each as a unique entity; the cell line sequence data promises to facilitate the more accurate and intentional drug screening studies for anaplastic thyroid cancer. Electronic supplementary material: The online version of this article (doi:10.1186/s12885-015-1955-9) contains supplementary material, which is available to authorized users.
- Published
- 2015
25. Concurrent CIC mutations, IDH mutations, and 1p/19q loss distinguish oligodendrogliomas from other cancers
- Author
-
Jennifer A. Chan, Richard A. Moore, Yaron S. Butterfield, Suganthi Chittaranjan, David N. Louis, Olena Morozova, J. Gregory Cairncross, Marlo Firme, Gloria Roldán, Marco Perizzolo, Charles Chesnelong, Inanc Birol, Samuel Weiss, Gregg B. Morin, Stephen Yip, Alexandra Maslova, Karen Mungall, Aly Karsan, Richard Corbett, Shaun D. Jackman, Sean D. Young, Readman Chiu, Richard Varhol, Marco A. Marra, Haiyan Li, Jessica Tamura-Wells, Eric Chuah, Jenny Q. Qian, Yongjun Zhao, Jianghong An, Nina Thiessen, Eric E. Smith, Rod Docking, Wei Wu, Andrew J. Mungall, Steven J.M. Jones, Annie Moradian, Michael D. Blough, and Martin Hirst
- Subjects
Mutation rate ,IDH1 ,Oligodendroglioma ,Kaplan-Meier Estimate ,Biology ,medicine.disease_cause ,IDH2 ,Article ,Disease-Free Survival ,Pathology and Forensic Medicine ,Exon ,Biomarkers, Tumor ,medicine ,Humans ,Allele ,Exome sequencing ,Genetics ,Mutation ,Brain Neoplasms ,medicine.disease ,Isocitrate Dehydrogenase ,Repressor Proteins ,Chromosomes, Human, Pair 1 ,Neoplasm Grading ,Chromosomes, Human, Pair 19 - Abstract
Oligodendroglioma is characterized by unique clinical, pathological, and genetic features. Recurrent losses of chromosomes 1p and 19q are strongly associated with this brain cancer but knowledge of the identity and function of the genes affected by these alterations is limited. We performed exome sequencing on a discovery set of 16 oligodendrogliomas with 1p/19q co-deletion to identify new molecular features at base-pair resolution. As anticipated, there was a high rate of IDH mutations: all cases had mutations in either IDH1 (14/16) or IDH2 (2/16). In addition, we discovered somatic mutations and insertions/deletions in the CIC gene on chromosome 19q13.2 in 13/16 tumours. These discovery set mutations were validated by deep sequencing of 13 additional tumours, which revealed seven others with CIC mutations, thus bringing the overall mutation rate in oligodendrogliomas in this study to 20/29 (69%). In contrast, deep sequencing of astrocytomas and oligoastrocytomas without 1p/19q loss revealed that CIC alterations were otherwise rare (1/60; 2%). Of the 21 non-synonymous somatic mutations in 20 CIC-mutant oligodendrogliomas, nine were in exon 5 within an annotated DNA-interacting domain and three were in exon 20 within an annotated protein-interacting domain. The remaining nine were found in other exons and frequently included truncations. CIC mutations were highly associated with oligodendroglioma histology, 1p/19q co-deletion, and IDH1/2 mutation (p < 0.001). Although we observed no differences in the clinical outcomes of CIC mutant versus wild-type tumours, in a background of 1p/19q co-deletion, hemizygous CIC mutations are likely important. We hypothesize that the mutant CIC on the single retained 19q allele is linked to the pathogenesis of oligodendrogliomas with IDH mutation. Our detailed study of genetic aberrations in oligodendroglioma suggests a functional interaction between CIC mutation, IDH1/2 mutation, and 1p/19q co-deletion.
- Published
- 2011
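Note on the CIC mutation rates quoted in the abstract above: the overall 20/29 (69%) figure follows from pooling the discovery set (13 of 16 tumours) with the validation set (7 of 13 tumours). The sketch below only re-derives that arithmetic from the counts stated in the abstract. The helper name `pooled_rate` and the variable names are illustrative assumptions, not part of the study's analysis code.

```python
from fractions import Fraction

# (mutated, total) counts as stated in the abstract.
discovery = (13, 16)   # exome-sequenced discovery set
validation = (7, 13)   # additional tumours, deep sequencing
comparison = (1, 60)   # astrocytomas/oligoastrocytomas without 1p/19q loss


def pooled_rate(*cohorts):
    """Pool (mutated, total) pairs and return the combined rate as a Fraction."""
    mutated = sum(m for m, _ in cohorts)
    total = sum(n for _, n in cohorts)
    return Fraction(mutated, total)


overall = pooled_rate(discovery, validation)
print(f"{overall.numerator}/{overall.denominator} = {float(overall):.0%}")  # 20/29 = 69%
print(f"{float(pooled_rate(comparison)):.0%}")                              # 2%
```

Using `Fraction` keeps the pooled counts exact, so the rounding to whole percentages happens only at the point of display.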
26. SNP discovery in black cottonwood (Populus trichocarpa) by population transcriptome resequencing
- Author
-
Carl J. Douglas, Angela Tam, Steven J.M. Jones, Richard D. Moore, Timothee Cezard, Yongjun Zhao, Johnson Pang, Nina Thiessen, Inanc Birol, Armando Geraldes, Michael Friedmann, Shucai Wang, and Quentin C. B. Cronk
- Subjects
Populus trichocarpa ,Genetics ,education.field_of_study ,biology ,fungi ,Population ,Single-nucleotide polymorphism ,biology.organism_classification ,Genome ,DNA sequencing ,Transcriptome ,Intergenic region ,education ,Gene ,Ecology, Evolution, Behavior and Systematics ,Biotechnology - Abstract
The western black cottonwood (Populus trichocarpa) was the first tree to have its genome fully sequenced and has emerged as the model species for the study of secondary growth and wood formation. It is also a good candidate species for the production of lignocellulosic biofuels. Here, we present and make available to the research community the results of the sequencing of the transcriptome of developing xylem in 20 accessions with high-throughput next generation sequencing technology. We found over 0.5 million putative single nucleotide polymorphisms (SNPs) in 26,595 genes that are expressed in developing secondary xylem. More than two-thirds of all SNPs were found in annotated exons, with 18% and 14% in regions of the genome annotated as introns and intergenic, respectively, where only 3% and 4% of sequence reads mapped. This suggests that the current annotation of the poplar genome is remarkably incomplete and that there are many transcripts and novel genes waiting to be annotated. We hope that this resource will stimulate further research in expression profiling, detection of alternative splicing and adaptive evolution in poplar.
- Published
- 2011
27. Deep annotation of Drosophila melanogaster microRNAs yields insights into their processing, modification, and emergence
- Author
-
Nicolas Robine, Marco A. Marra, Norbert Perrimon, Diane Bortolamiol-Becet, Gregory J. Hannon, Anastasia Samsonova, Raquel Martin, Ammar S. Naqvi, Yongjun Zhao, Katsutomo Okamura, Zhiping Weng, Jui-Hung Hung, Jakub Orzechowski Westholm, Phillip D. Zamore, Qi Dai, Eric C. Lai, Eugene Berezikov, Stem Cell Aging Leukemia and Lymphoma (SALL), Restoring Organ Function by Means of Regenerative Medicine (REGENERATE), and Hubrecht Institute for Developmental Biology and Stem Cell Research
- Subjects
Untranslated region ,Male ,Ribonuclease III ,Mutant ,Cell Line ,microRNA ,Genetics ,Animals ,RNA, Antisense ,RNA, Messenger ,Genetics (clinical) ,Drosha ,biology ,Base Sequence ,Research ,RNA ,Computational Biology ,Molecular Sequence Annotation ,Argonaute ,biology.organism_classification ,MicroRNAs ,Drosophila melanogaster ,Gene Expression Regulation ,Transfer RNA ,Female ,RNA Editing ,Sequence Alignment - Abstract
Since the initial annotation of miRNAs from cloned short RNAs by the Ambros, Tuschl, and Bartel groups in 2001, more than a hundred studies have sought to identify additional miRNAs in various species. We report here a meta-analysis of short RNA data from Drosophila melanogaster, aggregating published libraries with 76 data sets that we generated for the modENCODE project. In total, we began with more than 1 billion raw reads from 187 libraries comprising diverse developmental stages, specific tissue- and cell-types, mutant conditions, and/or Argonaute immunoprecipitations. We elucidated several features of known miRNA loci, including multiple phased byproducts of cropping and dicing, abundant alternative 5′ termini of certain miRNAs, frequent 3′ untemplated additions, and potential editing events. We also identified 49 novel genomic locations of miRNA production, and 61 additional candidate loci with limited evidence for miRNA biogenesis. Although these loci broaden the Drosophila miRNA catalog, this work supports the notion that a restricted set of cellular transcripts is competent to be specifically processed by the Drosha/Dicer-1 pathway. Unexpectedly, we detected miRNA production from coding and untranslated regions of mRNAs and found the phenomenon of miRNA production from the antisense strand of known loci to be common. Altogether, this study lays a comprehensive foundation for the study of miRNA diversity and evolution in a complex animal model.
- Published
- 2011
28. Alternative expression analysis by RNA sequencing
- Author
-
Gregg B. Morin, Allen Delaney, A. Sorana Morrissy, Yongjun Zhao, Haiyan I. Li, Richard Corbett, Isabella T. Tai, Rodrigo Goya, Michelle J. Tang, Adrian Ally, Helen McDonald, Ryan D. Morin, Gordon Robertson, Jennifer Asano, Suganthi Chittaranjan, Trevor J. Pugh, Jill Mwenifumbo, Thomas Zeng, Ying-Chen Hou, Kevin Teague, Obi L. Griffith, Malachi Griffith, Susanna Y. Chan, Marco A. Marra, Martin Hirst, and Steven J.M. Jones
- Subjects
Gene isoform ,Antimetabolites, Antineoplastic ,genetic processes ,Gene Expression ,Biology ,Biochemistry ,Cell Line, Tumor ,Databases, Genetic ,Expression analysis ,Humans ,Protein Isoforms ,natural sciences ,RNA, Messenger ,Molecular Biology ,Oligonucleotide Array Sequence Analysis ,Sequence (medicine) ,Expressed Sequence Tags ,Genetics ,Reverse Transcriptase Polymerase Chain Reaction ,Sequence Analysis, RNA ,Gene Expression Profiling ,RNA ,Cell Biology ,Alternative Splicing ,Drug Resistance, Neoplasm ,Fluorouracil ,Colorectal Neoplasms ,Sequence Alignment ,Biotechnology - Abstract
In alternative expression analysis by sequencing (ALEXA-seq), we developed a method to analyze massively parallel RNA sequence data to catalog transcripts and assess differential and alternative expression of known and predicted mRNA isoforms in cells and tissues. As proof of principle, we used the approach to compare fluorouracil-resistant and -nonresistant human colorectal cancer cell lines. We assessed the sensitivity and specificity of the approach by comparison to exon tiling and splicing microarrays and validated the results with reverse transcription-PCR, quantitative PCR and Sanger sequencing. We observed global disruption of splicing in fluorouracil-resistant cells characterized by expression of new mRNA isoforms resulting from exon skipping, alternative splice site usage and intron retention. Alternative expression annotation databases, source code, a data viewer and other resources to facilitate analysis are available at http://www.alexaplatform.org/alexa_seq/.
- Published
- 2010
29. Conserved role of intragenic DNA methylation in regulating alternative promoters
- Author
-
Gustavo Turecki, Mikhail Bilenky, Joseph F. Costello, Chris Fiore, Martin Hirst, Tracy J. Ballinger, Nina Thiessen, Allen Delaney, Cydney B. Nielsen, Ting Wang, Vivi M. Heine, David Haussler, Brett E. Johnson, Xiaoyun Xing, Steven J.M. Jones, Yongjun Zhao, Shaun D. Fouse, Marco A. Marra, Raman P. Nagarajan, David H. Rowitch, Richard Varhol, Alika K. Maunakea, Chibo Hong, Maximiliaan Schillebeeckx, Cletus D'Souza, Ksenya Shchors, and Pediatric surgery
- Subjects
Male ,Transcription, Genetic ,Intragenic DNA methylation ,General Science & Technology ,1.1 Normal biological development and functioning ,Bisulfite sequencing ,Nerve Tissue Proteins ,Biology ,Inbred C57BL ,Article ,Cell Line ,Histones ,Promoter Regions ,Mice ,Epigenetics of physical exercise ,Genetic ,Genetics ,Animals ,Humans ,alternate promoters ,Promoter Regions, Genetic ,SHANK3 ,RNA-Directed DNA Methylation ,Conserved Sequence ,Regulation of gene expression ,Multidisciplinary ,Intergenic ,Microfilament Proteins ,Brain ,Methylation ,DNA ,DNA Methylation ,Middle Aged ,Frontal Lobe ,Mice, Inbred C57BL ,Differentially methylated regions ,CpG site ,Gene Expression Regulation ,Organ Specificity ,DNA methylation ,DNA, Intergenic ,CpG Islands ,Generic health relevance ,comparative epigenomics ,Carrier Proteins ,Transcription - Abstract
Although it is known that the methylation of DNA in 5' promoters suppresses gene expression, the role of DNA methylation in gene bodies is unclear. In mammals, tissue- and cell type-specific methylation is present in a small percentage of 5' CpG island (CGI) promoters, whereas a far greater proportion occurs across gene bodies, coinciding with highly conserved sequences. Tissue-specific intragenic methylation might reduce, or, paradoxically, enhance transcription elongation efficiency. Capped analysis of gene expression (CAGE) experiments also indicate that transcription commonly initiates within and between genes. To investigate the role of intragenic methylation, we generated a map of DNA methylation from the human brain encompassing 24.7 million of the 28 million CpG sites. From the dense, high-resolution coverage of CpG islands, the majority of methylated CpG islands were shown to be in intragenic and intergenic regions, whereas less than 3% of CpG islands in 5' promoters were methylated. The CpG islands in all three locations overlapped with RNA markers of transcription initiation, and unmethylated CpG islands also overlapped significantly with trimethylation of H3K4, a histone modification enriched at promoters. The general and CpG-island-specific patterns of methylation are conserved in mouse tissues. An in-depth investigation of the human SHANK3 locus and its mouse homologue demonstrated that this tissue-specific DNA methylation regulates intragenic promoter activity in vitro and in vivo. These methylation-regulated, alternative transcripts are expressed in a tissue- and cell type-specific manner, and are expressed differentially within a single cell type from distinct brain regions. These results support a major role for intragenic methylation in regulating cell context-specific alternative promoters in gene bodies.
- Published
- 2010
30. Locus co-occupancy, nucleosome positioning, and H3K4me1 regulate the functionality of FOXA2-, HNF4A-, and PDX1-bound loci in islets and liver
- Author
-
Sam Lee, Mikhail Bilenky, Pamela A. Hoodless, Marco A. Marra, Rebecca Cullum, Brad G. Hoffman, Derek L. Dai, Leping Li, Elizabeth D. Wederell, Cheryl D. Helgason, Gordon Robertson, Angela Tam, Martin Hirst, Nina Thiessen, Steven J.M. Jones, Inanc Birol, Baljit Kamoh, Mike Beach, Bogard Zavaglia, Galina Soukhatcheva, Yongjun Zhao, C. Bruce Verchere, and Timothee Cezard
- Subjects
endocrine system ,Molecular Sequence Data ,Response element ,Regulatory Sequences, Nucleic Acid ,Biology ,Histones ,Islets of Langerhans ,Mice ,Sp3 transcription factor ,Genetics ,Animals ,Enhancer ,Genetics (clinical) ,Homeodomain Proteins ,Binding Sites ,Base Sequence ,General transcription factor ,Gene Expression Profiling ,Research ,Eukaryotic transcription ,Promoter ,TCF4 ,Nucleosomes ,Cell biology ,Hepatocyte Nuclear Factor 4 ,Liver ,Genetic Loci ,TAF2 ,Hepatocyte Nuclear Factor 3-beta ,Trans-Activators - Abstract
The liver and pancreas share a common origin and coexpress several transcription factors. To gain insight into the transcriptional networks regulating the function of these tissues, we globally identify binding sites for FOXA2 in adult mouse islets and liver, PDX1 in islets, and HNF4A in liver. Because most eukaryotic transcription factors bind thousands of loci, many of which are thought to be inactive, methods that can discriminate functionally active binding events are essential for the interpretation of genome-wide transcription factor binding data. To develop such a method, we also generated genome-wide H3K4me1 and H3K4me3 localization data in these tissues. By analyzing our binding and histone methylation data in combination with comprehensive gene expression data, we show that H3K4me1 enrichment profiles discriminate transcription factor occupied loci into three classes: those that are functionally active, those that are poised for activation, and those that reflect pioneer-like transcription factor activity. Furthermore, we demonstrate that the regulated presence of H3K4me1-marked nucleosomes at transcription factor occupied promoters and enhancers controls their activity, implicating both tissue-specific transcription factor binding and nucleosome remodeling complex recruitment in determining tissue-specific gene expression. Finally, we apply these approaches to generate novel insights into how FOXA2, PDX1, and HNF4A cooperate to drive islet- and liver-specific gene expression.
- Published
- 2010
31. The genome sequence of the spontaneously hypertensive rat
- Author
-
Martin Hirst, Kathrin Saar, Giannino Patone, Michelle D. Johnson, Enrico Petretto, Paul Flicek, William M. McLaren, Catherine Morrissey, Timothy J. Aitman, Charles Rockland, Xosé M. Fernández-Suárez, Piero Carninci, Victor Guryev, Ewan Birney, Theodore W. Kurtz, Jacques Behmoaras, Michal Pravenec, Kathleen S. Rockland, Edwin Cuppen, Oliver Hummel, Charles Plessy, Santosh S. Atanur, Steven J.M. Jones, Norbert Hubner, Yongjun Zhao, Inanc Birol, Hubrecht Institute for Developmental Biology and Stem Cell Research, Stem Cell Aging Leukemia and Lymphoma (SALL), and Groningen Research Institute for Asthma and COPD (GRIAC)
- Subjects
Inbred SHR ,Transcription, Genetic ,Quantitative Trait Loci ,Gene Dosage ,Single-nucleotide polymorphism ,030204 cardiovascular system & hematology ,Biology ,Quantitative trait locus ,Genome ,Polymorphism, Single Nucleotide ,03 medical and health sciences ,0302 clinical medicine ,Genetic ,Rats, Inbred SHR ,Genetics ,Animals ,Copy-number variation ,Terminator ,Polymorphism ,Codon ,Gene ,Genetics (clinical) ,030304 developmental biology ,Whole genome sequencing ,0303 health sciences ,Research ,Single Nucleotide ,Rats ,Cardiovascular and Metabolic Diseases ,Expression quantitative trait loci ,Hypertension ,Codon, Terminator ,Transcription ,Reference genome - Abstract
The spontaneously hypertensive rat (SHR) is the most widely studied animal model of hypertension. Scores of SHR quantitative trait loci (QTLs) have been mapped for hypertension and other phenotypes. We have sequenced the SHR/OlaIpcv genome at 10.7-fold coverage by paired-end sequencing on the Illumina platform. We identified 3.6 million high-quality single nucleotide polymorphisms (SNPs) between the SHR/OlaIpcv and Brown Norway (BN) reference genome, with a high rate of validation (sensitivity 96.3%–98.0% and specificity 99%–100%). We also identified 343,243 short indels between the SHR/OlaIpcv and reference genomes. These SNPs and indels resulted in 161 gains or losses of stop codons and 629 frameshifts compared with the BN reference sequence. We also identified 13,438 larger deletions that result in complete or partial absence of 107 genes in the SHR/OlaIpcv genome compared with the BN reference and 588 copy number variants (CNVs) that overlap with the gene regions of 688 genes. Genomic regions containing genes whose expression had been previously mapped as cis-regulated expression quantitative trait loci (eQTLs) were significantly enriched with SNPs, short indels, and larger deletions, suggesting that some of these variants have functional effects on gene expression. Genes that were affected by major alterations in their coding sequence were highly enriched for genes related to ion transport, transport, and plasma membrane localization, providing insights into the likely molecular and cellular basis of hypertension and other phenotypes specific to the SHR strain. This near complete catalog of genomic differences between two extensively studied rat strains provides the starting point for complete elucidation, at the molecular level, of the physiological and pathophysiological phenotypic differences between individuals from these strains.
- Published
- 2010
32. Somatic mutations altering EZH2 (Tyr641) in follicular and diffuse large B-cell lymphomas of germinal-center origin
- Author
-
Hong Qian, Ryan D. Morin, Inanc Birol, Tesa M. Severson, Richard Varhol, Bruce Woolcock, Yongjun Zhao, Richard Corbett, Florian Kuchenbauer, Charlot Jf, Rodrigo Goya, Tcherpakov M, Joseph M. Connors, Marco A. Marra, Douglas E. Horsman, Randy D. Gascoyne, Nathalie A. Johnson, Robert A. Holt, Richard A. Moore, Andrew J. Mungall, Allen Delaney, Samuel Aparicio, Angela K Y Tam, Hao Zhu, Michelle Moksa, Jacquie Schein, Shashkin P, Martin Hirst, Damian Yap, Jianghong An, Steven J.M. Jones, Kimbara M, Humphries Rk, Paul Je, Merrill Boyle, Obi L. Griffith, Sohrab P. Shah, and Duane E Smailus
- Subjects
Male ,Somatic cell ,DNA Mutational Analysis ,0302 clinical medicine ,Lymphoma, Follicular ,0303 health sciences ,EZH2 ,Polycomb Repressive Complex 2 ,Exons ,Middle Aged ,PRC2 ,DNA-Binding Proteins ,Gene Expression Regulation, Neoplastic ,medicine.anatomical_structure ,030220 oncology & carcinogenesis ,Histone methyltransferase ,Female ,lipids (amino acids, peptides, and proteins) ,Lymphoma, Large B-Cell, Diffuse ,epigenetic ,Adult ,Molecular Sequence Data ,macromolecular substances ,Biology ,Article ,03 medical and health sciences ,Germline mutation ,Genetics ,medicine ,Humans ,Enhancer of Zeste Homolog 2 Protein ,Amino Acid Sequence ,genome ,B cell ,Aged ,030304 developmental biology ,Base Sequence ,Genome, Human ,Gene Expression Profiling ,fungi ,Germinal center ,Germinal Center ,medicine.disease ,Molecular biology ,Lymphoma ,Mutation ,Tyrosine ,Mutant Proteins ,RNA-seq ,WTSS ,transcriptome ,Diffuse large B-cell lymphoma ,Transcription Factors - Abstract
Follicular lymphoma (FL) and the GCB subtype of diffuse large B-cell lymphoma (DLBCL) derive from germinal center B cells. Targeted resequencing studies have revealed mutations in various genes encoding proteins in the NF-kappaB pathway that contribute to the activated B-cell (ABC) DLBCL subtype, but thus far few GCB-specific mutations have been identified. Here we report recurrent somatic mutations affecting the polycomb-group oncogene EZH2, which encodes a histone methyltransferase responsible for trimethylating Lys27 of histone H3 (H3K27). After the recent discovery of mutations in KDM6A (UTX), which encodes the histone H3K27me3 demethylase UTX, in several cancer types, EZH2 is the second histone methyltransferase gene found to be mutated in cancer. These mutations, which result in the replacement of a single tyrosine in the SET domain of the EZH2 protein (Tyr641), occur in 21.7% of GCB DLBCLs and 7.2% of FLs and are absent from ABC DLBCLs. Our data are consistent with the notion that EZH2 proteins with mutant Tyr641 have reduced enzymatic activity in vitro.
- Published
- 2010
33. Mutational evolution in a lobular breast tumour profiled at single nucleotide resolution
- Author
-
Martin Hirst, Carlos Caldas, Mark G. F. Sun, Leah M Prentice, Trevor J. Pugh, Yongjun Zhao, Richard Varhol, Robert A. Holt, Ryan D. Morin, Samuel Aparicio, Tesa M. Severson, Jaswinder Khattra, Greg Taylor, Karen A. Gelmon, Richard A. Moore, Peter H. Watson, Ryan Guliany, Christian Steidl, Andrew E. Teschendorff, Marco A. Marra, René L. Warren, Kane Tse, Sohrab P. Shah, David G. Huntsman, Angela Burleigh, Gulisa Turashvili, Steven J.M. Jones, Gillian Leung, Janine Senz, and Allen Delaney
- Subjects
Time Factors ,DNA Mutational Analysis ,Breast Neoplasms ,Biology ,medicine.disease_cause ,DNA sequencing ,Metastasis ,Evolution, Molecular ,Breast cancer ,medicine ,Humans ,Neoplasm Metastasis ,Gene ,Germ-Line Mutation ,Genetics ,Mutation ,Multidisciplinary ,Genome, Human ,Nucleotides ,Gene Expression Profiling ,Estrogen Receptor alpha ,Cancer ,medicine.disease ,Gene Expression Regulation, Neoplastic ,Adaptor Proteins, Vesicular Transport ,Mutagenesis ,Disease Progression ,RNA Editing ,Breast disease ,Signal Recognition Particle ,Genes, Neoplasm - Abstract
Recent advances in next generation sequencing have made it possible to precisely characterize all somatic coding mutations that occur during the development and progression of individual cancers. Here we used these approaches to sequence the genomes (>43-fold coverage) and transcriptomes of an oestrogen-receptor-alpha-positive metastatic lobular breast cancer at depth. We found 32 somatic non-synonymous coding mutations present in the metastasis, and measured the frequency of these somatic mutations in DNA from the primary tumour of the same patient, which arose 9 years earlier. Five of the 32 mutations (in ABCB11, HAUS3, SLC24A4, SNX4 and PALB2) were prevalent in the DNA of the primary tumour removed at diagnosis 9 years earlier, six (in KIF1C, USP28, MYH8, MORC1, KIAA1468 and RNASEH2A) were present at lower frequencies (1-13%), 19 were not detected in the primary tumour, and two were undetermined. The combined analysis of genome and transcriptome data revealed two new RNA-editing events that recode the amino acid sequence of SRP9 and COG3. Taken together, our data show that single nucleotide mutational heterogeneity can be a property of low or intermediate grade primary breast cancers and that significant evolution can occur with disease progression.
- Published
- 2009
34. Next-generation tag sequencing for cancer gene expression profiling
- Author
-
Helen McDonald, Yongjun Zhao, Allen Delaney, Thomas Zeng, A. Sorana Morrissy, Martin Hirst, Ryan D. Morin, Marco A. Marra, and Steven J.M. Jones
- Subjects
Sequence analysis ,Cost-Benefit Analysis ,Biology ,Models, Biological ,Transcriptome ,Exon ,Neoplasms ,Complementary DNA ,Methods ,Genetics ,Humans ,Protein Isoforms ,Serial analysis of gene expression ,Gene ,Genetics (clinical) ,Sequence Tagged Sites ,Regulation of gene expression ,Base Composition ,Genomic Library ,Gene Expression Profiling ,Genetic Variation ,Sequence Analysis, DNA ,Gene Expression Regulation, Neoplastic ,Gene expression profiling ,Algorithms - Abstract
We describe a new method, Tag-seq, which employs ultra high-throughput sequencing of 21 base pair cDNA tags for sensitive and cost-effective gene expression profiling. We compared Tag-seq data to LongSAGE data and observed improved representation of several classes of rare transcripts, including transcription factors, antisense transcripts, and intronic sequences, the latter possibly representing novel exons or genes. We observed increases in the diversity, abundance, and dynamic range of such rare transcripts and took advantage of the greater dynamic range of expression to identify, in cancers and normal libraries, altered expression ratios of alternative transcript isoforms. The strand-specific information of Tag-seq reads further allowed us to detect altered expression ratios of sense and antisense (S-AS) transcripts between cancer and normal libraries. S-AS transcripts were enriched in known cancer genes, while transcript isoforms were enriched in miRNA targeting sites. We found that transcript abundance had a stronger GC-bias in LongSAGE than Tag-seq, such that AT-rich tags were less abundant than GC-rich tags in LongSAGE. Tag-seq also performed better in gene discovery, identifying >98% of genes detected by LongSAGE and profiling a distinct subset of the transcriptome characterized by AT-rich genes, which was expressed at levels below those detectable by LongSAGE. Overall, Tag-seq is sensitive to rare transcripts, has less sequence composition bias relative to LongSAGE, and allows differential expression analysis for a greater range of transcripts, including transcripts encoding important regulatory molecules.
- Published
- 2009
35. De novo transcriptome assembly with ABySS
- Author
-
Jenny Q. Qian, Cydney B. Nielsen, Ryan D. Morin, Joseph M. Connors, Randy D. Gascoyne, Shaun D. Jackman, Jacqueline E. Schein, Richard Varhol, Martin Hirst, Doug Horsman, Yongjun Zhao, Steven J.M. Jones, Inanc Birol, Marco A. Marra, and Greg Stazyk
- Subjects
Statistics and Probability ,Genetics ,Genome ,Contig ,Java ,Base pair ,Shotgun sequencing ,Gene Expression Profiling ,De novo transcriptome assembly ,Computational Biology ,Sequence assembly ,Sequence Analysis, DNA ,Computational biology ,Biology ,Biochemistry ,Computer Science Applications ,Transcriptome ,Computational Mathematics ,Computational Theory and Mathematics ,Databases, Genetic ,Molecular Biology ,computer ,Software ,computer.programming_language - Abstract
Motivation: Whole transcriptome shotgun sequencing data from non-normalized samples offer unique opportunities to study the metabolic states of organisms. One can deduce gene expression levels using sequence coverage as a surrogate, identify coding changes or discover novel isoforms or transcripts. Especially for discovery of novel events, de novo assembly of transcriptomes is desirable. Results: Transcriptome from tumor tissue of a patient with follicular lymphoma was sequenced with 36 base pair (bp) single- and paired-end reads on the Illumina Genome Analyzer II platform. We assembled ∼194 million reads using ABySS into 66 921 contigs 100 bp or longer, with a maximum contig length of 10 951 bp, representing over 30 million base pairs of unique transcriptome sequence, or roughly 1% of the genome. Availability and Implementation: Source code and binaries of ABySS are freely available for download at http://www.bcgsc.ca/platform/bioinfo/software/abyss. Assembler tool is implemented in C++. The parallel version uses Open MPI. ABySS-Explorer tool is implemented in Java using the Java universal network/graph framework. Contact: ibirol@bcgsc.ca
- Published
- 2009
36. Profiling YB-1 target genes uncovers a new mechanism for MET receptor regulation in normal and malignant human mammary cells
- Author
-
Arezoo Astanehe, Anna L. Stratford, M R Finkbeiner, Helen Jiang, Ashleen Shadeo, Abbas Fotovati, Peter Eirew, Peter R. Mertens, Afshin Raouf, Sandra E. Dunn, Yongjun Zhao, Connie J. Eaves, Alastair H. Davies, Paolo M. Comoglio, Karen To, and Carla Boccaccio
- Subjects
Chromatin Immunoprecipitation ,Cancer Research ,medicine.medical_specialty ,Small interfering RNA ,Breast Neoplasms ,Electrophoretic Mobility Shift Assay ,medicine.disease_cause ,YB-1 ,03 medical and health sciences ,0302 clinical medicine ,Cell Line, Tumor ,Proto-Oncogene Proteins ,Internal medicine ,mammary progenitors ,Genetics ,medicine ,Humans ,Gene silencing ,Receptors, Growth Factor ,Breast ,Translation factor ,RNA, Small Interfering ,Promoter Regions, Genetic ,Molecular Biology ,030304 developmental biology ,0303 health sciences ,ChIp-on-chip ,biology ,Hepatocyte Growth Factor ,Met Receptor ,basal-like breast cancer ,Gene Expression Profiling ,Stem Cells ,CD44 ,Wnt signaling pathway ,Nuclear Proteins ,Proto-Oncogene Proteins c-met ,ChIP-on-chip ,DNA-Binding Proteins ,Endocrinology ,030220 oncology & carcinogenesis ,biology.protein ,Cancer research ,Female ,Y-Box-Binding Protein 1 ,Carcinogenesis ,Chromatin immunoprecipitation - Abstract
Basal-like breast cancers (BLBCs) are aggressive tumors with high relapse rates and poor survival. We recently reported that >70% of primary BLBCs express the oncogenic transcription/translation factor Y-box binding protein-1 (YB-1) and silencing it with small interfering RNAs (siRNAs) attenuates the growth of BLBC cell lines. To understand the basis of these earlier findings, we profiled YB-1:DNA complexes by chromatin immunoprecipitation (ChIP)-on-chip. Several tumor growth-promoting genes such as MET, CD44, CD49f, WNT and NOTCH family members were identified. In addition, YB-1 and MET are coordinately expressed in BLBC cell lines, as well as in normal human mammary progenitor cells. MET was confirmed to be a YB-1 target through traditional ChIP and gel-shift assays. More specifically, YB-1 binds to -1018 bp on the MET promoter. Silencing YB-1 with siRNA decreased MET promoter activity, transcripts, as well as protein levels and signaling. Conversely, expressing wild-type YB-1 or a constitutively active mutant YB-1 (D102) increased MET expression. Finally, silencing YB-1 or MET attenuated anchorage-independent growth of BLBC cell lines. Together, these findings implicate MET as a target of YB-1 and indicate that the two work in concert to promote BLBC growth.
- Published
- 2009
37. Genome-wide relationship between histone H3 lysine 4 mono- and tri-methylation and transcription factor binding
- Author
-
Timothee Cezard, Angela Tam, Michael Snyder, Marco A. Marra, Elizabeth D. Wederell, Pamela A. Hoodless, Mikhail Bilenky, Martin Krzywinski, Martin Hirst, Steven J.M. Jones, Ghia Euskirchen, A. Gordon Robertson, Nina Thiessen, Rebecca Cullum, Inanc Birol, Anthony P. Fejes, Thomas Zeng, and Yongjun Zhao
- Subjects
Chromatin Immunoprecipitation ,Histone H3 Lysine 4 ,Letter ,Regulatory Sequences, Nucleic Acid ,Methylation ,Histones ,Interferon-gamma ,Mice ,Sequence Homology, Nucleic Acid ,Genetics ,Transcriptional regulation ,Animals ,Humans ,Binding site ,Transcription factor ,Genetics (clinical) ,Cell Line, Transformed ,Regulation of gene expression ,Binding Sites ,Base Sequence ,biology ,Genome, Human ,Lysine ,Molecular biology ,Mice, Inbred C57BL ,STAT1 Transcription Factor ,Histone ,Gene Expression Regulation ,Hepatocyte Nuclear Factor 3-beta ,biology.protein ,H3K4me3 ,Female ,Chromatin immunoprecipitation ,HeLa Cells ,Protein Binding ,Transcription Factors - Abstract
We characterized the relationship of H3K4me1 and H3K4me3 at distal and proximal regulatory elements by comparing ChIP-seq profiles for these histone modifications and for two functionally different transcription factors: STAT1 in the immortalized HeLa S3 cell line, with and without interferon-gamma (IFNG) stimulation; and FOXA2 in mouse adult liver tissue. In unstimulated and stimulated HeLa cells, respectively, we determined approximately 270,000 and approximately 301,000 H3K4me1-enriched regions, and approximately 54,500 and approximately 76,100 H3K4me3-enriched regions. In mouse adult liver, we determined approximately 227,000 and approximately 34,800 H3K4me1 and H3K4me3 regions. Seventy-five percent of the approximately 70,300 STAT1 binding sites in stimulated HeLa cells and 87% of the approximately 11,000 FOXA2 sites in mouse liver were distal to known gene TSS; in both cell types, approximately 83% of these distal sites were associated with at least one of the two histone modifications, and H3K4me1 was associated with over 96% of marked distal sites. After filtering against predicted transcription start sites, 50% of approximately 26,800 marked distal IFNG-stimulated STAT1 binding sites, but 95% of approximately 5800 marked distal FOXA2 sites, were associated with H3K4me1 only. Results for HeLa cells generated additional insights into transcriptional regulation involving STAT1. STAT1 binding was associated with 25% of all H3K4me1 regions in stimulated HeLa cells, suggesting that a single transcription factor can interact with an unexpectedly large fraction of regulatory regions. Strikingly, for a large majority of the locations of stimulated STAT1 binding, the dominant H3K4me1/me3 combinations were established before activation, suggesting mechanisms independent of IFNG stimulation and high-affinity STAT1 binding.
- Published
- 2008
38. Human hematopoietic progenitors engraft in fetal canine recipients and expand with neonatal injection of fibroblasts expressing human hematopoietic cytokines
- Author
-
Anthony C. G. Abrams-Ogg, Stephen A. Kruth, Carolyn Lutzko, J. Paul Woods, Lisa Meertens, Ian D. Dubé, Yongjun Zhao, Margaret R. Hough, and Liheng Li
- Subjects
Cancer Research ,Myeloid ,Recombinant Fusion Proteins ,medicine.medical_treatment ,Transplantation, Heterologous ,Gestational Age ,Hematopoietic stem cell transplantation ,Biology ,Transfection ,Polymerase Chain Reaction ,Mice ,Dogs ,Graft Enhancement, Immunologic ,Species Specificity ,Granulocyte Colony-Stimulating Factor ,Genetics ,medicine ,Animals ,Humans ,Cell Lineage ,Myeloid Cells ,Lymphocytes ,Progenitor cell ,Molecular Biology ,Stem Cell Factor ,Graft Survival ,Hematopoietic Tissue ,Hematopoietic Stem Cell Transplantation ,DNA ,Cell Biology ,Hematology ,Fibroblasts ,Fetal Blood ,Adoptive Transfer ,Haematopoiesis ,medicine.anatomical_structure ,Animals, Newborn ,Cord blood ,Immunology ,Leukocytes, Mononuclear ,Hemangioblast ,Interleukin-3 ,Stromal Cells ,Stem cell ,Injections, Intraperitoneal - Abstract
Objective: The development of large-animal models for human hematopoiesis will facilitate the study of human hematopoietic stem cells and their progenitors in vivo. In previous studies, human hematopoietic progenitors engrafted in fetal dogs and contributed to hematopoiesis for one year. Despite initially high levels of human cells, the proportion declined to less than 0.1% at 6 months, possibly due to inability of the canine hematopoietic microenvironment to support ongoing human hematopoiesis. In the current experiments we examined the potential of co-transplanting fibroblasts expressing human hematopoietic cytokines with the hematopoietic graft to increase the contribution of human progenitors to chimeric hematopoiesis. Methods: Mid-gestation canine fetuses were injected with 1–3 × 10⁷ human cord blood cells and 1 × 10⁷ murine fibroblasts engineered to express human cytokines. Neonatal pups were boosted with additional injections of cytokine-expressing fibroblasts. Human cell engraftment was monitored by PCR amplification of human-specific DNA sequences from recipient hematopoietic tissues. Results: Human hematopoietic cells were detected in 13/15 fetal recipients for at least 7 months. At time points up to 30 weeks of age, human DNA was detected in stimulated lymphocyte cultures, ∼0.1% of blood leukocytes and 1.5% (85/5757) of myeloid colonies. Eight months postinfusion, 1.7% of colony-forming units (CFUs) were of human origin. By one year 0.5% or less of myeloid colonies and less than 0.01% of blood leukocytes carried human DNA. Following an infusion of cytokine-expressing fibroblasts at one year, the proportion of human myeloid progenitors rose to 11.5% and remained detectable for 8 months. Conclusion: These studies confirm that human hematopoietic progenitors can engraft in fetal pups and contribute to multilineage hematopoiesis. Infusion of cells expressing human cytokines is one approach to stimulate human hematopoietic progenitors in vivo and thus increase their contributions to chimeric hematopoiesis.
- Published
- 2002
39. Mutational analysis reveals the origin and therapy-driven evolution of recurrent glioma
- Author
-
Koki Aihara, Elise Charron, Andrew J. Mungall, Soonmee Cha, Chibo Hong, Sarah J. Nelson, Ivan Smirnov, Hiroyuki Aburatani, Yongjun Zhao, Saurabh Asthana, W. Clay Gustafson, Nobuhito Saito, Akitake Mukasa, Susan M. Chang, Richard A. Moore, William A. Weiss, Brett E. Johnson, Andrew W. Bollen, Marco A. Marra, Mitchel S. Berger, Adam B. Olshen, Cory Y. McLean, Llewellyn E. Jalbert, Shaun D. Fouse, Tali Mazor, Jun S. Song, Barry S. Taylor, Shogo Yamamoto, Michael R. Barnes, Joseph F. Costello, Steven J.M. Jones, Martin Hirst, Hiroki R. Ueda, and Kenji Tatsuno
- Subjects
DNA Mutational Analysis ,Bioinformatics ,Exome sequencing ,Cancer ,Multidisciplinary ,Retinoblastoma ,Brain Neoplasms ,TOR Serine-Threonine Kinases ,Brain ,Nuclear Proteins ,Glioma ,Alkylating ,Dacarbazine ,Local ,SMARCA4 ,medicine.drug ,Proto-Oncogene Proteins B-raf ,X-linked Nuclear Protein ,General Science & Technology ,Antineoplastic Agents ,Biology ,Recurrent Glioma ,Article ,Rare Diseases ,Genetics ,medicine ,Temozolomide ,Humans ,Antineoplastic Agents, Alkylating ,ATRX ,Human Genome ,Neurosciences ,DNA Helicases ,medicine.disease ,Brain Disorders ,Brain Cancer ,Neoplasm Recurrence ,Orphan Drug ,Good Health and Well Being ,Mutagenesis ,Cancer research ,Neoplasm Grading ,Neoplasm Recurrence, Local ,Tumor Suppressor Protein p53 ,Proto-Oncogene Proteins c-akt ,Transcription Factors - Abstract
Back with a Vengeance: After surgery, gliomas (a type of brain tumor) recur in nearly all patients and often in a more aggressive form. Johnson et al. (p. 189, published online 12 December 2013) used exome sequencing to explore whether recurrent tumors harbor different mutations than the primary tumors and whether the mutational profile in the recurrences is influenced by postsurgical treatment of patients with temozolomide (TMZ), a chemotherapeutic drug known to damage DNA. In more than 40% of cases, at least half of the mutations in the initial glioma were undetected at recurrence. The recurrent tumors in many of the TMZ-treated patients bore the signature of TMZ-induced mutagenesis and appeared to follow an evolutionary path to high-grade glioma distinct from that in untreated patients.
- Published
- 2013
40. Assembling the 20 Gb white spruce (Picea glauca) genome from whole-genome shotgun sequencing data
- Author
-
Jörg Bohlmann, Alvin D. Yanchuk, Kermit Ritland, Benjamin P. Vandervalk, Inanc Birol, Jean Bousquet, Andrew J. Mungall, Steven J.M. Jones, John MacKay, Shaun D. Jackman, Macaire M.S. Yuen, Anthony Raymond, Carol Ritland, Dana Brand, Heather Kirk, Pawan Pandoh, Christopher I. Keeling, Richard A. Moore, Robin J.N. Coope, Yongjun Zhao, Greg Taylor, Brian Boyle, Barry Jaquish, and Stephen Pleasance
- Subjects
Statistics and Probability ,Genetics ,Shotgun sequencing ,Sequence analysis ,fungi ,Sequence assembly ,Genomics ,Computational biology ,Biology ,Genome Analysis ,Original Papers ,Biochemistry ,Genome ,DNA sequencing ,Computer Science Applications ,Giga ,Evolution, Molecular ,Computational Mathematics ,Computational Theory and Mathematics ,Picea ,Molecular Biology ,Genome, Plant ,Illumina dye sequencing - Abstract
White spruce (Picea glauca) is a dominant conifer of the boreal forests of North America, and providing genomics resources for this commercially valuable tree will help improve forest management and conservation efforts. Sequencing and assembling the large and highly repetitive spruce genome, though, pushes the boundaries of the current technology. Here, we describe a whole-genome shotgun sequencing strategy using two Illumina sequencing platforms and an assembly approach using the ABySS software. We report a 20.8 giga base pairs draft genome in 4.9 million scaffolds, with a scaffold N50 of 20 356 bp. We demonstrate how recent improvements in the sequencing technology, especially increasing read lengths and paired-end reads from longer fragments, have a major impact on the assembly contiguity. We also note that scalable bioinformatics tools are instrumental in providing rapid draft assemblies. Availability: The Picea glauca genome sequencing and assembly data are available through NCBI (Accession#: ALWZ0100000000 PID: PRJNA83435). http://www.ncbi.nlm.nih.gov/bioproject/83435. Contact: ibirol@bcgsc.ca Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2013
41. Genomic and epigenomic landscapes of adult de novo acute myeloid leukemia
- Author
-
Ley, Timothy, Miller, Christopher, Ding, Li, Raphael, Benjamin J., Mungall, Andrew J., Robertson, A. Gordon, Hoadley, Katherine, Triche, Timothy J., Laird, Peter W., Baty, Jack D., Fulton, Lucinda L., Fulton, Robert, Heath, Sharon E., Kalicki Veizer, Joelle, Kandoth, Cyriac, Klco, Jeffery M., Koboldt, Daniel C., Kanchi, Krishna Latha, Shashikant, Kulkarni, M.S., Ph.D., F.A.C.M.G., Lamprecht, Tamara L., B.S., Washington University, St. Louis, Larson, David E., Ph.D., Ling, Lin, M.S., Charles, Lu, McLellan, Michael D., McMichael, Joshua F., the Genome Institute at Washington University, Jacqueline, Payton, M.D., Ph.D., Heather, Schmidt, Spencer, David H., Tomasson, Michael H., M.D., Siteman Cancer Center, St. Louis, Wallis, John W., Wartman, Lukas D., Watson, Mark A., John, Welch, Wendl, Michael C., Adrian, Ally, B.Sc., Miruna, Balasundaram, B.A.Sc., Inanc, Birol, Yaron, Butterfield, Readman, Chiu, M.Sc., Andy, Chu, Eric, Chuah, Hye Jung Chun, Richard, Corbett, Noreen, Dhalla, Ranabir, Guin, An, He, Carrie, Hirst, Martin, Hirst, Holt, Robert A., Steven, Jones, Aly, Karsan, Darlene, Lee, Haiyan I., Li, Marra, Marco A., Michael, Mayo, Moore, Richard A., Karen, Mungall, Jeremy, Parker, Erin, Pleasance, Patrick, Plettner, Jacquie, Schein, Dominik, Stoll, Lucas, Swanson, Angela, Tam, Nina, Thiessen, Richard, Varhol, Natasja, Wye, Yongjun, Zhao, M.Sc., D.V.M., British Columbia Cancer Agency's Genome Sciences Centre, Vancouver, Canada, Stacey, Gabriel, Gad, Getz, Carrie, Sougnez, Lihua, Zou, Broad Institute of Harvard, Massachusetts Institute of Technology, Cambridge, MA, Mark D. M. Leiserson, B.A., Vandin, Fabio, Hsin Ta Wu, Brown University, Center for Computational Molecular Biology, Providence, RI, Frederick, Applebaum, Fred Hutchinson Cancer Research Center, Division of Medical Oncology, Seattle Cancer Care Alliance, Seattle, Baylin, Stephen B., Johns Hopkins University, Baltimore, Rehan, Akbani, Broom, Bradley M., Ken, Chen, Motter, Thomas C., B.A., Khanh, Nguyen, Weinstein, John N., Nianziang, Zhang, University of Texas M. D. Anderson Cancer Center, Houston, Ferguson, Martin L., MLF Consulting, Biotechnology Consultant, Boston, Christopher, Adams, Aaron, Black, Jay, Bowen, Julie Gastier Foster, Thomas, Grossman, Tara, Lichtenberg, Lisa, Wise, the Research Institute at Nationwide Children's Hospital, Columbus, OH, Tanja, Davidsen, Demchok, John A., Mills Shaw, Kenna R., Margi, Sheth, National Cancer Institute, Bethesda, MD, Sofia, Heidi J., Ph.D., M.P.H., National Human Genome Research Institute, Liming, Yang, Downing, James R., St. Jude Children's Research Hospital, Memphis, Greg, Eley, Sciementis, LLC, Statham, GA, Shelley, Alonso, Brenda, Ayala, Julien, Baboud, Mark, Backus, Barletta, Sean P., Berton, Dominique L., M.S.C.S., Chu, Anna L., Stanley, Girshik, Jensen, Mark A., Ari, Kahn, Prachi, Kothiyal, Nicholls, Matthew C., Pihl, Todd D., Pot, David A., Rohini, Raman, B.E., Sanbhadti, Rashmi N., Snyder, Eric E., Deepak, Srinivasan, Jessica, Walton, Yunhu, Wan, Zhining, Wang, SRA International, Fairfax, VA, Issa, Jean Pierre J., Temple University, Philadelphia, Michelle Le Beau, University of Chicago, Chicago, Martin, Carroll, University of Pennsylvania, Hagop Kantarjian, M.D., Steven, Kornblau, Bootwalla, Moiz S., B.Sc., M.S., Lai, Phillip H., Hui, Shen, Van Den Berg, David J., Weisenberger, Daniel J., University of Southern California Epigenome Center, Los Angeles, Daniel C. Link, M.D., Walter, Matthew J., Ozenberger, Bradley A., Mardis, Elaine R., Peter, Westervelt, Graubert, Timothy A., DiPersio, John F., and Wilson, Richard K.
- Subjects
Myeloid ,Adult ,Epigenomics ,Male ,NPM1 ,Gene Expression ,CpG Islands ,DNA Methylation ,Female ,Gene Fusion ,Genome, Human ,Humans ,Leukemia, Myeloid, Acute ,MicroRNAs ,Middle Aged ,Sequence Analysis, DNA ,Mutation ,Acute ,Enasidenib ,Biology ,CEBPA ,Genetics ,Genome ,Leukemia ,Massive parallel sequencing ,MicroRNA sequencing ,Myeloid leukemia ,DNA ,General Medicine ,KMT2A ,biology.protein ,Sequence Analysis ,Nucleophosmin ,Human ,Comparative genomic hybridization - Abstract
BACKGROUND: Many mutations that contribute to the pathogenesis of acute myeloid leukemia (AML) are undefined. The relationships between patterns of mutations and epigenetic phenotypes are not yet clear. METHODS: We analyzed the genomes of 200 clinically annotated adult cases of de novo AML, using either whole-genome sequencing (50 cases) or whole-exome sequencing (150 cases), along with RNA and microRNA sequencing and DNA-methylation analysis. RESULTS: AML genomes have fewer mutations than most other adult cancers, with an average of only 13 mutations found in genes. Of these, an average of 5 are in genes that are recurrently mutated in AML. A total of 23 genes were significantly mutated, and another 237 were mutated in two or more samples. Nearly all samples had at least 1 nonsynonymous mutation in one of nine categories of genes that are almost certainly relevant for pathogenesis, including transcription-factor fusions (18% of cases), the gene encoding nucleophosmin (NPM1) (27%), tumor-suppressor genes (16%), DNA-methylation–related genes (44%), signaling genes (59%), chromatin-modifying genes (30%), myeloid transcription-factor genes (22%), cohesin-complex genes (13%), and spliceosome-complex genes (14%). Patterns of cooperation and mutual exclusivity suggested strong biologic relationships among several of the genes and categories. CONCLUSIONS: We identified at least one potential driver mutation in nearly all AML samples and found that a complex interplay of genetic events contributes to AML pathogenesis in individual patients. The databases from this study are widely available to serve as a foundation for further investigations of AML pathogenesis, classification, and risk stratification. (Funded by the National Institutes of Health.) The molecular pathogenesis of acute myeloid leukemia (AML) has been studied with the use of cytogenetic analysis for more than three decades. 
Recurrent chromosomal structural variations are well established as diagnostic and prognostic markers, suggesting that acquired genetic abnormalities (i.e., somatic mutations) have an essential role in pathogenesis. 1,2 However, nearly 50% of AML samples have a normal karyotype, and many of these genomes lack structural abnormalities, even when assessed with high-density comparative genomic hybridization or single-nucleotide polymorphism (SNP) arrays 3-5 (see Glossary). Targeted sequencing has identified recurrent mutations in FLT3, NPM1, KIT, CEBPA, and TET2. 6-8 Massively parallel sequencing enabled the discovery of recurrent mutations in DNMT3A 9,10 and IDH1. 11 Recent studies have shown that many patients with
- Published
- 2013
42. Distinct evolutionary trajectories of primary high-grade serous ovarian cancers revealed through spatial mutational profiling
- Author
-
Ali Bashashati, Stephen Yip, Kane Tse, Karey Shumansky, Jiarui Ding, Margaret Luk, Michael S. Anglesio, Nataliya Melnyk, Yongjun Zhao, Gavin Ha, Alicia A. Tone, David G. Huntsman, Sohrab P. Shah, Jamie Rosner, Thomas Zeng, Janine Senz, Leah M Prentice, Richard A. Moore, Blake Gilks, Andrew Roth, Melissa K. McConechy, Steve E. Kalloger, Jessica N. McAlpine, Winnie Yang, and Marco A. Marra
- Subjects
DNA Mutational Analysis ,Copy number analysis ,Drug Resistance ,Gene Dosage ,clonal evolution ,Biology ,Real-Time Polymerase Chain Reaction ,Somatic evolution in cancer ,Gene dosage ,Deep sequencing ,Pathology and Forensic Medicine ,03 medical and health sciences ,0302 clinical medicine ,Germline mutation ,high-grade serous ovarian cancer ,Humans ,Exome sequencing ,030304 developmental biology ,Aged ,Neoplasm Staging ,Genetics ,Ovarian Neoplasms ,0303 health sciences ,Gene Expression Profiling ,intratumoural heterogeneity ,Genetic Variation ,Middle Aged ,Original Papers ,3. Good health ,Clone Cells ,Cystadenocarcinoma, Serous ,Gene expression profiling ,Gene Expression Regulation, Neoplastic ,Serous fluid ,030220 oncology & carcinogenesis ,Disease Progression ,Female - Abstract
High-grade serous ovarian cancer (HGSC) is characterized by poor outcome, often attributed to the emergence of treatment-resistant subclones. We sought to measure the degree of genomic diversity within primary, untreated HGSCs to examine the natural state of tumour evolution prior to therapy. We performed exome sequencing, copy number analysis, targeted amplicon deep sequencing and gene expression profiling on 31 spatially and temporally separated HGSC tumour specimens (six patients), including ovarian masses, distant metastases and fallopian tube lesions. We found widespread intratumoural variation in mutation, copy number and gene expression profiles, with key driver alterations in genes present in only a subset of samples (eg PIK3CA, CTNNB1, NF1). On average, only 51.5% of mutations were present in every sample of a given case (range 10.2-91.4%), with TP53 as the only somatic mutation consistently present in all samples. Complex segmental aneuploidies, such as whole-genome doubling, were present in a subset of samples from the same individual, with divergent copy number changes segregating independently of point mutation acquisition. Reconstruction of evolutionary histories showed one patient with mixed HGSC and endometrioid histology, with common aetiologic origin in the fallopian tube and subsequent selection of different driver mutations in the histologically distinct samples. In this patient, we observed mixed cell populations in the early fallopian tube lesion, indicating that diversity arises at early stages of tumourigenesis. Our results revealed that HGSCs exhibit highly individual evolutionary trajectories and diverse genomic tapestries prior to therapy, exposing an essential biological characteristic to inform future design of personalized therapeutic solutions and investigation of drug-resistance mechanisms.
- Published
- 2013
43. Barnacle: detecting and characterizing tandem duplications and fusions in transcriptome assemblies
- Author
-
Shaun D. Jackman, S. Cenk Sahinalp, Inanc Birol, Jenny Q. Qian, Sherry Wang, Pamela A. Hoodless, Nina Thiessen, Jeremy Parker, Andrew J. Mungall, Lucas Swanson, Yaron S. Butterfield, Readman Chiu, Richard A. Moore, Anthony Raymond, Donna E. Hogge, Richard Varhol, Yongjun Zhao, Aly Karsan, Ka Ming Nip, Angela Tam, Richard Corbett, Deniz Yorukoglu, Sandy Sung, Gordon Robertson, T. Roderick Docking, Karen Mungall, Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science, and Yorukoglu, Deniz
- Subjects
Statistics as Topic ,RNA-Seq ,Genomics ,Breast Neoplasms ,Biology ,Proteomics ,Fusion gene ,03 medical and health sciences ,Chimera (genetics) ,0302 clinical medicine ,Gene Duplication ,Gene duplication ,Genetics ,Humans ,RNA, Messenger ,Chimeric transcripts ,Fusion ,Internal tandem duplication ,030304 developmental biology ,Partial tandem duplication ,0303 health sciences ,Transcriptome assembly ,Methodology Article ,Gene Expression Profiling ,Molecular Sequence Annotation ,Exons ,3. Good health ,Gene expression profiling ,Leukemia, Myeloid, Acute ,PTD ,ITD ,030220 oncology & carcinogenesis ,DNA microarray ,RNA-seq ,Gene Fusion ,Transcriptome ,Biotechnology - Abstract
Background: Chimeric transcripts, including partial and internal tandem duplications (PTDs, ITDs) and gene fusions, are important in the detection, prognosis, and treatment of human cancers. Results: We describe Barnacle, a production-grade analysis tool that detects such chimeras in de novo assemblies of RNA-seq data, and supports prioritizing them for review and validation by reporting the relative coverage of co-occurring chimeric and wild-type transcripts. We demonstrate applications in large-scale disease studies, by identifying PTDs in MLL, ITDs in FLT3, and reciprocal fusions between PML and RARA, in two deeply sequenced acute myeloid leukemia (AML) RNA-seq datasets. Conclusions: Our analyses of real and simulated data sets show that, with appropriate filter settings, Barnacle makes highly specific predictions for three types of chimeric transcripts that are important in a range of cancers: PTDs, ITDs, and fusions. High specificity makes manual review and validation efficient, which is necessary in large-scale disease studies. Characterizing an extended range of chimera types will help generate insights into progression, treatment, and outcomes for complex diseases. Funding: Simon Fraser University. Bioinformatics for Combating Infectious Disease Project, Simon Fraser University (Graduate Fellowship), Pacific Century Institute (Graduate Scholarship), Genome Canada (Firm), Canadian Institutes of Health Research, Genome British Columbia (Firm) (Grant #121AML), Provincial Health Services Authority (British Columbia, Canada), BC Cancer Foundation
- Published
- 2013
44. DNA template strand sequencing of single-cells maps genomic rearrangements at high resolution
- Author
-
Steven S.S. Poon, Peter M. Lansdorp, Mark Hills, Ashley D. Sanders, Ulrike Naumann, Martin Hirst, Ester Falconer, Elizabeth A. Chavez, Yongjun Zhao, Faculteit Medische Wetenschappen/UMCG, and Damage and Repair in Cancer Development and Cancer Treatment (DARE)
- Subjects
INSTABILITY ,RECOMBINATION ,Genomics ,Sister chromatid exchange ,Biology ,MOUSE ,Biochemistry ,Genome ,Article ,SISTER-CHROMATID EXCHANGE ,03 medical and health sciences ,Mice ,0302 clinical medicine ,MAMMALIAN-CELLS ,Sister chromatids ,Animals ,Molecular Biology ,Cells, Cultured ,030304 developmental biology ,Sequence (medicine) ,Genetics ,0303 health sciences ,Contig ,IDENTIFICATION ,Cell Biology ,Sequence Analysis, DNA ,Templates, Genetic ,CANCER ,EVOLUTION ,EMBRYONIC STEM-CELLS ,Sister Chromatid Exchange ,030217 neurology & neurosurgery ,Recombination ,Biotechnology ,Reference genome - Abstract
DNA rearrangements such as sister chromatid exchanges (SCEs) are sensitive indicators of genomic stress and instability, but they are typically masked by single-cell sequencing techniques. We developed Strand-seq to independently sequence parental DNA template strands from single cells, making it possible to map SCEs at orders-of-magnitude greater resolution than was previously possible. On average, murine embryonic stem (mES) cells exhibit eight SCEs, which are detected at a resolution of up to 23 bp. Strikingly, Strand-seq of 62 single mES cells predicts that the mm9 mouse reference genome assembly contains at least 17 incorrectly oriented segments totaling nearly 1% of the genome. These misoriented contigs and fragments have persisted through several iterations of the mouse reference genome and have been difficult to detect using conventional sequencing techniques. The ability to map SCE events at high resolution and fine-tune reference genomes by Strand-seq dramatically expands the scope of single-cell sequencing.
- Published
- 2012
45. Abstract 5226: Genomic analysis of pancreatic ductal adenocarcinoma in a patient with MUTYH-associated polyposis
- Author
-
Peter Eirew, Daniel J. Renouf, Robert A. Holt, Janessa Laskin, Richard A. Moore, Stephen Yip, Jacquie Schein, Robyn Roscoe, David F. Schaeffer, Marco A. Marra, Yaoqing Shen, Carol Cremin, Aly Karsan, Alexandra Fok, Steven J.M. Jones, Kasmintan A. Schrader, Martin Jones, Gillian Mitchell, Andrew J. Mungall, Carolyn Chu’ng, Yongjun Zhao, Joanna M. Karasinska, Eric Y. Zhao, Hui-li Wong, Tom Thomson, Howard John Lim, Yussanne Ma, and Sean D. Young
- Subjects
Loss of heterozygosity ,Genetics ,Cancer Research ,Cancer prevention ,Oncology ,MUTYH ,DNA repair ,MUTYH-Associated Polyposis ,Cancer research ,Base excision repair ,Biology ,Gene ,Germline - Abstract
Biallelic pathogenic germline variants in the DNA repair glycosylase, MUTYH, cause MUTYH-associated polyposis, characterised by an increased susceptibility to colorectal adenomas and carcinomas secondary to defective base excision repair. We report a patient diagnosed with Stage IIB distal pancreatic ductal adenocarcinoma (PDAC) at the age of 45 years. Prior colonoscopy and gastroscopy noted three colonic tubular adenomas and a gastric fundic gland polyp. The patient was consented to whole genome and transcriptome sequencing of the PDAC and matched normal blood DNA through the British Columbia Personalized Onco-Genomics (POG) program. Analysis of germline and somatic variants including single nucleotide variants, copy number determination, loss of heterozygosity detection and mutational signatures was undertaken. Expression fold-changes were calculated against Illumina BodyMap pancreatic tissue averages and compared against The Cancer Genome Atlas PDAC cases. Germline analysis revealed biallelic mutations in the MUTYH gene. In light of this patient's personal and family history of adenomatous colon polyps, clinic-initiated panel testing of 14 cancer susceptibility genes, including MUTYH, via Illumina sequencing with reflex Sanger confirmation revealed the same biallelic MUTYH changes. Analysis of the patient's PDAC revealed a base excision repair pathway signature, demonstrated by an increased frequency of C:G>A:T transversions, consistent with deficient MUTYH activity. This is the first association of germline MUTYH biallelic pathogenic variants with PDAC and provides evidence of the contribution of aberrant MUTYH function to the genomic landscape of a PDAC. 
Detection of the base excision repair mutational signature may be a sensitive way to screen tumors for aberrant MUTYH function that can reveal potential germline MUTYH-related cancer susceptibility, and allow inference of pathogenicity of detected MUTYH variants, which may have cancer prevention and therapeutic implications. Citation Format: Kasmintan A. Schrader, Carolyn Chu’ng, Eric Zhao, Hui-li Wong, Yaoqing Shen, Martin Jones, Tom Thomson, Howard Lim, Sean Young, Carol Cremin, Robert Holt, Peter Eirew, Joanna Karasinska, Jacquie Schein, Yongjun Zhao, Andy Mungall, Richard Moore, Yussanne Ma, Alexandra Fok, Robyn Roscoe, Stephen Yip, Gillian Mitchell, Aly Karsan, Steven Jones, David Schaeffer, Janessa Laskin, Marco Marra, Daniel Renouf. Genomic analysis of pancreatic ductal adenocarcinoma in a patient with MUTYH-associated polyposis. [abstract]. In: Proceedings of the 107th Annual Meeting of the American Association for Cancer Research; 2016 Apr 16-20; New Orleans, LA. Philadelphia (PA): AACR; Cancer Res 2016;76(14 Suppl):Abstract nr 5226.
- Published
- 2016
46. Molecular etiology of an indolent lymphoproliferative disorder determined by whole-genome sequencing
- Author
-
Janessa Laskin, Aly Karsan, Kerry J. Savage, Marco A. Marra, Joanna Wegrzyn-Woltosz, Jacqueline E. Schein, Andrew P. Weng, Steven J.M. Jones, Randy D. Gascoyne, Richard A. Moore, Yongjun Zhao, Yaoqing Shen, Yvonne Y. Li, Jeremy Parker, and Erin Pleasance
- Subjects
Research Report ,Whole genome sequencing ,Genetics ,Mutation ,Lymphocytosis ,Chronic lymphocytic leukemia ,General Medicine ,lymphocytosis ,Biology ,medicine.disease_cause ,medicine.disease ,DNA sequencing ,Lymphoma ,Gene expression profiling ,increase in B cell number ,immune system diseases ,hemic and lymphatic diseases ,Gene expression ,medicine ,medicine.symptom - Abstract
In an attempt to assess potential treatment options, whole-genome and transcriptome sequencing were performed on a patient with an unclassifiable small lymphoproliferative disorder. Variants from genome sequencing were prioritized using a combination of comparative variant distributions in a spectrum of lymphomas, and meta-analyses of gene expression profiling. In this patient, the molecular variants that we believe to be most relevant to the disease presentation most strongly resemble a diffuse large B-cell lymphoma (DLBCL), whereas the gene expression data are most consistent with a low-grade chronic lymphocytic leukemia (CLL). The variant of greatest interest was a predicted NOTCH2-truncating mutation, which has been recently reported in various lymphomas.
- Published
- 2016
47. The clonal and mutational evolution spectrum of primary triple-negative breast cancers
- Author
-
Jiarui Ding, Philippe Gascard, Samuel Aparicio, Mahvash Sigaroudinia, Annie Moradian, Gavin Ha, Angela Burleigh, Oscar M. Rueda, Sambasivarao Damaraju, Paul D.P. Pharoah, Christina Curtis, Leah M Prentice, Peter H. Watson, Damian Yap, Suet-Feung Chin, Rodrigo Goya, Kelly Hoon, Inanc Birol, Sohrab P. Shah, Connie J. Eaves, Karen A. Gelmon, Steven J.M. Jones, Noreen Dhalla, Gulisa Turashvili, Andrew Roth, Yongjun Zhao, Alireza Heravi-Moussavi, Irmtraud M. Meyer, Gregg B. Morin, Ali Bashashati, Anamaria Crisan, Richard Varhol, John R. Mackey, Joseph F. Costello, Carlos Caldas, Jaswinder Khattra, S.-W. Grace Cheng, Timothy T. Harkins, Simon K. Chan, Jamie Rosner, Daniel Lai, Arusha Oloumi, Kane Tse, David G. Huntsman, Malachi Griffith, Vasisht Tadigotla, Stephen Chia, Ryan Giuliany, Virginie Bernard, Kevin C. Ma, Thea D. Tlsty, Martin Hirst, Karey Shumansky, Andrew McPherson, Thomas Zeng, Wyeth W. Wasserman, Angela Tam, Gholamreza Haffari, and Marco A. Marra
- Subjects
Genotype ,DNA Copy Number Variations ,Somatic cell ,Evolution ,General Science & Technology ,Population ,DNA Mutational Analysis ,Breast Neoplasms ,Biology ,medicine.disease_cause ,Article ,Evolution, Molecular ,Viewpoint ,INDEL Mutation ,Breast Cancer ,medicine ,PTEN ,Humans ,Point Mutation ,Allele ,Precision Medicine ,education ,Alleles ,Cancer ,Genetics ,Mutation ,education.field_of_study ,Neoplastic ,screening and diagnosis ,Multidisciplinary ,Sequence Analysis, RNA ,Point mutation ,Gene Expression Profiling ,Reproducibility of Results ,Molecular ,High-Throughput Nucleotide Sequencing ,Clone Cells ,4.1 Discovery and preclinical testing of markers and technologies ,Gene expression profiling ,Gene Expression Regulation, Neoplastic ,Detection ,Gene Expression Regulation ,biology.protein ,Disease Progression ,RNA ,Female ,Sequence Analysis - Abstract
Primary triple negative breast cancers (TNBC) represent approximately 16% of all breast cancers1 and are a tumour type defined by exclusion, for which comprehensive landscapes of somatic mutation have not been determined. Here we show in 104 early TNBC cases, that at the time of diagnosis these cancers exhibit a wide and continuous spectrum of genomic evolution, with some exhibiting only a handful of somatic aberrations in a few pathways, whereas others contain hundreds of somatic events and multiple pathways implicated. Integration with matched whole transcriptome sequence data revealed that only ~36% of mutations are expressed. By examining single nucleotide variant (SNV) allelic abundance derived from deep re-sequencing (median >20,000 fold) measurements in 2414 somatic mutations, we determine for the first time in an epithelial tumour, the relative abundance of clonal genotypes among cases in the population. We show that TNBC vary widely and continuously in their clonal frequencies at the time of diagnosis, with basal subtype TNBC2,3 exhibiting more variation than non-basal TNBC. Although p53 and PIK3CA/PTEN somatic mutations appear clonally dominant compared with other pathways, in some tumours their clonal frequencies are incompatible with founder status. Mutations in cytoskeletal and cell shape/motility proteins occurred at lower clonal frequencies, suggesting they occurred later during tumour progression. Taken together our results show that future attempts to dissect the biology and therapeutic responses of TNBC will require the determination of individual tumour clonal genotypes.
- Published
- 2012
48. The Sensitivity of Massively Parallel Sequencing for Detecting Candidate Infectious Agents Associated with Human Tissue
- Author
-
Richard A. Moore, René L. Warren, Yongjun Zhao, Jan M. Friedman, Caroline Chénard, Curtis A. Suttle, Robert A. Holt, Julia A. Gustavsen, and J. Douglas Freeman
- Subjects
Biophysics ,lcsh:Medicine ,Sequence alignment ,Biology ,Infections ,Biochemistry ,Microbiology ,Models, Biological ,Sensitivity and Specificity ,Massively parallel signature sequencing ,Transcriptomes ,Transcriptome ,Data sequences ,DNA amplification ,Genome Analysis Tools ,Limit of Detection ,Virology ,Humans ,RNA Viruses ,Genomic library ,Genome Sequencing ,RNA, Messenger ,lcsh:Science ,Genetics ,Multidisciplinary ,Massive parallel sequencing ,Base Sequence ,cDNA library ,lcsh:R ,Computational Biology ,High-Throughput Nucleotide Sequencing ,DNA ,Genomics ,Highly sensitive ,Nucleic acids ,Viral Disease Diagnosis ,Infectious Diseases ,Medicine ,RNA, Viral ,lcsh:Q ,Metagenomics ,Genome Expression Analysis ,Research Article - Abstract
Massively parallel sequencing technology now provides the opportunity to sample the transcriptome of a given tissue comprehensively. Transcripts at only a few copies per cell are readily detectable, allowing the discovery of low abundance viral and bacterial transcripts in human tissue samples. Here we describe an approach for mining large sequence data sets for the presence of microbial sequences. Further, we demonstrate the sensitivity of this approach by sequencing human RNA-seq libraries spiked with decreasing amounts of an RNA-virus. At a modest depth of sequencing, viral transcripts can be detected at frequencies less than 1 in 1,000,000. With current sequencing platforms approaching outputs of one billion reads per run, this is a highly sensitive method for detecting putative infectious agents associated with human tissues.
- Published
- 2011
49. Whole-genome sequencing and social-network analysis of a tuberculosis outbreak
- Author
-
Victoria J. Cook, Richard D. Moore, Meenu K. Sharma, Robert A. Holt, Jennifer L. Gardy, Marcus Lem, Patrick Tang, James C. Johnston, Richard Varhol, Kevin Elwood, Fiona S. L. Brinkman, Shannan J. Ho Sui, Steven J.M. Jones, Yongjun Zhao, Robert C. Brunham, Inanc Birol, S Rempel, Elizabeth Brodkin, and Lena Shah
- Subjects
Adult ,Male ,Tuberculosis ,Genotype ,Genome ,Polymorphism, Single Nucleotide ,Disease Outbreaks ,Mycobacterium tuberculosis ,Cocaine-Related Disorders ,Young Adult ,Risk Factors ,Surveys and Questionnaires ,Medicine ,Humans ,Genotyping ,Phylogeny ,Whole genome sequencing ,Genetics ,biology ,British Columbia ,business.industry ,Incidence ,Outbreak ,Social Support ,General Medicine ,Sequence Analysis, DNA ,Middle Aged ,medicine.disease ,biology.organism_classification ,Female ,Contact Tracing ,business ,Contact tracing ,Genome, Bacterial - Abstract
An outbreak of tuberculosis occurred over a 3-year period in a medium-size community in British Columbia, Canada. The results of mycobacterial interspersed repetitive unit-variable-number tandem-repeat (MIRU-VNTR) genotyping suggested the outbreak was clonal. Traditional contact tracing did not identify a source. We used whole-genome sequencing and social-network analysis in an effort to describe the outbreak dynamics at a higher resolution. We sequenced the complete genomes of 32 Mycobacterium tuberculosis outbreak isolates and 4 historical isolates (from the same region but sampled before the outbreak) with matching genotypes, using short-read sequencing. Epidemiologic and genomic data were overlaid on a social network constructed by means of interviews with patients to determine the origins and transmission dynamics of the outbreak. Whole-genome data revealed two genetically distinct lineages of M. tuberculosis with identical MIRU-VNTR genotypes, suggesting two concomitant outbreaks. Integration of social-network and phylogenetic analyses revealed several transmission events, including those involving "superspreaders." Both lineages descended from a common ancestor and had been detected in the community before the outbreak, suggesting a social, rather than genetic, trigger. Further epidemiologic investigation revealed that the onset of the outbreak coincided with a recorded increase in crack cocaine use in the community. Through integration of large-scale bacterial whole-genome sequencing and social-network analysis, we show that a socioenvironmental factor, most likely increased crack cocaine use, triggered the simultaneous expansion of two extant lineages of M. tuberculosis that was sustained by key members of a high-risk social network. Genotyping and contact tracing alone did not capture the true dynamics of the outbreak. (Funded by Genome British Columbia and others.)
- Published
- 2011
50. Characterization of the contradictory chromatin signatures at the 3' exons of zinc finger genes
- Author
-
Lorigail Echipare, Joseph F. Costello, Lei Dou, Martin Hirst, Marco A. Marra, Yongjun Zhao, Kimberly R. Blahnik, Peggy J. Farnham, Sushma Iyengar, Erica Sanchez, Ian F Korf, Henriette O'Geen, and Wutz, Anton
- Subjects
Epigenomics ,Gene Expression ,Transcriptomes ,Histones ,Exon ,0302 clinical medicine ,Molecular Cell Biology ,Genome Sequencing ,Genetics ,Zinc finger ,0303 health sciences ,Multidisciplinary ,Chromosome Biology ,Histone Modification ,Zinc Fingers ,Genomics ,Exons ,Chromatin ,Functional Genomics ,Histone ,030220 oncology & carcinogenesis ,Medicine ,Epigenetics ,Tandem exon duplication ,Research Article ,Chromosome Structure and Function ,General Science & Technology ,1.1 Normal biological development and functioning ,Science ,Biology ,Methylation ,Molecular Genetics ,03 medical and health sciences ,Genomic Imprinting ,Underpinning research ,Genome Analysis Tools ,Humans ,Gene Regulation ,Gene ,030304 developmental biology ,Lysine ,Human Genome ,Computational Biology ,Promoter ,Human Genetics ,HEK293 Cells ,biology.protein ,Structural Genomics ,Generic health relevance ,Genomic imprinting - Abstract
The H3K9me3 histone modification is often found at promoter regions, where it functions to repress transcription. However, we have previously shown that 3′ exons of zinc finger genes (ZNFs) are marked by high levels of H3K9me3. We have now further investigated this unusual location for H3K9me3 in ZNF genes. Neither bioinformatic nor experimental approaches support the hypothesis that the 3′ exons of ZNFs are promoters. We further characterized the histone modifications at the 3′ ZNF exons and found that these regions also contain H3K36me3, a mark of transcriptional elongation. A genome-wide analysis of ChIP-seq data revealed that ZNFs constitute the majority of genes that have high levels of both H3K9me3 and H3K36me3. These results suggested the possibility that the ZNF genes may be imprinted, with one allele transcribed and one allele repressed. To test the hypothesis that the contradictory modifications are due to imprinting, we used a SNP analysis of RNA-seq data to demonstrate that both alleles of certain ZNF genes having H3K9me3 and H3K36me3 are transcribed. We next analyzed isolated ZNF 3′ exons using stably integrated episomes. We found that although the H3K36me3 mark was lost when the 3′ ZNF exon was removed from its natural genomic location, the isolated ZNF 3′ exons retained the H3K9me3 mark. Thus, the H3K9me3 mark at ZNF 3′ exons does not impede transcription and it is regulated independently of the H3K36me3 mark. Finally, we demonstrate a strong relationship between the number of tandemly repeated domains in the 3′ exons and the H3K9me3 mark. We suggest that the H3K9me3 at ZNF 3′ exons may function to protect the genome from inappropriate recombination rather than to regulate transcription.
- Published
- 2011