27 results on '"T Roderick Docking"'
Search Results
2. Control of focal adhesion kinase activation by RUNX1-regulated miRNAs in high-risk AML
- Author
-
Vijay Suresh Akhade, Tian Liu, T. Roderick Docking, Jihong Jiang, Aparna Gopal, and Aly Karsan
- Subjects
Cancer Research; Oncology; Hematology
- Published
- 2023
3. KLEAT: Cleavage Site Analysis of Transcriptomes.
- Author
-
Inanç Birol, Anthony Raymond, Readman Chiu, Ka Ming Nip, Shaun D. Jackman, Maayan Kreitzman, T. Roderick Docking, Catherine A. Ennis, A. Gordon Robertson, and Aly Karsan
- Published
- 2015
4. Loss of FBXO11 function establishes a stem cell program in acute myeloid leukemia through dysregulation of the mitochondrial protease LONP1
- Author
-
Angela Ya-Chi Mo, Hayle Kincross, Xuan Wang, Linda Ya-Ting Chang, Gerben Duns, Harwood Kwan, Tammy Lau, T. Roderick Docking, Jessica Tran, Shane Colborne, Se-Wing Grace Cheng, Shujun Huang, Nadia Gharaee, Elijah Willie, Jihong Jiang, Jeremy Parker, Joshua Bridgers, Davis Wood, Ramon Klein Geltink, Gregg B. Morin, and Aly Karsan
- Abstract
Acute myeloid leukemia (AML) is an aggressive cancer with very poor outcomes. Analysis of sequencing data from 1,727 unique AML patients revealed frequent mutations in ubiquitin ligase family genes. Loss of function of the Skp1/Cul1/Fbox (SCF) E3 ubiquitin ligase complex genes is found in 8-9% of adult AML patients, including recurrent mutations in FBXO11. FBXO11 is the most significantly downregulated gene of the SCF complex in AML samples. Depletion of Fbxo11 promotes myeloid-biased stem cell maintenance and cooperates with AML1-ETO and mutant KRAS to generate serially transplantable mouse and human AML in in vivo models. FBXO11 mediates K63-linked polyubiquitination of the LONP1 mitochondrial protease, and loss of FBXO11 impairs LONP1 activity, thereby reducing mitochondrial membrane potential, imparting stem cell properties and driving leukemogenesis. Our findings suggest that loss of FBXO11 function primes HSPC for myeloid-biased self-renewal through attenuation of LONP1-mediated regulation of mitochondrial function.
- Published
- 2022
5. Altered microRNA expression links IL6 and TNF-induced inflammaging with myeloid malignancy in humans and mice
- Author
-
Aparna Gopal, Megan Fuller, Yu Deng, Mark Boldin, Jennifer M Grants, Aly Karsan, Joanna Wegrzyn, David J.H.F. Knapp, Connie J. Eaves, T. Roderick Docking, Tony Hui, Jenny Li, Marion Shadbolt, Kieran O'Neill, Jeremy Parker, and Martin Hirst
- Subjects
Adult; Male; Aging; Myeloid; Adolescent; Immunology; Inflammation; Biochemistry; Mice; Young Adult; microRNA; medicine; Animals; Humans; Cell Self Renewal; Interleukin 6; Cellular Senescence; Aged; Mice, Knockout; biology; Interleukin-6; Tumor Necrosis Factor-alpha; NF-kappa B; Hematopoietic stem cell; Cell Differentiation; Cell Biology; Hematology; DNA Methylation; Middle Aged; Hematopoietic Stem Cells; Leukemia, Myeloid, Acute; MicroRNAs; Haematopoiesis; medicine.anatomical_structure; DNA methylation; biology.protein; Cancer research; Cytokines; Female; Tumor necrosis factor alpha; Single-Cell Analysis; medicine.symptom; Transcriptome; BLOOD Commentary
- Abstract
Aging is associated with significant changes in the hematopoietic system, including increased inflammation, impaired hematopoietic stem cell (HSC) function, and increased incidence of myeloid malignancy. Inflammation of aging (“inflammaging”) has been proposed as a driver of age-related changes in HSC function and myeloid malignancy, but mechanisms linking these phenomena remain poorly defined. We identified loss of miR-146a as driving aging-associated inflammation in AML patients. miR-146a expression declined in old wild-type mice, and loss of miR-146a promoted premature HSC aging and inflammation in young miR-146a–null mice, preceding development of aging-associated myeloid malignancy. Using single-cell assays of HSC quiescence, stemness, differentiation potential, and epigenetic state to probe HSC function and population structure, we found that loss of miR-146a depleted a subpopulation of primitive, quiescent HSCs. DNA methylation and transcriptome profiling implicated NF-κB, IL6, and TNF as potential drivers of HSC dysfunction, activating an inflammatory signaling relay promoting IL6 and TNF secretion from mature miR-146a−/− myeloid and lymphoid cells. Reducing inflammation by targeting Il6 or Tnf was sufficient to restore single-cell measures of miR-146a−/− HSC function and subpopulation structure and reduced the incidence of hematological malignancy in miR-146a−/− mice. miR-146a−/− HSCs exhibited enhanced sensitivity to IL6 stimulation, indicating that loss of miR-146a affects HSC function via both cell-extrinsic inflammatory signals and increased cell-intrinsic sensitivity to inflammation. Thus, loss of miR-146a regulates cell-extrinsic and -intrinsic mechanisms linking HSC inflammaging to the development of myeloid malignancy.
- Published
- 2020
6. Loss of lenalidomide-induced megakaryocytic differentiation leads to therapy resistance in del(5q) myelodysplastic syndrome
- Author
-
Austin G. Kulasekararaj, Martin Jädersten, Patricia Umlandt, Jihong Jiang, Aly Karsan, Sergio Martinez-Høyer, Megan Fuller, Uwe Platzbecker, Nadia Gharaee, Luca Malcovati, T. Roderick Docking, Eva Hellström-Lindberg, Yu Deng, Angela Mo, Alan F. List, Jenny Li, and Jeremy Parker
- Subjects
0303 health sciences; Mutation; Megakaryocyte differentiation; Myelodysplastic syndromes; GATA2; Context (language use); Cell Biology; Biology; medicine.disease; medicine.disease_cause; 3. Good health; Cell biology; 03 medical and health sciences; chemistry.chemical_compound; 0302 clinical medicine; Downregulation and upregulation; RUNX1; chemistry; hemic and lymphatic diseases; 030220 oncology & carcinogenesis; medicine; Cancer research; 030304 developmental biology; Lenalidomide; medicine.drug
- Abstract
Interstitial deletion of the long arm of chromosome 5 (del(5q)) is the most common structural genomic variant in myelodysplastic syndromes (MDS)1. Lenalidomide (LEN) is the treatment of choice for patients with del(5q) MDS, but half of the responding patients become resistant2 within 2 years. TP53 mutations are detected in ~20% of LEN-resistant patients3. Here we show that patients who become resistant to LEN harbour recurrent variants of TP53 or RUNX1. LEN upregulated RUNX1 protein and function in a CRBN- and TP53-dependent manner in del(5q) cells, and mutation or downregulation of RUNX1 rendered cells resistant to LEN. LEN induced megakaryocytic differentiation of del(5q) cells followed by cell death that was dependent on calpain activation and CSNK1A1 degradation4,5. We also identified GATA2 as a LEN-responsive gene that is required for LEN-induced megakaryocyte differentiation. Megakaryocytic gene-promoter analyses suggested that LEN-induced degradation of IKZF1 enables a RUNX1-GATA2 complex to drive megakaryocytic differentiation. Overexpression of GATA2 restored LEN sensitivity in the context of RUNX1 or TP53 mutations by enhancing LEN-induced megakaryocytic differentiation. Screening for mutations that block LEN-induced megakaryocytic differentiation should identify patients who are resistant to LEN.
- Published
- 2020
7. Genomic testing in myeloid malignancy
- Author
-
Aly Karsan and T. Roderick Docking
- Subjects
Myeloid Malignancy; medicine.medical_specialty; Myeloid; Clinical Biochemistry; Disease; Computational biology; 030204 cardiovascular system & hematology; Biology; DNA sequencing; 03 medical and health sciences; 0302 clinical medicine; Biomarkers, Tumor; medicine; Clinical genetic; Humans; Genetic Predisposition to Disease; Genetic Testing; Genetic testing; Myeloproliferative Disorders; medicine.diagnostic_test; business.industry; Biochemistry (medical); Cytogenetics; Genomics; Sequence Analysis, DNA; Hematology; General Medicine; medicine.anatomical_structure; Hematologic Neoplasms; Personalized medicine; business; 030215 immunology
- Abstract
Clinical genetic testing in the myeloid malignancies is undergoing a rapid transition from the era of cytogenetics and single-gene testing to an era dominated by next-generation sequencing (NGS). This transition promises to better reveal the genetic alterations underlying disease, but there are distinct risks and benefits associated with different NGS testing platforms. NGS offers the potential benefit of being able to survey alterations across a wider set of genes, but analytic and clinical challenges associated with incidental findings, germ line variation, turnaround time, and limits of detection must be addressed. Additionally, transcriptome-based testing may offer several distinct benefits beyond traditional DNA-based methods. In addition to testing at disease diagnosis, research indicates potential benefits of genetic testing both prior to disease onset and at remission. In this review, we discuss the transition from the era of cytogenetics and single-gene tests to the era of NGS panels and genome-wide sequencing, highlighting both the potential and the drawbacks of these novel technologies.
- Published
- 2019
8. Additional file 4 of Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera
- Author
-
Hesse-Orce, Uljana, DiGuistini, Scott, Keeling, Christopher I, Wang, Ye, Li, Maria, Henderson, Hannah, Docking, T Roderick, Liao, Nancy Y, Robertson, Gordon, Holt, Robert A, Jones, Steven JM, Bohlmann, Jörg, and Breuil, Colette
- Subjects
Data_FILES
- Abstract
Authors’ original file for figure 3
- Published
- 2021
9. Additional file 2 of Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera
- Author
-
Hesse-Orce, Uljana, DiGuistini, Scott, Keeling, Christopher I, Wang, Ye, Li, Maria, Henderson, Hannah, Docking, T Roderick, Liao, Nancy Y, Robertson, Gordon, Holt, Robert A, Jones, Steven JM, Bohlmann, Jörg, and Breuil, Colette
- Subjects
Data_FILES
- Abstract
Authors’ original file for figure 1
- Published
- 2021
10. Additional file of Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera
- Author
-
Hesse-Orce, Uljana, DiGuistini, Scott, Keeling, Christopher I, Wang, Ye, Li, Maria, Henderson, Hannah, Docking, T Roderick, Liao, Nancy Y, Robertson, Gordon, Holt, Robert A, Jones, Steven JM, Bohlmann, Jörg, and Breuil, Colette
- Subjects
complex mixtures
- Abstract
Additional file of Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera
- Published
- 2021
11. Additional file 5 of Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera
- Author
-
Hesse-Orce, Uljana, DiGuistini, Scott, Keeling, Christopher I, Wang, Ye, Li, Maria, Henderson, Hannah, Docking, T Roderick, Liao, Nancy Y, Robertson, Gordon, Holt, Robert A, Jones, Steven JM, Bohlmann, Jörg, and Breuil, Colette
- Subjects
Data_FILES
- Abstract
Authors’ original file for figure 4
- Published
- 2021
12. Additional file 3 of Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera
- Author
-
Hesse-Orce, Uljana, DiGuistini, Scott, Keeling, Christopher I, Wang, Ye, Li, Maria, Henderson, Hannah, Docking, T Roderick, Liao, Nancy Y, Robertson, Gordon, Holt, Robert A, Jones, Steven JM, Bohlmann, Jörg, and Breuil, Colette
- Subjects
Data_FILES
- Abstract
Authors’ original file for figure 2
- Published
- 2021
13. Applications of Bayesian network models in predicting types of hematological malignancies
- Author
-
T. Roderick Docking, Aly Karsan, Amir Foroushani, Rupesh Agrahari, Monika Hudoba, Gerben Duns, Habil Zare, and Linda Chang
- Subjects
0301 basic medicine; Computer science; Gene regulatory network; lcsh:Medicine; Machine learning; computer.software_genre; Article; Bioconductor; 03 medical and health sciences; Bayes' theorem; Gene expression; Humans; Gene Regulatory Networks; lcsh:Science; Gene; Multidisciplinary; business.industry; Microarray analysis techniques; Sequence Analysis, RNA; lcsh:R; Bayesian network; Bayes Theorem; 3. Good health; Gene expression profiling; Gene Expression Regulation, Neoplastic; Leukemia, Myeloid, Acute; 030104 developmental biology; Hematologic Neoplasms; Myelodysplastic Syndromes; lcsh:Q; Artificial intelligence; business; Transcriptome; Classifier (UML); computer
- Abstract
Network analysis is the preferred approach for the detection of subtle but coordinated changes in expression of an interacting and related set of genes. We introduce a novel method based on the analyses of coexpression networks and Bayesian networks, and we use this new method to classify two types of hematological malignancies; namely, acute myeloid leukemia (AML) and myelodysplastic syndrome (MDS). Our classifier has an accuracy of 93%, a precision of 98%, and a recall of 90% on the training dataset (n = 366); which outperforms the results reported by other scholars on the same dataset. Although our training dataset consists of microarray data, our model has a remarkable performance on the RNA-Seq test dataset (n = 74, accuracy = 89%, precision = 88%, recall = 98%), which confirms that eigengenes are robust with respect to expression profiling technology. These signatures are useful in classification and correctly predicting the diagnosis. They might also provide valuable information about the underlying biology of diseases. Our network analysis approach is generalizable and can be useful for classifying other diseases based on gene expression profiles. Our previously published Pigengene package is publicly available through Bioconductor, which can be used to conveniently fit a Bayesian network to gene expression data.
- Published
- 2018
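Note on entry 13: the accuracy, precision, and recall figures quoted in the abstract are presumably the standard confusion-matrix ratios. The sketch below shows those definitions only; the function name and the counts in the usage note are hypothetical and are not taken from the paper or from the Pigengene package.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Return accuracy, precision, and recall from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives,
    and false negatives for a two-class problem (for example AML vs. MDS).
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fp) == 0 or (tp + fn) == 0:
        raise ValueError("counts must give non-zero denominators")
    return {
        # Fraction of all cases that were labelled correctly.
        "accuracy": (tp + tn) / total,
        # Fraction of predicted positives that are truly positive.
        "precision": tp / (tp + fp),
        # Fraction of true positives that were recovered (sensitivity).
        "recall": tp / (tp + fn),
    }
```

With illustrative counts, `classification_metrics(8, 2, 9, 1)` gives an accuracy of 17/20, a precision of 8/10, and a recall of 8/9.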
14. Assessing Limit of Detection in Clinical Sequencing
- Author
-
Sarah A. Munro, Ian Bosdet, Lucas Swanson, Richard A. Moore, T. Roderick Docking, Elizabeth Starks, and Aly Karsan
- Subjects
0301 basic medicine; Computer science; Computational biology; Polymerase Chain Reaction; Polymorphism, Single Nucleotide; Sensitivity and Specificity; Deep sequencing; Pathology and Forensic Medicine; 03 medical and health sciences; 0302 clinical medicine; Limit of Detection; Neoplasms; Humans; Sensitivity (control systems); Solid tumor; Alleles; Models, Statistical; High-Throughput Nucleotide Sequencing; Reproducibility of Results; Variant allele; Limiting; Assay sensitivity; DNA; Genomics; 030104 developmental biology; 030220 oncology & carcinogenesis; Mutation; Molecular Medicine
- Abstract
Clinical reporting of solid tumor sequencing requires reliable assessment of the accuracy and reproducibility of each assay. Somatic mutation variant allele fractions may be below 10% in many samples due to sample heterogeneity, tumor clonality, and/or sample degradation in fixatives such as formalin. The toolkits available to the clinical sequencing community for correlating assay design parameters with assay sensitivity remain limited, and large-scale empirical assessments are often relied upon due to the lack of clear theoretical grounding. To address this uncertainty, a theoretical model was developed for predicting the expected variant calling sensitivity for a given library complexity and sequencing depth. Binomial models were found to be appropriate when assay sensitivity was only limited by library complexity or sequencing depth, but functional scaling for library complexity was necessary when both library complexity and sequencing depth were co-limiting. This model was empirically validated with sequencing experiments by using a series of DNA input amounts and sequencing depths. Based on these findings, a workflow is proposed for determining the limiting factors to sensitivity in different assay designs, and the formulas for these scenarios are presented. The approach described here provides designers of clinical assays with the methods to theoretically predict assay design outcomes a priori, potentially reducing burden in clinical tumor assay design and validation efforts.
- Published
- 2019
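Note on entry 14: the abstract states that binomial models are appropriate when sensitivity is limited by sequencing depth or library complexity alone. A minimal sketch of that simple case follows, assuming a variant is called when at least `min_alt_reads` supporting reads are seen. The function name, the calling threshold, and the parameter names are assumptions for illustration; the paper's own formulas, including its scaling for the co-limited case, are not reproduced here.

```python
from math import comb


def detection_sensitivity(depth: int, vaf: float, min_alt_reads: int) -> float:
    """P(at least min_alt_reads variant reads) with reads ~ Binomial(depth, vaf).

    depth          number of independent reads covering the site
    vaf            true variant allele fraction, between 0 and 1
    min_alt_reads  hypothetical calling threshold (an assumption of this sketch)
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be in [0, 1]")
    if depth < 0:
        raise ValueError("depth must be non-negative")
    if min_alt_reads <= 0:
        return 1.0
    # Probability of seeing fewer than min_alt_reads variant reads.
    p_miss = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min(min_alt_reads, depth + 1))
    )
    return 1.0 - p_miss
```

Under this model, raising depth at a fixed allele fraction and threshold raises the predicted sensitivity, which is the depth-limited behaviour the abstract describes.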
15. Sample Tracking Using Unique Sequence Controls
- Author
-
Richard A. Moore, Irene Li, Robert A. Holt, Lucas Swanson, Ian Bosdet, Elizabeth Starks, Andrew J. Mungall, Aly Karsan, Thomas Zeng, Sarah A. Munro, Kane Tse, T. Roderick Docking, and Yaron S.N. Butterfield
- Subjects
0301 basic medicine; Computer science; Sequence analysis; Pipeline (computing); Sample (statistics); Computational biology; DNA sequencing; Pathology and Forensic Medicine; 03 medical and health sciences; 0302 clinical medicine; Humans; Genomic library; Genotyping; Sequence (medicine); Gene Library; Computational Biology; High-Throughput Nucleotide Sequencing; Reproducibility of Results; Sequence Analysis, DNA; DNA Contamination; Reference Standards; 030104 developmental biology; 030220 oncology & carcinogenesis; Identity (object-oriented programming); Molecular Medicine; Databases, Nucleic Acid
- Abstract
Sample tracking and identity are essential when processing multiple samples in parallel. Sequencing applications often involve high sample numbers, and the data are frequently used in a clinical setting. As such, a simple and accurate intrinsic sample tracking process through a sequencing pipeline is essential. Various solutions have been implemented to verify sample identity, including variant detection at the start and end of the pipeline using arrays or genotyping, bioinformatic comparisons, and optical barcoding of samples. None of these approaches are optimal. To establish a more effective approach using genetic barcoding, we developed a panel of unique DNA sequences cloned into a common vector. A unique DNA sequence is added to the sample when it is first received and can be detected by PCR and/or sequencing at any stage of the process. The control sequences are approximately 200 bases long with low identity to any sequence in the National Center for Biotechnology Information nonredundant database (30 bases) and contain no long homopolymer (7) stretches. When a spiked next-generation sequencing library is sequenced, sequence reads derived from this control sequence are generated along with the standard sequencing run and are used to confirm sample identity and determine cross-contamination levels. This approach is used in our targeted clinical diagnostic whole-genome and RNA-sequencing pipelines and is an inexpensive, flexible, and platform-agnostic solution.
- Published
- 2019
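Note on entry 15: one of the design constraints named in the abstract is that control sequences contain no long homopolymer stretches. A rough sketch of how such a check might look is given below. The function names are hypothetical, and the run-length limit is left as a caller-supplied parameter because the listing does not make the exact cutoff clear.

```python
from itertools import groupby


def longest_homopolymer_run(seq: str) -> int:
    """Length of the longest run of one repeated base (case-insensitive)."""
    return max(
        (sum(1 for _ in group) for _, group in groupby(seq.upper())),
        default=0,  # an empty sequence has no runs
    )


def passes_homopolymer_filter(seq: str, max_run: int) -> bool:
    """True if no base repeats more than max_run times in a row.

    max_run is an assumed, caller-chosen limit, not a value from the paper.
    """
    return longest_homopolymer_run(seq) <= max_run
```

A candidate control sequence would be screened with `passes_homopolymer_filter(candidate, max_run)` before the separate database-identity check that the abstract also mentions.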
16. Loss of lenalidomide-induced megakaryocytic differentiation leads to therapy resistance in del(5q) myelodysplastic syndrome
- Author
-
Sergio Martinez-Høyer, Yu Deng, Jeremy Parker, Jihong Jiang, Angela Mo, T Roderick Docking, Nadia Gharaee, Jenny Li, Patricia Umlandt, Megan Fuller, Martin Jädersten, Austin Kulasekararaj, Luca Malcovati, Alan F List, Eva Hellström-Lindberg, Uwe Platzbecker, and Aly Karsan
- Subjects
GATA2 Transcription Factor; HEK293 Cells; Myelodysplastic Syndromes; Core Binding Factor Alpha 2 Subunit; Mutation; Chromosomes, Human, Pair 5; Down-Regulation; Humans; Cell Differentiation; Tumor Suppressor Protein p53; Lenalidomide; Megakaryocytes; Cell Line
- Abstract
Interstitial deletion of the long arm of chromosome 5 (del(5q)) is the most common structural genomic variant in myelodysplastic syndromes (MDS)
- Published
- 2019
17. Fixation Effects on Variant Calling in a Clinical Resequencing Panel
- Author
-
Megan Fuller, Lucas Swanson, Shyong Quin Yap, T. Roderick Docking, Aly Karsan, Wei Xiong, Jillian Slind, Chen Zhou, Carl J. Brown, Blair Walker, Douglas Filipenko, Elizabeth Starks, Jeremy Parker, Manoj J. Raval, Ahmer A. Karimuddin, and P. Terry Phang
- Subjects
0301 basic medicine; Paraffin Embedding; Tissue Fixation; Molecular genetic test; Normal colon; High-Throughput Nucleotide Sequencing; Computational biology; Sequence Analysis, DNA; Biology; Immunohistochemistry; Polymorphism, Single Nucleotide; Pathology and Forensic Medicine; 03 medical and health sciences; 030104 developmental biology; 0302 clinical medicine; 030220 oncology & carcinogenesis; Molecular Medicine; Humans; Fixative; Fixation (histology)
- Abstract
Formalin fixation is the standard method for the preservation of tissue for diagnostic purposes, including pathologic review and molecular assays. However, this method is known to cause artifacts that can affect the accuracy of molecular genetic test results. We assessed the applicability of alternative fixatives to determine whether these perform significantly better on next-generation sequencing assays, and whether adequate morphology is retained for primary diagnosis, in a prospective study using a clinical-grade, laboratory-developed targeted resequencing assay. Several parameters relating to sequencing quality and variant calling were examined and quantified in tumor and normal colon epithelial tissues. We identified an alternative fixative that suppresses many formalin-related artifacts while retaining adequate morphology for pathologic review.
- Published
- 2018
18. Large-scale gene network analysis reveals the significance of extracellular matrix pathway and homeobox genes in acute myeloid leukemia: an introduction to the Pigengene package and its applications
- Author
-
Aly Karsan, Habil Zare, T. Roderick Docking, Linda Chang, Rupesh Agrahari, Gerben Duns, Monika Hudoba, and Amir Foroushani
- Subjects
0301 basic medicine; Gene regulatory network; RNA-Seq; Computational biology; Biology; Bioconductor; 03 medical and health sciences; Hematological malignancy; Risk Factors; Genetics; Humans; Homeobox; Gene Regulatory Networks; Genetics(clinical); Genetics (clinical); Oligonucleotide Array Sequence Analysis; Homeodomain Proteins; Leukemia; Sequence Analysis, RNA; Gene Expression Profiling; Genes, Homeobox; Myeloid leukemia; Extracellular matrix; Human genetics; 3. Good health; Gene expression profiling; Leukemia, Myeloid, Acute; homeobox A9; 030104 developmental biology; Myelodysplastic Syndromes; Cancer research; Network analysis; Gene expression; DNA microarray; Research Article
- Abstract
Background: The distinct types of hematological malignancies have different biological mechanisms and prognoses. For instance, myelodysplastic syndrome (MDS) is generally indolent and low risk; however, it may transform into acute myeloid leukemia (AML), which is much more aggressive. Methods: We develop a novel network analysis approach that uses expression of eigengenes to delineate the biological differences between these two diseases. Results: We find that specific genes in the extracellular matrix pathway are underexpressed in AML. We validate this finding in three ways: (a) We train our model on a microarray dataset of 364 cases and test it on an RNA-Seq dataset of 74 cases. Our model showed 95% sensitivity and 86% specificity in the training dataset and showed 98% sensitivity and 91% specificity in the test dataset. This confirms that the identified biological signatures are independent from the expression profiling technology and independent from the training dataset. (b) Immunocytochemistry confirms that MMP9, an exemplar protein in the extracellular matrix, is underexpressed in AML. (c) MMP9 is hypermethylated in the majority of AML cases (n=194, Welch’s t-test p-value
- Published
- 2017
19. Kleat: cleavage site analysis of transcriptomes
- Author
-
Maayan Kreitzman, Anthony Raymond, A. Gordon Robertson, Readman Chiu, Shaun D. Jackman, T. Roderick Docking, Inanc Birol, Catherine A. Ennis, Aly Karsan, and Ka Ming Nip
- Subjects
Genetics; Untranslated region; Binding Sites; Polyadenylation; Sequence Analysis, RNA; Alternative splicing; Sequence assembly; Computational Biology; Biology; Cleavage (embryo); ENCODE; Article; Cell Line; Transcriptome; ROC Curve; Humans; Relevant information; 3' Untranslated Regions; Sequence Alignment; Gene Library
- Abstract
In eukaryotic cells, alternative cleavage of 3’ untranslated regions (UTRs) can affect transcript stability, transport and translation. For polyadenylated (poly(A)) transcripts, cleavage sites can be characterized with short-read sequencing using specialized library construction methods. However, for large-scale cohort studies as well as for clinical sequencing applications, it is desirable to characterize such events using RNA-seq data, as the latter are already widely applied to identify other relevant information, such as mutations, alternative splicing and chimeric transcripts. Here we describe KLEAT, an analysis tool that uses de novo assembly of RNA-seq data to characterize cleavage sites on 3’ UTRs. We demonstrate the performance of KLEAT on three cell line RNA-seq libraries constructed and sequenced by the ENCODE project, and assembled using Trans-ABySS. Validating the KLEAT predictions with matched ENCODE RNA-seq and RNA-PET libraries, we show that the tool has over 90% positive predictive value when there are at least three RNA-seq reads supporting a poly(A) tail and requiring at least three RNA-PET reads mapping within 100 nucleotides as validation. We also compare the performance of KLEAT with other popular RNA-seq analysis pipelines that reconstruct 3’ UTR ends, and show that it performs favourably, based on an ROC-like curve.
- Published
- 2015
20. Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species
- Author
-
Dent Earl, Jacob O. Kitzman, Iain MacCallum, James R. Knight, Jacques Corbeil, Elenie Godzaridis, Cristian Del Fabbro, Paul J. Kersey, Martin Hunt, Octávio S. Paulo, Joseph Fass, Isaac Ho, Michael C. Schatz, Erich D. Jarvis, Dominique Lavenier, Simone Scalabrin, Thomas D. Otto, Nicolas Maillet, Siu-Ming Yiu, Timothy I. Shaw, David B. Jaffe, Henry Song, Ruibang Luo, Steve Goldstein, David Haussler, Francisco Pina-Martins, Richard A. Gibbs, Adam M. Phillippy, Michael Bechner, Ganeshkumar Ganapathy, Stephen Richards, Riccardo Vicedomini, Shuangye Yin, François Laviolette, Yingrui Li, T. Roderick Docking, Binghang Liu, Carson Qu, Wen-Chi Chou, Hao Zhang, Nuno A. Fonseca, Dariusz Przybylski, Bruno Vieira, Yue Liu, Matthew D. MacManes, Sébastien Boisvert, Yujian Shi, Jared T. Simpson, Sergey Kazakov, Sergey Koren, Jarrod Chapman, Giles Hall, Paul Baranay, Sante Gnerre, Shiguo Zhou, Rayan Chikhi, Filipe J. Ribeiro, Jason T. Howard, Zhenyu Li, Pavel Fedotov, Jay Shendure, J. Graham Ruby, Joseph B. Hiatt, Benedict Paten, Ian F Korf, David C. Schwartz, Keith Bradnam, Jianying Yuan, Alexey Sergushichev, Jun Wang, Hamidreza Chitsaz, Daniel S. Rokhsar, Inanc Birol, Huaiyang Jiang, Kim C. Worley, Anton Alexandrov, Zemin Ning, Delphine Naquin, Michael Place, Matthias Haimel, Guojie Zhang, Guillaume Chapuis, Fedor Tsarev, Scott J. Emrich, Shaun D. Jackman, Sergey Melnikov, Xiang Qin, Ted Sharpe, Francesco Vezzi, Tak-Wah Lam, Richard Durbin
(UNIV-RENNES)-École normale supérieure - Rennes (ENS Rennes)-Université de Bretagne Sud (UBS)-Centre National de la Recherche Scientifique (CNRS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA), Howard Hughes Medical Institute (HHMI)-University of California [Santa Cruz] (UCSC), Universidade do Porto, Harvard University [Cambridge]-Massachusetts Institute of Technology (MIT), AGROCAMPUS OUEST, Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Université de Rennes 2 (UR2), Université de Rennes (UNIV-RENNES)-École normale supérieure - Rennes (ENS Rennes)-Centre National de la Recherche Scientifique (CNRS)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-AGROCAMPUS OUEST, Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Inria Rennes – Bretagne Atlantique, and University of California
- Subjects
[INFO.INFO-AR]Computer Science [cs]/Hardware Architecture [cs.AR] ,Computer science ,Sequence assembly ,GENOMES ,Health Informatics ,Computational biology ,Assessment ,COMPASS ,Genome ,03 medical and health sciences ,0302 clinical medicine ,biology.animal ,Quantitative Biology - Genomics ,Gene ,030304 developmental biology ,Sequence (medicine) ,Scaffolds ,Genomics (q-bio.GN) ,Whole genome sequencing ,0303 health sciences ,Genome assembly ,Heterozygosity ,biology ,N50 ,Research ,Vertebrate ,De novo Assembly ,Computer Science Applications ,Fosmid ,FOS: Biological sciences ,Scalability ,030217 neurology & neurosurgery - Abstract
Background - The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions are intended to assess current state-of-the-art methods in genome assembly. Results - In Assemblathon 2, we provided a variety of sequence data to be assembled for three vertebrate species (a bird, a fish, and a snake). This resulted in a total of 43 submitted assemblies from 21 participating teams. We evaluated these assemblies using a combination of optical map data, Fosmid sequences, and several statistical methods. From over 100 different metrics, we chose ten key measures by which to assess the overall quality of the assemblies. Conclusions - Many current genome assemblers produced useful assemblies, containing a significant representation of their genes, regulatory sequences, and overall genome structure. However, the high degree of variability between the entries suggests that there is still much room for improvement in the field of genome assembly and that approaches which work well in assembling the genome of one species may not necessarily work well for another. Comment: Additional files available at http://korflab.ucdavis.edu/Datasets/Assemblathon/Assemblathon2/Additional_files/ Major changes: (1) accessions for the 3 read data sets have now been included; (2) new file: spreadsheet containing details of all Study, Sample, Run, and Experiment identifiers; (3) miscellaneous changes made to address reviewers' comments. DOIs added to GigaDB datasets.
- Published
- 2013
21. Barnacle: detecting and characterizing tandem duplications and fusions in transcriptome assemblies
- Author
-
Shaun D. Jackman, S. Cenk Sahinalp, Inanc Birol, Jenny Q. Qian, Sherry Wang, Pamela A. Hoodless, Nina Thiessen, Jeremy Parker, Andrew J. Mungall, Lucas Swanson, Yaron S. Butterfield, Readman Chiu, Richard A. Moore, Anthony Raymond, Donna E. Hogge, Richard Varhol, Yongjun Zhao, Aly Karsan, Ka Ming Nip, Angela Tam, Richard Corbett, Deniz Yorukoglu, Sandy Sung, Gordon Robertson, T. Roderick Docking, and Karen Mungall; Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science
- Subjects
Statistics as Topic ,RNA-Seq ,Genomics ,Breast Neoplasms ,Biology ,Proteomics ,Fusion gene ,03 medical and health sciences ,Chimera (genetics) ,0302 clinical medicine ,Gene Duplication ,Gene duplication ,Genetics ,Humans ,RNA, Messenger ,Chimeric transcripts ,Fusion ,Internal tandem duplication ,030304 developmental biology ,Partial tandem duplication ,0303 health sciences ,Transcriptome assembly ,Methodology Article ,Gene Expression Profiling ,Molecular Sequence Annotation ,Exons ,3. Good health ,Gene expression profiling ,Leukemia, Myeloid, Acute ,PTD ,ITD ,030220 oncology & carcinogenesis ,DNA microarray ,RNA-seq ,Gene Fusion ,Transcriptome ,Biotechnology - Abstract
Background: Chimeric transcripts, including partial and internal tandem duplications (PTDs, ITDs) and gene fusions, are important in the detection, prognosis, and treatment of human cancers. Results: We describe Barnacle, a production-grade analysis tool that detects such chimeras in de novo assemblies of RNA-seq data, and supports prioritizing them for review and validation by reporting the relative coverage of co-occurring chimeric and wild-type transcripts. We demonstrate applications in large-scale disease studies, by identifying PTDs in MLL, ITDs in FLT3, and reciprocal fusions between PML and RARA, in two deeply sequenced acute myeloid leukemia (AML) RNA-seq datasets. Conclusions: Our analyses of real and simulated data sets show that, with appropriate filter settings, Barnacle makes highly specific predictions for three types of chimeric transcripts that are important in a range of cancers: PTDs, ITDs, and fusions. High specificity makes manual review and validation efficient, which is necessary in large-scale disease studies. Characterizing an extended range of chimera types will help generate insights into progression, treatment, and outcomes for complex diseases. Funding: Simon Fraser University, Bioinformatics for Combating Infectious Disease Project; Simon Fraser University (Graduate Fellowship); Pacific Century Institute (Graduate Scholarship); Genome Canada (Firm); Canadian Institutes of Health Research; Genome British Columbia (Firm) (Grant #121AML); Provincial Health Services Authority (British Columbia, Canada); BC Cancer Foundation
- Published
- 2013
22. A clinically validated diagnostic second-generation sequencing assay for detection of hereditary BRCA1 and BRCA2 mutations
- Author
-
Martin Hirst, Sean S. Young, Richard A. Moore, Robert A. Holt, Inanc Birol, Steven J.M. Jones, Aly Karsan, Thomas Zeng, Miruna Bala, Marco A. Marra, Ian Bosdet, Andrew J. Mungall, Yaron S. Butterfield, Robin Coope, Katie Chow, Erika Yorida, and T. Roderick Docking
- Subjects
Sanger sequencing ,Genetics ,Base Sequence ,DNA Mutational Analysis ,Genes, BRCA2 ,Genes, BRCA1 ,High-Throughput Nucleotide Sequencing ,Sequence alignment ,Biology ,Sensitivity and Specificity ,Pathology and Forensic Medicine ,symbols.namesake ,Gene Frequency ,symbols ,Molecular Medicine ,Hereditary Breast and Ovarian Cancer Syndrome ,Humans ,Genomic library ,Prospective Studies ,Allele frequency ,Gene ,Exome sequencing ,Hereditary Breast Cancer ,Sequence (medicine) ,Gene Library - Abstract
Individuals who inherit mutations in BRCA1 or BRCA2 are predisposed to breast and ovarian cancers. However, identifying mutations in these large genes by conventional dideoxy sequencing in a clinical testing laboratory is both time consuming and costly, and similar challenges exist for other large genes, or sets of genes, with relevance in the clinical setting. Second-generation sequencing technologies have the potential to improve the efficiency and throughput of clinical diagnostic sequencing, once clinically validated methods become available. We have developed a method for detection of variants based on automated small-amplicon PCR followed by sample pooling and sequencing with a second-generation instrument. To demonstrate the suitability of this method for clinical diagnostic sequencing, we analyzed the coding exons and the intron–exon boundaries of BRCA1 and BRCA2 in 91 hereditary breast cancer patient samples. Our method generated high-quality sequence coverage across all targeted regions, with median coverage greater than 4000-fold for each sample in pools of 24. Sensitive and specific automated variant detection, without false-positive or false-negative results, was accomplished with a standard software pipeline using bwa for sequence alignment and samtools for variant detection. We experimentally derived a minimum threshold of 100-fold sequence depth for confident variant detection. The results demonstrate that this method is suitable for sensitive, automatable, high-throughput sequence variant detection in the clinical laboratory.
- Published
- 2012
23. Assemblathon 1: A competitive assessment of de novo short read assembly methods
- Author
-
Inanc Birol, Peter Skewes-Cox, Dariusz Przybylski, Paul J. Kersey, Guillaume Chapuis, Matthias Haimel, Sante Gnerre, Mark Diekhans, Petr Kosarev, Richard E. Green, Igor Seledtsov, Richard Durbin, Daniel S. Rokhsar, Miguel Betegon, Ngan Nguyen, Isaac Ho, J. Graham Ruby, Michelle Dimon, Aaron E. Darling, Ricardo H. Ramirez-Gonzalez, Daniel R. Zerbino, David R. Kelley, Ruibang Luo, Ian F Korf, Richard M. Leggett, Timothy I. Shaw, Keith Bradnam, Benedict Paten, Yingrui Li, Giles Hall, Vince Buffalo, Yinlong Xie, Shuangye Yin, Xiaoqiu Huang, Shaun D. Jackman, Victor V. Solovyev, Denis Vorobyev, Binghang Liu, T. Roderick Docking, Joseph L. DeRisi, Delphine Naquin, Ted Sharpe, Adam M. Phillippy, Mario Caccamo, Zemin Ning, Wing-Kin Sung, Jarrod Chapman, David B. Jaffe, John St. John, Pramila N. Ariyaratne, David Haussler, Fangfang Xia, Wen-Chi Chou, Sergey Koren, Dan MacLean, Joseph Fass, Rayan Chikhi, Iain MacCallum, Nicolas Maillet, Hung On Ken Yu, Nuno A. Fonseca, Dominique Lavenier, Michael C. Schatz, Dent Earl, Anuj Srivastava, Dawei Lin, Zhenyu Li, Wei Wu, Filipe J. Ribeiro, Shiaw-Pyng Yang, Jared T. 
Simpson, Center for Biomolecular Science and Engineering, University of California [Santa Cruz] (UC Santa Cruz), University of California (UC)-University of California (UC), Biomolecular Engineering Department, Genome Center [UC Davis], University of California [Davis] (UC Davis), Bioinformatics Core [University California Davis] (UC Davis), Computational and Mathematical Biology, Genome Institute of Singapore (GIS), School of computing [Singapore] (NUS), National University of Singapore (NUS), The Wellcome Trust Sanger Institute [Cambridge], European Bioinformatics Institute [Hinxton] (EMBL-EBI), EMBL Heidelberg, Center for Research in Advanced Computing Systems (CRACS INESC), Faculdade de Ciências da Universidade do Porto (FCUP), Universidade do Porto = University of Porto-Universidade do Porto = University of Porto, Genome Sciences Centre [Vancouver] (GSC), British Columbia Cancer Agency, DOE Joint Genome Institute [Walnut Creek], Department of Molecular & Cell Biology [Berkeley], University of California [Berkeley] (UC Berkeley), Biological systems and models, bioinformatics and sequences (SYMBIOSE), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria), Simons Center for Quantitative Biology [Cold Spring Harbor], Cold 
Spring Harbor Laboratory, Center for Bioinformatics and Computational Biology [Maryland] (CBCB), University of Maryland [College Park], University of Maryland System-University of Maryland System, National Biodefense Analysis and Countermeasures Center [Frederick], U.S. Social Security Administration, Monsanto Company, Institute of Bioinformatics [Georgia] (IOB), University of Georgia [USA], Howard Hughes Medical Institute [Chevy Chase] (HHMI), Howard Hughes Medical Institute (HHMI), Department of Biochemistry and Biophysics [San Francisco], University of California (UC), Biological and Medical Informatics [San Francisco] (BMI), University of California [San Francisco] (UC San Francisco), Department of Computer Science [Royal Holloway], Royal Holloway [University of London] (RHUL), Softberry Inc, Softberry, The Genome Analysis Centre (TGAC), Sainsbury Laboratory Cambridge University (SLCU), University of Cambridge [UK] (CAM), Computation Institute [Chicago], University of Chicago, Beijing Genomics Institute [Shenzhen] (BGI), Broad Institute [Cambridge], Harvard University-Massachusetts Institute of Technology (MIT), Department of Computer Science [Ames], Iowa State University (ISU), University of California [Santa Cruz] (UCSC), University of California-University of California, Bioinformatics Core [UC Davis], Universidade do Porto-Universidade do Porto, University of California [Berkeley], Université de Rennes 1 (UR1), Université de Rennes (UNIV-RENNES)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées (INSA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Université de Rennes 1 (UR1), Institut National des Sciences Appliquées (INSA)-Université de Rennes (UNIV-RENNES)-Institut National des Sciences Appliquées 
(INSA)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Inria Rennes – Bretagne Atlantique, University of California, University of California [San Francisco] (UCSF), and Massachusetts Institute of Technology (MIT)-Harvard University [Cambridge]
- Subjects
Genetics ,Resource ,0303 health sciences ,Genome ,Bioinformatics ,Sequence assembly ,Genomics ,Computational biology ,Sequence Analysis, DNA ,Biology ,Short read ,[SDV.BIBS]Life Sciences [q-bio]/Quantitative Methods [q-bio.QM] ,03 medical and health sciences ,0302 clinical medicine ,Code (cryptography) ,Benchmark (computing) ,Data set (IBM mainframe) ,Base calling ,[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM] ,030217 neurology & neurosurgery ,Genetics (clinical) ,030304 developmental biology - Abstract
Low-cost short read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark: (1) It is possible to assemble the genome to a high level of coverage and accuracy, and that (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies is now public and freely available from http://www.assemblathon.org/.
- Published
- 2011
24. Updated genome assembly and annotation of Paenibacillus larvae, the agent of American foulbrood disease of honey bees
- Author
-
Nancy Y. Liao, Jay D. Evans, Inanc Birol, Dirk C. de Graaf, T. Roderick Docking, R. Scott Cornman, Greg Taylor, Simon K. Chan, Leonard J. Foster, Queenie W.T. Chan, Shaun D. Jackman, and Steven J.M. Jones
- Subjects
DNA, Bacterial ,Proteomics ,American foulbrood ,lcsh:QH426-470 ,lcsh:Biotechnology ,Sequence assembly ,Genomics ,Genome ,BACILLUS-SUBTILIS ,03 medical and health sciences ,Paenibacillus ,lcsh:TP248.13-248.65 ,SURFACTIN ,Genetics ,Animals ,BIOSYNTHESIS ,PATHOGEN ,STAPHYLOCOCCUS-AUREUS ,030304 developmental biology ,2. Zero hunger ,Comparative genomics ,Comparative Genomic Hybridization ,0303 health sciences ,biology ,030306 microbiology ,Computational Biology ,Biology and Life Sciences ,PATHWAYS ,Molecular Sequence Annotation ,Sequence Analysis, DNA ,IN-VITRO ,Honey bee ,Bees ,biology.organism_classification ,PROTEASES ,lcsh:Genetics ,APIS-MELLIFERA ,BACTERIA ,Genome, Bacterial ,Research Article ,Biotechnology - Abstract
Background: As scientists continue to pursue various 'omics-based research, there is a need for high quality data for the most fundamental 'omics of all: genomics. The bacterium Paenibacillus larvae is the causative agent of the honey bee disease American foulbrood. If untreated, it can lead to the demise of an entire hive; the highly social nature of bees also leads to easy disease spread, between both individuals and colonies. Biologists have studied this organism since the early 1900s, and a century later, the molecular mechanism of infection remains elusive. Transcriptomics and proteomics, because of their ability to analyze multiple genes and proteins in a high-throughput manner, may be very helpful to its study. However, the power of these methodologies is severely limited without a complete genome; we undertake to address that deficiency here. Results: We used the Illumina GAIIx platform and conventional Sanger sequencing to generate a 182-fold sequence coverage of the P. larvae genome, and assembled the data using ABySS into a total of 388 contigs spanning 4.5 Mbp. Comparative genomics analysis against fully-sequenced soil bacteria P. JDR2 and P. vortex showed that regions of poor conservation may contain putative virulence factors. We used GLIMMER to predict 3568 gene models, and named them based on homology revealed by BLAST searches; proteases, hemolytic factors, toxins, and antibiotic resistance enzymes were identified in this way. Finally, mass spectrometry was used to provide experimental evidence that at least 35% of the genes are expressed at the protein level. Conclusions: This update on the genome of P. larvae and annotation represents an immense advancement from what we had previously known about this species. We provide here a reliable resource that can be used to elucidate the mechanism of infection, and by extension, more effective methods to control and cure this widespread honey bee disease.
- Published
- 2011
25. Gene discovery for the bark beetle-vectored fungal tree pathogen Grosmannia clavigera
- Author
-
Robert A. Holt, Scott DiGuistini, Christopher I. Keeling, Maria Li, T. Roderick Docking, Ye Wang, Gordon Robertson, Colette Breuil, Jörg Bohlmann, Hannah Henderson, Nancy Y. Liao, Uljana Hesse-Orce, and Steven J.M. Jones
- Subjects
0106 biological sciences ,lcsh:QH426-470 ,lcsh:Biotechnology ,Genes, Fungal ,Phloem ,Biology ,01 natural sciences ,Genome ,Trees ,03 medical and health sciences ,Gene Expression Regulation, Fungal ,lcsh:TP248.13-248.65 ,Databases, Genetic ,Botany ,Genetics ,Animals ,RNA, Messenger ,Gene ,Gene Library ,030304 developmental biology ,Expressed Sequence Tags ,Ophiostomatales ,0303 health sciences ,Expressed sequence tag ,Water transport ,Mycelium ,Plant Extracts ,Reverse Transcriptase Polymerase Chain Reaction ,Fungal genetics ,food and beverages ,Grosmannia clavigera ,Spores, Fungal ,15. Life on land ,Pinus ,biology.organism_classification ,Insect Vectors ,Coleoptera ,lcsh:Genetics ,Clavigera ,Plant Bark ,Metabolic Networks and Pathways ,Research Article ,010606 plant biology & botany ,Biotechnology ,Reference genome - Abstract
Background: Grosmannia clavigera is a bark beetle-vectored fungal pathogen of pines that causes wood discoloration and may kill trees by disrupting nutrient and water transport. Trees respond to attacks from beetles and associated fungi by releasing terpenoid and phenolic defense compounds. It is unclear which genes are important for G. clavigera's ability to overcome antifungal pine terpenoids and phenolics. Results: We constructed seven cDNA libraries from eight G. clavigera isolates grown under various culture conditions, and Sanger sequenced the 5' and 3' ends of 25,000 cDNA clones, resulting in 44,288 high quality ESTs. The assembled dataset of unique transcripts (unigenes) consists of 6,265 contigs and 2,459 singletons that mapped to 6,467 locations on the G. clavigera reference genome, representing ~70% of the predicted G. clavigera genes. Although only 54% of the unigenes matched characterized proteins at the NCBI database, this dataset extensively covers major metabolic pathways, cellular processes, and genes necessary for response to environmental stimuli and genetic information processing. Furthermore, we identified genes expressed in spores prior to germination, and genes involved in response to treatment with lodgepole pine phloem extract (LPPE). Conclusions: We provide a comprehensively annotated EST dataset for G. clavigera that represents a rich resource for gene characterization in this and other ophiostomatoid fungi. Genes expressed in response to LPPE treatment are indicative of fungal oxidative stress response. We identified two clusters of potentially functionally related genes responsive to LPPE treatment. Furthermore, we report a simple method for identifying contig misassemblies in de novo assembled EST collections caused by gene overlap on the genome.
- Published
- 2010
26. A regulatory toolbox of MiniPromoters to drive selective expression in the brain
- Author
-
Bibiana K. Y. Wong, Randy Glenn, Cletus D'Souza, Vik Chopra, Magdalena I. Swanson, Meifen Lu, Flora Liu, Douglas J. Swanson, Marko Milisavljevic, Diana L. Palmquist, Taryn G. Hearty, Jenna L. Turner, Jonathan S. Lim, Steven J.M. Jones, Debra L. Fulton, Erin K. Flynn, Wyeth W. Wasserman, Dan Goldowitz, Betty Palma, Sonia F. Black, David J. Arenillas, Steven Jiang, Jing M. Chen, Charles N. de Leeuw, Russell J. Bonaguro, Stéphanie Laprise, Kathleen G. Banks, Athena R. Ypsilanti, Jacek Mis, Mahsa Amirabbasi, Ivana Komljenovic, Katie O’Connor, Robert A. Holt, Nancy Y. Liao, Ying Chen, Russell F. Watkins, Stuart Lithwick, Shannan J. Ho Sui, Elodie Portales-Casamar, Richard Varhol, Bonny Tam, Jun Liu, Li Liu, Lisa Dreolini, Mauro Castellarin, Andrea J. McLeod, Amy Ticoll, Nazar Babyak, Kristi Hatakka, Elizabeth M. Simpson, Jason C. Y. Cheng, Gary M. Wilson, Siaw H. Wong, Erich Brauer, Jean-François Schmouth, Behzad Imanian, Melissa K. McConechy, Johar Ali, Shadi Khorasan-zadeh, Jenny Vermeulen, T. Roderick Docking, George S. Yang, Tony Wong, and Tara Candido
- Subjects
Genetically modified mouse ,Cellular differentiation ,Gene Expression ,Mice, Transgenic ,Computational biology ,Biology ,Regulatory Sequences, Nucleic Acid ,Mice ,Genes, Reporter ,Gene expression ,Databases, Genetic ,Animals ,Humans ,Gene Knock-In Techniques ,Promoter Regions, Genetic ,Gene ,Embryonic Stem Cells ,Genetics ,Neurons ,Reporter gene ,Multidisciplinary ,Gene Expression Profiling ,Brain ,Computational Biology ,Cell Differentiation ,Genomics ,Biological Sciences ,Embryonic stem cell ,Gene expression profiling ,Regulatory sequence - Abstract
The Pleiades Promoter Project integrates genomewide bioinformatics with large-scale knockin mouse production and histological examination of expression patterns to develop MiniPromoters and related tools designed to study and treat the brain by directed gene expression. Genes with brain expression patterns of interest are subjected to bioinformatic analysis to delineate candidate regulatory regions, which are then incorporated into a panel of compact human MiniPromoters to drive expression to brain regions and cell types of interest. Using single-copy, homologous-recombination “knockins” in embryonic stem cells, each MiniPromoter reporter is integrated immediately 5′ of the Hprt locus in the mouse genome. MiniPromoter expression profiles are characterized in differentiation assays of the transgenic cells or in mouse brains following transgenic mouse production. Histological examination of adult brains, eyes, and spinal cords for reporter gene activity is coupled to costaining with cell-type–specific markers to define expression. The publicly available Pleiades MiniPromoter Project is a key resource to facilitate research on brain development and therapies.
- Published
- 2010
27. Retrotransposon sequence variation in four asexual plant species
- Author
-
Fabienne E. Saade, Miranda C. Elliott, Daniel J. Schoen, and T. Roderick Docking
- Subjects
Genetics ,Retroelements ,Sequence Homology, Amino Acid ,Molecular Sequence Data ,food and beverages ,Genetic Variation ,Sequence alignment ,Retrotransposon ,Biology ,Plants ,Mating system ,Polymerase Chain Reaction ,Asexuality ,Evolution, Molecular ,Negative selection ,Apomixis ,Genetic variation ,Reproduction, Asexual ,Computer Simulation ,Amino Acid Sequence ,Selfish DNA ,Molecular Biology ,Sequence Alignment ,Ecology, Evolution, Behavior and Systematics - Abstract
Transposable elements (TEs) can be viewed as genetic parasites that persist in populations due to their capacity for increase in copy number and the inefficacy of selection against them. A corollary of this hypothesis is that TEs are more likely to spread within sexual populations and be eliminated or inactivated within asexual populations. While previous work with animals has shown that asexual taxa may contain less TE diversity than sexual taxa, comparable work with plants has been lacking. Here we report the results of a study of Ty1/copia, Ty3/gypsy, and LINE-like retroelement diversity in four asexual plant species. Retroelement-like sequences, with a high degree of conservation both within and between species, were isolated from all four species. The sequences correspond to several previously annotated retroelement subfamilies. They also exhibit a pattern of nucleotide substitution characterized by an excess of synonymous substitutions, suggestive of a history of purifying selection. These findings were compared with retroelement sequence evolution in sexual plant taxa. One likely explanation for the discovery of conserved TE sequences in the genomes of these asexual taxa is simply that asexuality within these taxa evolved relatively recently, such that the loss and breakdown of TEs is not yet detectable through analysis of sequence diversity. This explanation is examined by conducting stochastic simulation of TE evolution and by using published information to infer rough estimates of the ages of asexual taxa.
- Published
- 2004