1. Chromosome-level de novo genome assembly of Telopea speciosissima (New South Wales waratah) using long-reads, linked-reads and Hi-C.
- Author
-
Chen SH, Rossetto M, van der Merwe M, Lu-Irving P, Yap JS, Sauquet H, Bourke G, Amos TG, Bragg JG, and Edwards RJ
- Subjects
- Australia, Genome Size, Molecular Sequence Annotation, New South Wales, Chromosomes, Proteaceae genetics
- Abstract
Telopea speciosissima, the New South Wales waratah, is an Australian endemic woody shrub in the family Proteaceae. Waratahs have great potential as a model clade to better understand processes of speciation, introgression and adaptation, and are significant from a horticultural perspective. Here, we report the first chromosome-level genome for T. speciosissima. Combining Oxford Nanopore long-reads, 10x Genomics Chromium linked-reads and Hi-C data, the assembly spans 823 Mb (scaffold N50 of 69.0 Mb) with 97.8% of Embryophyta BUSCOs "Complete". We present a new method in Diploidocus (https://github.com/slimsuite/diploidocus) for classifying, curating and QC-filtering scaffolds, which combines read depths, k-mer frequencies and BUSCO predictions. We also present a new tool, DepthSizer (https://github.com/slimsuite/depthsizer), for genome size estimation from the read depth of single-copy orthologues and estimate the genome size to be approximately 900 Mb. The largest 11 scaffolds contained 94.1% of the assembly, conforming to the expected number of chromosomes (2n = 22). Genome annotation predicted 40,158 protein-coding genes, 351 rRNAs and 728 tRNAs. We investigated CYCLOIDEA (CYC) genes, which have a role in determination of floral symmetry, and confirm the presence of two copies in the genome. Read depth analysis of 180 "Duplicated" BUSCO genes using a new tool, DepthKopy (https://github.com/slimsuite/depthkopy), suggests almost all are real duplications, increasing confidence in the annotation and highlighting a possible need to revise the BUSCO set for this lineage. The chromosome-level T. speciosissima reference genome (Tspe_v1) provides an important new genomic resource of Proteaceae to support the conservation of flora in Australia and further afield., (© 2022 John Wiley & Sons Ltd.)
- Published
- 2022
- Full Text
- View/download PDF