Draft Genome Assembly, Organelle Genome Sequencing and Diversity Analysis of Marama Bean (Tylosema esculentum), the Green Gold of Africa
- Author
- Li, Jin
- Subjects
- Bioinformatics, Genetics, Plant Biology, Tylosema esculentum, legume, genome assembly, genome diversity, organelle genome, comparative genomics, mitogenome structure, heteroplasmy
- Abstract
Tylosema esculentum (marama bean) is an underutilized legume that has long been regarded as a potential local crop because of its rich nutritional value. The reference plastome and mitogenome were assembled using a hybrid method with both Illumina and PacBio data. Diversity was explored using WGS data from 84 samples collected at various geographic locations in Namibia and Pretoria. Phylogenetic analysis revealed two cytotypes with distinct plastomes and mitogenomes that show differing levels of variability. Deep sequencing identified heteroplasmy, with both types of organellar genomes present, albeit one at a very low frequency. The inheritance of this complex of organellar genomes appears to be fairly constant, which raises the question of how the two genomes co-exist and are propagated through generations. The type 1 mitogenome has two autonomous rings with a total length of 399,572 bp, which can be restructured into five smaller circular molecules through recombination on 3 pairs of long direct repeats. The type 2 mitogenome contains a unique 2,108 bp sequence, which connects distant segments to form a new structure consisting of three circular molecules and one linear chromosome. This rearrangement increased the copy number of nad9, rrns, rrn5, trnC, and trnfM. The two mitogenomes differed at another 230 loci, with only one nonsynonymous substitution, in matR. cpDNA insertions were concentrated in one subgenomic ring of the mitogenome, including a 9,798 bp fragment that contains potential psbC, rps14, psaA, and psaB pseudogenes. The two types of plastomes range in length from 161,537 bp to 161,580 bp and differ at 122 loci and by a 230 bp inversion. The chloroplast genes rpoC2, rpoB, and ndhD were found to be more diverse than other genes in the marama plastome.
A total of 21.6 Gb of PacBio HiFi data was assembled using Canu v2.2 into an unphased assembly of 1.24 Gb. k-mer analysis indicated that marama may be an ancient tetraploid with an estimated genome size of only 277 Mb. The resulting assembly has an N50 value of 1.28 Mb, with some contigs nearly half the length of the Bauhinia chromosome. BUSCO completeness was 99.5%, and repetitive sequences accounted for 27.35% of the assembly.
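For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is conventionally computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the marama assembly or its pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths for illustration only (not data from the study).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

In practice the input lengths would be read from the assembly FASTA file; that step is omitted here to keep the sketch self-contained.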
- Published
- 2023