1. Genomic exploration and molecular marker development in a large and complex conifer genome using RADseq and mRNAseq
- Author
M. Bou Dagher-Kharrat, Giovanni G. Vendramin, Marie-Joe Karam, Sara Pinosio, François Lefèvre, Unité de Recherches Forestières Méditerranéennes (URFM), Institut National de la Recherche Agronomique (INRA), laboratoire de Caractérisation Génomique des Plantes, Université Saint-Joseph de Beyrouth (USJ), Applied Genomics Institute (IGA), Institute of Biosciences and Bioresources (IBBR), Consiglio Nazionale delle Ricerche [Roma] (CNR), This project was supported by the Agropolis Fondation (Montpellier, France) under the reference ID 'BIOFIS' 1001-001, Istituto di Bioscienze e BioRisorse [Palermo] (IBBR), and Consiglio Nazionale delle Ricerche (CNR)
- Subjects
0106 biological sciences ,Genetic Markers ,DNA, Plant ,Molecular Sequence Data ,SNP ,Single-nucleotide polymorphism ,Computational biology ,Biology ,01 natural sciences ,Genome ,Polymorphism, Single Nucleotide ,DNA sequencing ,03 medical and health sciences ,chemistry.chemical_compound ,Cedrus atlantica ,next generation sequencing ,RADseq ,SSR ,transcriptome ,Molecular marker ,Genetics ,Genotyping ,Cedrus ,Ecology, Evolution, Behavior and Systematics ,030304 developmental biology ,0303 health sciences ,Contig ,High-Throughput Nucleotide Sequencing ,Sequence Analysis, DNA ,Restriction site ,chemistry ,Microsatellite ,[SDE.BE]Environmental Sciences/Biodiversity and Ecology ,Genome, Plant ,010606 plant biology & botany ,Biotechnology
- Abstract
We combined restriction site associated DNA sequencing (RADseq) using a hypomethylation-sensitive enzyme and messenger RNA sequencing (mRNAseq) to develop molecular markers for the 16 gigabase genome of Cedrus atlantica, a conifer tree species. With each method, Illumina® reads from one individual were used to generate de novo assemblies. SNPs from the RADseq data set were detected in a panel of one single individual and three pools of three individuals each. We developed a flexible script to estimate the ascertainment bias in SNP detection considering the pooling and sampling effects on the probability of not detecting an existing polymorphism. Gene Ontology (GO) and transposable element (TE) search analyses were applied to both data sets. The RADseq and the mRNAseq assemblies represented 0.1% and 0.6% of the genome, respectively. Genome complexity reduction resulted in 17% of the RADseq contigs potentially coding for proteins. This rate was doubled in the mRNAseq data set, suggesting that RADseq also explores noncoding low-repeat regions. The two methods gave very similar GO-slim profiles. As expected, the two assemblies were poor in TE-like sequences (
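The abstract mentions a script that estimates ascertainment bias from the probability of not detecting an existing polymorphism, given sampling and pooling effects. The record does not describe that script, so the following is only a minimal sketch of one way such probabilities could be computed, not the authors' method. It assumes random sampling of chromosomes from a population, equal contribution of each individual to a pool, and binomial sampling of reads. The function names and the `depth` and `min_reads` parameters are hypothetical.

```python
from math import comb


def p_sample_monomorphic(maf: float, n_diploid: int) -> float:
    """Sampling effect (assumed model): probability that all 2 * n_diploid
    sampled chromosomes carry the same allele, so a SNP with minor allele
    frequency `maf` is absent from the discovery panel."""
    n_chrom = 2 * n_diploid
    return (1.0 - maf) ** n_chrom + maf ** n_chrom


def p_allele_missed_in_pool(copies: int, n_diploid: int, depth: int,
                            min_reads: int = 1) -> float:
    """Pooling effect (assumed model): probability that an allele carried by
    `copies` of the pool's 2 * n_diploid chromosomes appears in fewer than
    `min_reads` of `depth` reads, with reads drawn binomially."""
    frac = copies / (2 * n_diploid)
    return sum(
        comb(depth, k) * frac ** k * (1.0 - frac) ** (depth - k)
        for k in range(min_reads)
    )


if __name__ == "__main__":
    # Panel size taken from the abstract: one individual plus three pools
    # of three individuals, i.e. 10 diploid individuals in total.
    for maf in (0.05, 0.1, 0.25, 0.5):
        print(f"maf={maf}: P(not in panel) = {p_sample_monomorphic(maf, 10):.4f}")
    # Hypothetical read depth of 20 and a two-read detection threshold.
    print(p_allele_missed_in_pool(copies=1, n_diploid=3, depth=20, min_reads=2))
```

Under these assumptions, rare alleles are the most likely to be missed, which is the direction of bias the abstract refers to. A fuller treatment would combine both effects across the individual and the three pools.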
- Published
- 2014