De novo assembly of the Indian blue peacock (Pavo cristatus) genome using Oxford Nanopore technology and Illumina sequencing
- Source :
- GigaScience
- Publication Year :
- 2019
- Publisher :
- Oxford University Press (OUP), 2019.
Abstract
- Background: The Indian peafowl (Pavo cristatus) is native to South Asia and is the national bird of India. Here we present a draft genome sequence of the male blue peacock using Illumina and Oxford Nanopore technology (ONT). Results: ONT sequencing gave ∼2.3-fold sequencing coverage, whereas Illumina generated 150–base pair paired-end sequence data at 284.6-fold coverage from 5 libraries. Subsequently, we generated a 0.915-gigabase pair de novo assembly of the peacock genome with a scaffold N50 of 0.23 megabase pairs (Mb). We predict that the peacock genome contains 23,153 protein-coding genes and 75.3 Mb (7.33%) of repetitive sequences. Conclusions: We report a high-quality assembly of the peacock genome using a hybrid approach that combines sequences generated by both Illumina and ONT. The long reads generated by ONT were useful for addressing challenges related to de novo assembly, particularly in regions containing repetitive sequences that span longer than the short-read length and that could not be resolved with short-read–based assembly alone. Contig assembly of Illumina short reads gave an N50 of 1,639 bases, whereas with ONT, the N50 increased by >9-fold to 14,749 bases. The initial contig assembly based on Illumina sequencing reads alone gave 685,241 contigs. Further scaffolding of the assembled contigs using both Illumina and ONT sequencing reads resulted in a final assembly of 15,025 super-scaffolds, with an N50 of ∼0.23 Mb. Ninety-five percent of proteins predicted by homology matched those in a public repository, verifying the completeness of our assembly. Consistent with other phylogenetic studies of avian conserved genes, we found P. cristatus to be most closely related to Gallus gallus, followed by Meleagris gallopavo and Anas platyrhynchos. Compared with the recently published peacock genome assembly, the current hybrid assembly is superior, with greater sequencing depth, fewer non-ATGC sequences, and fewer scaffolds.
- Subjects :
- 0106 biological sciences
Pavo cristatus
Sequence assembly
Health Informatics
Data Note
01 natural sciences
Genome
Homology (biology)
Deep sequencing
Avian Proteins
03 medical and health sciences
Animals
Galliformes
peacock
Phylogeny
Illumina dye sequencing
030304 developmental biology
Whole genome sequencing
0303 health sciences
Whole Genome Sequencing
Contig
Molecular Sequence Annotation
Indian national bird
Computer Science Applications
Nanopore Sequencing
Oxford Nanopore
Evolutionary biology
genome assembly
Nanopore sequencing
010606 plant biology & botany
Details
- ISSN :
- 2047217X
- Volume :
- 8
- Database :
- OpenAIRE
- Journal :
- GigaScience
- Accession number :
- edsair.doi.dedup.....fd1b1f0e75828a1ed8ba6ce5e3765525
- Full Text :
- https://doi.org/10.1093/gigascience/giz038
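Note on the N50 statistic
The abstract reports contig and scaffold N50 values without defining the metric. As a minimal sketch, assuming the commonly used definition (the length of the sequence at which the running total of lengths, sorted from longest to shortest, first reaches half of the total assembly size), N50 could be computed as follows. The function name `n50` and the toy lengths are illustrative only and are not taken from the paper or its pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumed definition: sort lengths from longest to shortest and return
    the length at which the cumulative sum first reaches at least half
    of the total assembled length.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not data from the assembly):
# total = 54, half = 27; 10 + 9 + 8 = 27, so the N50 is 8.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

A higher N50 for the same total assembly size means that more of the assembly sits in longer sequences, which is why the abstract uses it to compare the Illumina-only and hybrid assemblies.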