De novo assembly of the Indian blue peacock (Pavo cristatus) genome using Oxford Nanopore technology and Illumina sequencing

Authors :
Sunil Singh
Sovon Acharya
Karthikeyan Pethusamy
Vishwajeet Rohil
Sandeep Goswami
Rakesh Singh
Ashikh Seethy
Ruby Dhar
Balaji Rajashekhar
Ankita Raj
Kakali Purkayastha
Tryambak Srivastava
Subhradip Karmakar
Indrani Mukherjee
Source :
GigaScience
Publication Year :
2019
Publisher :
Oxford University Press (OUP), 2019.

Abstract

Background: The Indian peafowl (Pavo cristatus) is native to South Asia and is the national bird of India. Here we present a draft genome sequence of the male blue peacock using Illumina and Oxford Nanopore technology (ONT).

Results: ONT sequencing gave ∼2.3-fold sequencing coverage, whereas Illumina generated 150–base pair paired-end sequence data at 284.6-fold coverage from 5 libraries. Subsequently, we generated a 0.915-gigabase pair de novo assembly of the peacock genome with a scaffold N50 of 0.23 megabase pairs (Mb). We predict that the peacock genome contains 23,153 protein-coding genes and 75.3 Mb (7.33%) of repetitive sequences.

Conclusions: We report a high-quality assembly of the peacock genome using a hybrid approach that combines sequences generated by both Illumina and ONT. The long reads generated by ONT were useful for addressing challenges related to de novo assembly, particularly in regions containing repetitive sequences that span longer than the read length and that could not be resolved with short-read–based assembly alone. Contig assembly of Illumina short reads gave an N50 of 1,639 bases, whereas with ONT, the N50 increased by >9-fold to 14,749 bases. The initial contig assembly based on Illumina sequencing reads alone gave 685,241 contigs. Further scaffolding of the assembled contigs using both Illumina and ONT sequencing reads resulted in a final assembly of 15,025 super-scaffolds, with an N50 of ∼0.23 Mb. Ninety-five percent of proteins predicted by homology matched those in a public repository, verifying the completeness of our assembly. Consistent with other phylogenetic studies of avian conserved genes, we found P. cristatus to be most closely related to Gallus gallus, followed by Meleagris gallopavo and Anas platyrhynchos. Compared with the recently published peacock genome assembly, the current hybrid assembly is superior, with greater sequencing depth, fewer non-ATGC sequences, and fewer scaffolds.
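
For readers unfamiliar with the N50 statistic quoted in the abstract, the following is a minimal sketch of how it is commonly computed. It assumes the usual definition: sort sequence lengths from longest to shortest and report the length at which the running total first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative only; they are not taken from the paper, and this is not the authors' tooling.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumed definition: the length L such that sequences of length >= L
    together cover at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one sequence length is required")

    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length

    # Unreachable for non-empty input with positive lengths.
    raise ValueError("lengths must be positive integers")


if __name__ == "__main__":
    # Hypothetical toy assembly, not data from the paper.
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
    print(n50(toy_lengths))
```

A higher N50 at a similar total assembly size indicates that more of the assembly sits in long sequences, which is why the abstract reports N50 alongside contig and scaffold counts.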

Details

ISSN :
2047-217X
Volume :
8
Database :
OpenAIRE
Journal :
GigaScience
Accession number :
edsair.doi.dedup.....fd1b1f0e75828a1ed8ba6ce5e3765525
Full Text :
https://doi.org/10.1093/gigascience/giz038