1. High-throughput long paired-end sequencing of a Fosmid library by PacBio
- Author
- Zhaozhao Dai, Jiadong Li, Sha Tang, Xianmin Diao, Meizhong Luo, Tong Li, Zhifei Han, and Yonglong Pan
- Subjects
0301 basic medicine, Sequence assembly, Plant Science, Computational biology, Ampicillin resistance gene tag, Biology, lcsh:Plant culture, Genome, Insert (molecular biology), 03 medical and health sciences, 0302 clinical medicine, Cloning Site, Genetics, De novo assembly, Assembly error, lcsh:SB1-1110, Replicon, lcsh:QH301-705.5, Paired-end tag, Long paired-end, Segmental duplication, PacBio, Methodology, Mate-pair, Fosmid, 030104 developmental biology, lcsh:Biology (General), 030220 oncology & carcinogenesis, Structural rearrangement, Biotechnology
- Abstract
Background Large-insert paired-end sequencing technologies are important tools for assembling genomes, delineating associated breakpoints and detecting structural rearrangements. To facilitate the comprehensive detection of inter- and intra-chromosomal structural rearrangements or variants (SVs) and the assembly of complex genomes with long repeats and segmental duplications, we developed a new method based on single-molecule real-time synthesis sequencing technology for generating long paired-end sequences of large-insert DNA libraries.
Results A Fosmid vector, pHZAUFOS3, was developed with the following new features: (1) two 18-bp non-palindromic I-SceI sites flank the cloning site, and another two sites are present in the vector backbone, so that I-SceI digestion releases long DNA inserts (and the long paired-ends in this paper) as single fragments while cutting the vector (~ 8 kb) into 2–3 kb fragments, which are then effectively removed from the long paired-ends (5–10 kb); (2) the chloramphenicol (Cm) resistance gene and replicon (oriV), necessary for colony growth, are located near the two sides of the cloning site, helping to increase the ratio of paired-end fragments to single-end fragments in the paired-end libraries. Paired-end libraries were constructed by ligating the size-selected, mechanically sheared pooled Fosmid DNA fragments to the ampicillin (Amp) resistance gene fragment and screening the colonies with Cm and Amp. We tested this method on yeast and Setaria italica Yugu1. Fosmid-size paired-ends with an average length longer than 2 kb for each end were generated. The N50 scaffold lengths of the de novo assemblies of the yeast and S. italica Yugu1 genomes were significantly improved. Five large and five small structural rearrangements or assembly errors, spanning tens of bp to tens of kb, were identified in S. italica Yugu1, including deletions, inversions, duplications and translocations.
Conclusions We developed a new method for long paired-end sequencing of large insert libraries, which can efficiently improve the quality of de novo genome assembly and identify large and small structural rearrangements or assembly errors.
- Published
- 2019