Nanopore sequencing and assembly of a human genome with ultra-long reads

Authors :
Jain, M.
Koren, S.
Miga, K.H.
Quick, J.
Rand, A.C.
Sasani, T.A.
Tyson, J.R.
Beggs, A.D.
Dilthey, A.T.
Fiddes, I.T.
Malla, S.
Marriott, H.
Nieto, T.
O'Grady, J.
Olsen, H.E.
Pedersen, B.S.
Rhie, A.
Richardson, H.
Quinlan, A.R.
Snutch, T.P.
Tee, L.
Paten, B.
Phillippy, A.M.
Simpson, J.T.
Loman, N.J.
Loose, M.

Abstract

We report the sequencing and assembly of a reference genome for the human GM12878 Utah/CEPH cell line using the MinION (Oxford Nanopore Technologies) nanopore sequencer. 91.2 Gb of sequence data, representing ~30× theoretical coverage, were produced. Reference-based alignment enabled detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contiguous assembly (NG50 ~3 Mb). Next, we developed a protocol to generate ultra-long reads (N50 > 100 kb, up to 882 kb). Incorporating an additional 5× coverage of these data more than doubled the assembly contiguity (NG50 ~6.4 Mb). The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled assembly and phasing of the 4 Mb major histocompatibility complex (MHC) locus in its entirety, measurement of telomere repeat length and closure of gaps in the reference human genome assembly GRCh38.

Details

Database :
OAIster
Notes :
doi:10.1038/nbt.4060
Publication Type :
Electronic Resource
Accession number :
edsoai.on1312891250
Document Type :
Electronic Resource
Full Text :
https://doi.org/10.1038/nbt.4060