1. Accurate whole human genome sequencing using reversible terminator chemistry
- Author
Zoya Kingsbury, Marc Laurent, Jason Bryant, Konstantinos D. Diakoumakos, Klaus Maisinger, Louise Fraser, Jean Ernest Sohna Sohna, Adrian Horgan, Patrick Mccauley, Jane Rogers, David W. Elmore, Mark A. Osborne, Juying Yan, Mark Smith, Milan Fedurco, Gary P. Schroth, Belen Dominguez-Fernandez, Heng Li, Andrea Sabot, Suzanne Wakelin, Cindy Lawley, Carole Anastasi, David Klenerman, David George, Daniel P. Pliskin, Mohammed D. Alam, Svilen S. Tzonev, Mark T. Reed, Xiaohai Liu, Asha Boodhun, Lu Zhang, Aylwyn Scally, T. A. Huw Jones, Ugonna C. Egbujor, Tzvetana H. Kerelska, George Stefan Golda, Shankar Balasubramanian, Lukasz Szajkowski, Mitch Lok, Mitch K. Shiver, Paul McNitt, Simon Chang, Maria Q. Johnson, Gyoung-Dong Kang, Victor J. Quijano, Sarah E. Lee, Mike Zuerlein, Maria Candelaria Rogert Bacigalupo, Alan D. Kersey, Selena G. Barbour, Dirk J. Evers, Andrew C. Pike, Stephen Rawlings, Karin Fuentes Fajardo, Mirian S. Karbelashvili, Matthew E. Hurles, Sonia M. Novo, Xavier Lee, James C. Burrows, John Stephen West, Jingwen Wang, Ify C. Aniebo, Natasha R. Crake, Christian D. Haudenschild, Richard Shaw, Come Raczy, W. Scott Furey, Wu Xiaolin, Lambros L. Paraschos, Josefina M. Seoane, John W. Martin, Katya Hoschler, Raquel Maria Sanches-Kuiper, Nick J. McCooke, Colin Barnes, Johannes P. Sluis, Abass A. Bundu, John Milton, R. Keira Cheetham, Nancy F. Hansen, Clive Gavin Brown, Nigel P. Carter, Richard J. Carter, Chiara Rodighiero, Kim B. Stevens, Shujun Luo, Radhika M. Mammen, Phyllida M. Roe, Melanie Anne Smith, Bojan Obradovic, Johnny T. Ho, Jennifer A. Loch, Terena James, Harold Swerdlow, Dale Buermann, David E. Green, Steve Hurwitz, Joe W. Mullens, Ning Sizto, Frank L. Oaks, Eli Rusman, Natalie J. Rourke, Nikolai Romanov, Anthony J. Smith, Claire Bevis, Selene M. Virk, Ling Yau, Yuli Verhovsky, D. Chris Pinkard, Stephanie Vandevondele, Vincent Peter Smith, Rob C. Brown, Eric J. 
Spence, Joe Podhasky, Ana Chiva Rodriguez, Michael Lawrence Parkinson, Anthony Romieu, Joe S. Brennan, Rithy K. Roth, David Mark Dunstan Bailey, Roberto Rigatti, Anil Kumar, Phillip J. Black, Primo Baybayan, Saibal Banerjee, Matthew M. Hims, Arnold Liao, R. Neil Cooley, Omead Ostadan, Vincent A. Benoit, Andrew A. Brown, Silke Ruediger, Leslie J. Irving, Parul Mehta, James C. Mullikin, Klaudia Walter, John Rogers, Jonathan Mark Boutell, Alex P. Kindwall, Paula Kokko-Gonzales, Alger C. Pike, Michael J. O'Neill, Eric Vermaas, Subramanian V. Sankar, Sean Humphray, Steven W. Short, Gerardo Turcatti, Helen Bignell, Kimberley J. Gietzen, Peta E. Torrance, Narinder I. Heyer, David James Earnshaw, Kevin Hall, Martin R. Schenker, Richard Durbin, Philip A. Granieri, Tobias William Barr Ost, Iain R. Bancarz, Lea Pickering, David L. Gustafson, Peter Lundberg, Niall Anthony Gormley, John Bridgham, Andrew Osnowski, Scott M. Kirk, Mark R. Ewan, Keith W. Moon, Bee Ling Ng, Graham John Worsley, Anthony J. Cox, Olubunmi O. Dada, Gregory C. Walcott, Sergey Etchin, Irina Khrebtukova, Kevin Benson, Vicki H. Rae, Zemin Ning, Carolyn Tregidgo, Nestor Castillo, Colin P. Goddard, Taksina Newington, Denis V. Ivanov, Anastassia Spiridou, Maria Chiara E. Catenazzi, Neil Sutton, Kevin Harnish, Darren James Ellis, Lisa Murray, Geoffrey Paul Smith, Mark T. Ross, David R. Bentley, M. R. Pratt, Isabelle Rasolonjatovo, and Michael R. Flatbush
- Subjects
Male; Genotype; 2 base encoding; Nigeria; Sequence assembly; Hybrid genome assembly; Genomics; Computational biology; Biology; Polymorphism, Single Nucleotide; Sensitivity and Specificity; Deep sequencing; Article; 03 medical and health sciences; 0302 clinical medicine; Consensus Sequence; Humans; Paired-end tag; 030304 developmental biology; Genetics; Whole genome sequencing; Chromosomes, Human, X; 0303 health sciences; Multidisciplinary; Genome, Human; DNA sequencing theory; Sequence Analysis, DNA; 030220 oncology & carcinogenesis
- Abstract
DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally used long (400-800 base pair) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intraspecies genetic variation. Here we report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high-quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterize four million single-nucleotide polymorphisms and four hundred thousand structural variants, many of which were previously unknown. Our approach is effective for accurate, rapid and economical whole-genome re-sequencing and many other biomedical applications.