Haplotype-resolved assemblies and variant benchmark of a Chinese Quartet

Authors :
Peng Jia
Lianhua Dong
Xiaofei Yang
Bo Wang
Tingjie Wang
Jiadong Lin
Songbo Wang
Xixi Zhao
Tun Xu
Yizhuo Che
Ningxin Dang
Luyao Ren
Yujing Zhang
Xia Wang
Fan Liang
Yang Wang
Jue Ruan
Yuanting Zheng
Leming Shi
Jing Wang
Kai Ye
Publication Year :
2022
Publisher :
Research Square Platform LLC, 2022.

Abstract

As state-of-the-art sequencing technologies and computational methods enable investigation of challenging regions of the human genome, an updated variant benchmark is needed. Herein, we sequenced a Chinese Quartet, consisting of two monozygotic twin daughters and their biological parents, on multiple advanced sequencing platforms, including Illumina, BGI, PacBio, and Oxford Nanopore Technologies. We phased the long reads of the monozygotic twin daughters into paternal and maternal haplotypes using the parent-child genetic map. For each haplotype, we used the long reads to generate haplotype-resolved assemblies (HRAs) with high accuracy, completeness, and continuity. Based on the quartet samples, novel computational methods, high-quality sequencing reads, and HRAs, we established a comprehensive variant benchmark, including 3,883,283 SNVs, 859,256 indels, 9,678 large deletions, 15,324 large insertions, 40 inversions, and 31 complex structural variants shared between the monozygotic twin daughters. In particular, previously excluded regions, such as repeat regions and the human leukocyte antigen (HLA) region, were systematically examined. Finally, we illustrated how sequencing depth correlates with de novo assembly and variant detection, from which we learned that 30× HiFi coverage balances performance and cost. In summary, this study provides high-quality haplotype-resolved assemblies and a variant benchmark for two Chinese monozygotic twin samples. The benchmark expands the regions covered by the previous report and is adapted to evolving sequencing technologies and computational methods.

Details

Database :
OpenAIRE
Accession number :
edsair.doi...........3e4f1ab499d2a670feca67ffef8a3d74