Using off-target data from whole-exome sequencing to improve genotyping accuracy, association analysis and polygenic risk prediction

Authors :
Dermot F. Reilly
Minghui Jiang
Shanshan Cheng
Xueling Sim
Xiaoran Chai
Chaolong Wang
E. Shyong Tai
Jinzhuang Dou
Degang Wu
Jianjun Liu
Kai Wang
Lin Ding
Source :
Briefings in bioinformatics. 22(3)
Publication Year :
2020

Abstract

Whole-exome sequencing (WES) has been widely used to study the role of protein-coding variants in genetic diseases. Non-coding regions, typically covered by sparse off-target data, are often discarded by conventional WES analyses. Here, we develop a genotype calling pipeline named WEScall to analyse both target and off-target data. We leverage linkage disequilibrium shared within study samples and from an external reference panel to improve genotyping accuracy. In an application to WES of 2527 Chinese and Malays, WEScall reduces the genotype discordance rate from 0.26% (SE = 6.4 × 10⁻⁶) to 0.08% (SE = 3.6 × 10⁻⁶) across 1.1 million single nucleotide polymorphisms (SNPs) in the deeply sequenced target regions. Furthermore, we obtain genotypes at a 0.70% (SE = 3.0 × 10⁻⁶) discordance rate across 5.2 million off-target SNPs, which had ~1.2× mean sequencing depth. Using this dataset, we perform genome-wide association studies of 10 metabolic traits. Despite our small sample size, we identify 10 loci at genome-wide significance (P

Details

ISSN :
1477-4054
Volume :
22
Issue :
3
Database :
OpenAIRE
Journal :
Briefings in bioinformatics
Accession number :
edsair.doi.dedup.....e0b1cf672538898066f2f7cff8da665a