A comparison of tools for copy-number variation detection in germline whole exome and whole genome sequencing data

Authors :
Migle Gabrielaite
Mathias Husted Torp
Malthe Sebro Rasmussen
Sergio Andreu-Sánchez
Filipe Garrett Vieira
Christina Bligaard Pedersen
Savvas Kinalis
Majbritt Busk Madsen
Miyako Kodama
Gül Sude Demircan
Arman Simonyan
Christina Westmose Yde
Lars Rønn Olsen
Rasmus L. Marvig
Olga Østrup
Maria Rossing
Finn Cilius Nielsen
Ole Winther
Frederik Otzen Bagger
Source :
Gabrielaite, M, Torp, M H, Rasmussen, M S, Andreu-Sánchez, S, Vieira, F G, Pedersen, C B, Kinalis, S, Madsen, M B, Kodama, M, Demircan, G S, Simonyan, A, Yde, C W, Olsen, L R, Marvig, R L, Østrup, O, Rossing, M, Nielsen, F C, Winther, O & Bagger, F O 2021, 'A comparison of tools for copy-number variation detection in germline whole exome and whole genome sequencing data', Cancers, vol. 13, no. 24, 6283. https://doi.org/10.3390/cancers13246283
Publication Year :
2021

Abstract

Simple Summary: Copy-number variations (CNVs) have important clinical implications for several diseases and cancers. We reviewed 50 popular CNV calling tools and included 11 tools for benchmarking in a reference cohort encompassing 39 whole genome sequencing (WGS) samples paired with the current clinical standard, SNP-array-based CNV calling. For nine samples we also performed whole exome sequencing (WES) to address the effect of sequencing protocol on CNV calling. Furthermore, we included the gold-standard reference sample NA12878 and tested 12 samples with CNVs confirmed by multiplex ligation-dependent probe amplification (MLPA). Tool performance varied greatly in the number of called CNVs and in bias for CNV lengths. Some tools had near-perfect recall of CNVs from arrays for some samples, but poor precision. We suggest combining the best-performing tools that are also based on different methodologies: GATK gCNV, Lumpy, DELLY, and cn.MOPS.

Abstract: Copy-number variations (CNVs) have important clinical implications for several diseases and cancers. Relevant CNVs are hard to detect because common structural variations define large parts of the human genome. CNV calling from short-read sequencing would allow single-protocol full genomic profiling. We reviewed 50 popular CNV calling tools and included 11 tools for benchmarking in a reference cohort encompassing 39 whole genome sequencing (WGS) samples paired with the current clinical standard, SNP-array-based CNV calling. Additionally, for nine samples we performed whole exome sequencing (WES) to address the effect of sequencing protocol on CNV calling. Furthermore, we included the gold-standard reference sample NA12878 and tested 12 samples with CNVs confirmed by multiplex ligation-dependent probe amplification (MLPA). Tool performance varied greatly in the number of called CNVs and in bias for CNV lengths. Some tools had near-perfect recall of CNVs from arrays for some samples, but poor precision. Several tools had better performance for NA12878, which could be a result of overfitting. We suggest combining the best-performing tools that are also based on different methodologies: GATK gCNV, Lumpy, DELLY, and cn.MOPS. Reducing the total number of called variants could potentially be assisted by the use of background panels for filtering of frequently called variants.

Details

Language :
English
Database :
OpenAIRE
Journal :
Cancers
Accession number :
edsair.doi.dedup.....6dffc695f2124c4f9d087c5f60bf12ca