Whole genome and exome sequencing reference datasets from a multi-center and cross-platform benchmark study

Authors :
Bao Tran
Erich Jaeger
Sulbha Choudhari
Daniela Gasparotto
Yuliya Kriga
Sulev Kõks
Kenneth Idler
Keyur Talsania
Petr Vojta
Zhong Chen
Charles Wang
Jiri Drabek
Wanqiu Chen
Yuanting Zheng
Daoud Meerzaman
Christopher E. Mason
Roberta Maestro
Leming Shi
Ene Reimann
Tsai-wei Shen
Charles Lu
Jonathan Foox
Xiongfong Chen
Chunlin Xiao
Luyao Ren
Wenming Xiao
Tiffany Hung
Eric Peters
Marc Sultan
Andreas Scherer
Bin Zhu
Yongmei Zhao
Virginie Petitjean
Jyoti Shetty
Huixiao Hong
Jessica Nordlund
Ulrika Liljedahl
Li Tai Fang
Institute for Molecular Medicine Finland
Source :
Scientific Data, Vol 8, Iss 1, Pp 1-14 (2021)
Publication Year :
2021

Abstract

With the rapid advancement of sequencing technologies, next generation sequencing (NGS) analysis has been widely applied in cancer genomics research. More recently, NGS has been adopted in clinical oncology to advance personalized medicine. Clinical applications of precision oncology require accurate tests that can distinguish tumor-specific mutations from artifacts introduced during NGS processes or data analysis. Therefore, there is an urgent need to develop best practices in cancer mutation detection using NGS and to establish standard reference data sets for systematically measuring accuracy and reproducibility across platforms and methods. Within the SEQC2 consortium context, we established paired tumor-normal reference samples and generated whole-genome sequencing (WGS) and whole-exome sequencing (WES) data using sixteen library protocols and seven sequencing platforms at six different centers. We systematically interrogated somatic mutations in the reference samples to identify factors affecting detection reproducibility and accuracy in cancer genomes. These large cross-platform/site WGS and WES datasets using well-characterized reference samples will represent a powerful resource for benchmarking NGS technologies and bioinformatics pipelines, and for cancer genomics studies.

Measurement(s): Somatic Mutation Analysis
Technology Type(s): whole genome sequencing • Whole Exome Sequencing
Factor Type(s): sequencing platform • sample preparation • library preparation • bioinformatics method
Sample Characteristic - Organism: Homo sapiens
Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.16713655

Details

Language :
English
Database :
OpenAIRE
Journal :
Scientific Data, Vol 8, Iss 1, Pp 1-14 (2021)
Accession number :
edsair.doi.dedup.....50da38baf6dea5f68d7766a7d6f0bfc0