1. A new method to accurately identify single nucleotide variants using small FFPE breast samples
- Author
- Angelo Fortunato, E. Shelley Hwang, Allison Hall, Diego Mallo, Joseph Y. Lo, Jeffrey R. Marks, Lorraine M. King, Carlo C. Maley, Timothy Hardman, and Shawn M. Rupp
- Subjects
AcademicSubjects/SCI01060; DCIS; Breast Neoplasms; Computational biology; Biology; Genome; Polymorphism, Single Nucleotide; DNA sequencing; Workflow; chemistry.chemical_compound; Genetic Heterogeneity; 03 medical and health sciences; Breast cancer; 0302 clinical medicine; medicine; Breast ductal carcinoma; Biomarkers, Tumor; Humans; Sign test; Nucleotide; Genetic Testing; Indel; Molecular Biology; Exome; Exome sequencing; 030304 developmental biology; chemistry.chemical_classification; 0303 health sciences; Histopathological analysis; Cancer; Computational Biology; High-Throughput Nucleotide Sequencing; DNA, Neoplasm; Ductal carcinoma; medicine.disease; Invasive ductal carcinoma; chemistry; NGS; 030220 oncology & carcinogenesis; Mutation; Problem Solving Protocol; Female; heterogeneity; DNA; exome; Information Systems
- Abstract
Most tissue collections of neoplasms are composed of formalin-fixed and paraffin-embedded (FFPE) excised tumor samples used for routine diagnostics. DNA sequencing is becoming increasingly important in cancer research and clinical management; however, it is difficult to accurately sequence DNA from FFPE samples. We developed and validated a new bioinformatic algorithm to robustly identify somatic single nucleotide variants (SNVs) from whole exome sequencing using small amounts of DNA extracted from archival FFPE samples of breast cancers. We optimized this strategy using 28 pairs of technical replicates. After optimization, the mean similarity between replicates increased 5-fold, reaching 88% (range 0-100%), with a mean of 21.4 SNVs (range 1-68) per sample, a markedly superior performance compared with existing algorithms. We found that SNV-identification accuracy declined when less than 40 ng of DNA was available and that insertion-deletion variant calls are less reliable than single base substitutions. As the first application of the new algorithm, we compared samples of ductal carcinoma in situ (DCIS) of the breast to their adjacent invasive ductal carcinoma (IDC) samples. We observed an increased number of mutations (paired-samples sign test, p [...]

Key Points
  - The sequencing of reduced quantities of DNA extracted from FFPE samples leads to substantial sequencing errors that require correction in order to obtain accurate detection of somatic mutations.
  - We developed and validated a new bioinformatic algorithm to robustly identify somatic single nucleotide variants using small amounts of DNA extracted from archival FFPE samples of breast cancers.
  - Variant calling software packages need to be optimized to reduce the impact of sequencing errors. Our bioinformatics pipeline represents a significant methodological advance compared to the currently available bioinformatic tools used for the analysis of small FFPE samples.
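
The abstract names a paired-samples sign test for the DCIS versus IDC comparison without showing how such a test works. The following is a minimal sketch of an exact two-sided sign test, not the authors' implementation. The function name `paired_sign_test` and the per-patient SNV counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
from math import comb


def paired_sign_test(first, second):
    """Exact two-sided sign test on paired observations.

    Pairs with equal values (ties) are dropped, which is the usual
    convention for the sign test. Returns the p-value.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")

    # Keep only the non-zero differences (second minus first).
    diffs = [b - a for a, b in zip(first, second) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0  # no informative pairs, so no evidence of a difference

    n_pos = sum(d > 0 for d in diffs)
    k = min(n_pos, n - n_pos)

    # Under the null hypothesis each sign is a fair coin flip, so the
    # number of positive signs follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical SNV counts per patient (illustrative only, not study data).
dcis_counts = [12, 8, 20, 15, 9, 30, 11, 7]
idc_counts = [18, 8, 25, 14, 16, 41, 19, 12]

p_value = paired_sign_test(dcis_counts, idc_counts)
print(f"two-sided sign test p = {p_value:.4f}")
```

The sign test uses only the direction of each paired difference, not its magnitude, so it makes no assumption about how mutation counts are distributed. That can be a reasonable choice when the number of pairs is small.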
- Published
- 2021