1. Variant Classification Discordance: Contributing Factors and Predictive Models.
- Author
- Ghaedi H, Davey SK, and Feilotter H
- Subjects
- Humans, Gene Frequency, Alleles, Laboratories, Genetic Variation, Databases, Genetic
- Abstract
An ever-growing catalog of human variants is hosted in the ClinVar database. In this database, submissions on a variant are combined into a multisubmitter record, and in the case of discordance in variant classification between submitters, the record is labeled as conflicting. The current study used ClinVar data to identify characteristics that would make variants more likely to be associated with the conflict class of variants. Furthermore, the Extreme Gradient Boosting algorithm was used to train classifier models to predict classification discordance for single-submission variants in the ClinVar database. Population allele frequency, the gene harboring the variant, variant type, consequence on protein, variant deleteriousness score, first submitter identity, and submission count were associated with conflict in variant classification. Using such features, the optimized classifier showed accuracy on the test set of 88%, with weighted averages of precision, recall, and f1-score of 0.84, 0.88, and 0.85, respectively. There were pronounced associations between variant classification discordance and allele frequency, gene type, and the identity of the first submitter. The study provides the predicted discordance status for single-submitter variants deposited in ClinVar. This approach can be used to assess whether single-submitter variants are likely to be supported by, or in conflict with, future entries; this knowledge may help laboratories with clinical variant assessment. Competing Interests: Disclosure Statement: None declared. Copyright © 2024 Association for Molecular Pathology and American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.
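The abstract reports accuracy alongside weighted-average precision, recall, and f1-score. The following is a minimal sketch of how such support-weighted averages are typically computed, and why the weighted recall coincides with accuracy (88% and 0.88 above). The function name `weighted_metrics` and the toy labels are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Hypothetical helper for illustration; not taken from the study.
    """
    n = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    precision = recall = f1 = 0.0
    for label, count in support.items():
        # True positives and total predictions for this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        # Each class contributes in proportion to its share of true labels.
        weight = count / n
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 1 = conflicting, 0 = not conflicting (assumed encoding).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

Because each class's recall is weighted by its share of the true labels, the weighted recall reduces to the fraction of correctly classified examples, which is the accuracy. Weighted precision and f1-score generally differ from it, as in the reported 0.84 and 0.85.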
- Published
- 2024