Critical assessment of missense variant effect predictors on disease-relevant variant data.

Authors :
Rastogi R
Chung R
Li S
Li C
Lee K
Woo J
Kim DW
Keum C
Babbi G
Martelli PL
Savojardo C
Casadio R
Chennen K
Weber T
Poch O
Ancien F
Cia G
Pucci F
Raimondi D
Vranken W
Rooman M
Marquet C
Olenyi T
Rost B
Andreoletti G
Kamandula A
Peng Y
Bakolitsa C
Mort M
Cooper DN
Bergquist T
Pejaver V
Liu X
Radivojac P
Brenner SE
Ioannidis NM
Source :
BioRxiv : the preprint server for biology [bioRxiv] 2024 Jun 08. Date of Electronic Publication: 2024 Jun 08.
Publication Year :
2024

Abstract

Regular, systematic, and independent assessment of computational tools used to predict the pathogenicity of missense variants is necessary to evaluate their clinical and research utility and suggest directions for future improvement. Here, as part of the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, we assess missense variant effect predictors (or variant impact predictors) on an evaluation dataset of rare missense variants from disease-relevant databases. Our assessment evaluates predictors submitted to the CAGI6 Annotate-All-Missense challenge, predictors commonly used by the clinical genetics community, and recently developed deep learning methods for variant effect prediction. To explore a variety of settings that are relevant for different clinical and research applications, we assess performance within different subsets of the evaluation data and within high-specificity and high-sensitivity regimes. We find strong performance of many predictors across multiple settings. Meta-predictors tend to outperform their constituent individual predictors; however, several individual predictors have performance similar to that of commonly used meta-predictors. The relative performance of predictors differs in high-specificity and high-sensitivity regimes, suggesting that different methods may be best suited to different use cases. We also characterize two potential sources of bias. Predictors that incorporate allele frequency as a predictive feature tend to have reduced performance when distinguishing pathogenic variants from very rare benign variants, and predictors supervised on pathogenicity labels from curated variant databases often learn label imbalances within genes. 
Overall, we find notable advances over the oldest and most cited missense variant effect predictors and continued improvements among the most recently developed tools, and the CAGI Annotate-All-Missense challenge (also termed the Missense Marathon) will continue to assess state-of-the-art methods as the field progresses. Together, our results help illuminate the current clinical and research utility of missense variant effect predictors and identify potential areas for future development.
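
As a rough illustration of what assessing a predictor within high-specificity and high-sensitivity regimes can mean, the sketch below computes sensitivity at a required minimum specificity, and specificity at a required minimum sensitivity, for a single score-based predictor. It is not the assessment code used in the study: the function names, the toy scores and labels, and the convention that a higher score means more likely pathogenic are all assumptions made for this example.

```python
# Illustrative sketch only; names and toy data are hypothetical.
# Assumption: higher score = more likely pathogenic; label 1 = pathogenic, 0 = benign.

def rates_at_threshold(scores, labels, threshold):
    """Return (sensitivity, specificity) when score >= threshold is called pathogenic."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


def sensitivity_at_specificity(scores, labels, min_specificity):
    """Best sensitivity over all thresholds whose specificity is at least min_specificity."""
    # Every observed score is a candidate threshold; +inf calls nothing pathogenic.
    candidates = sorted(set(scores)) + [float("inf")]
    best = 0.0
    for t in candidates:
        sens, spec = rates_at_threshold(scores, labels, t)
        if spec >= min_specificity:
            best = max(best, sens)
    return best


def specificity_at_sensitivity(scores, labels, min_sensitivity):
    """Best specificity over all thresholds whose sensitivity is at least min_sensitivity."""
    candidates = sorted(set(scores)) + [float("inf")]
    best = 0.0
    for t in candidates:
        sens, spec = rates_at_threshold(scores, labels, t)
        if sens >= min_sensitivity:
            best = max(best, spec)
    return best


# Toy evaluation set: five pathogenic and five benign variants (made-up values).
toy_scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 1, 0, 1, 1, 0, 1, 0, 0, 0]

if __name__ == "__main__":
    print(sensitivity_at_specificity(toy_scores, toy_labels, 1.0))
    print(specificity_at_sensitivity(toy_scores, toy_labels, 1.0))
```

Two predictors can rank differently under these two measures, which is one way the relative performance of methods can differ between regimes.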

Details

Language :
English
Database :
MEDLINE
Journal :
BioRxiv : the preprint server for biology
Publication Type :
Academic Journal
Accession number :
38895200
Full Text :
https://doi.org/10.1101/2024.06.06.597828