DEFLATE Compression Algorithm Corrects for Overestimation of Phylogenetic Diversity by Grantham Approach to Single-Nucleotide Polymorphism Classification.

Authors :
Schlosberg, Arran
Lam, Brian Y. H.
Yeo, Giles S. H.
Clifton-Bligh, Roderick J.
Source :
International Journal of Molecular Sciences; May 2014, Vol. 15, Issue 5, pp. 8491-8508 (18 pages)
Publication Year :
2014

Abstract

Improvements in speed and cost of genome sequencing are resulting in increasing numbers of novel non-synonymous single nucleotide polymorphisms (nsSNPs) in genes known to be associated with disease. The large number of nsSNPs makes laboratory-based classification infeasible and familial co-segregation with disease is not always possible. In-silico methods for classification or triage are thus utilised. A popular tool based on multiple-species sequence alignments (MSAs) and work by Grantham, Align-GVGD, has been shown to underestimate deleterious effects, particularly as sequence numbers increase. We utilised the DEFLATE compression algorithm to account for expected variation across a number of species. With the adjusted Grantham measure we derived a means of quantitatively clustering known neutral and deleterious nsSNPs from the same gene; this was then used to assign novel variants to the most appropriate cluster as a means of binary classification. Scaling of clusters allows for inter-gene comparison of variants through a single pathogenicity score. The approach improves upon the classification accuracy of Align-GVGD while correcting for sensitivity to large MSAs. Open-source code and a web server are made available at https://github.com/aschlosberg/CompressGV. [ABSTRACT FROM AUTHOR]
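
The abstract does not spell out how DEFLATE enters the adjusted Grantham measure, so the following is only a minimal sketch of the general idea that compressed size can serve as a proxy for how much variation an alignment column contains. The function name `deflate_size` and the two example columns are hypothetical and are not taken from the paper or from the CompressGV code.

```python
import zlib


def deflate_size(column: str) -> int:
    """Return the size in bytes of a raw DEFLATE stream for one alignment column."""
    # wbits=-15 requests a raw DEFLATE stream without the zlib header or checksum.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    data = column.encode("ascii")
    return len(compressor.compress(data) + compressor.flush())


# Hypothetical alignment columns: one residue per species (60 species each).
conserved = "A" * 60                    # identical residue in every species
variable = "ACDEFGHIKLMNPQRSTVWY" * 3   # many different residues

if __name__ == "__main__":
    print("conserved:", deflate_size(conserved))
    print("variable: ", deflate_size(variable))
```

A conserved column compresses to fewer bytes than a variable one, which is the property a compression-based correction could exploit. How the authors turn such a quantity into an adjusted score is described in the full text, not here.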

Details

Language :
English
ISSN :
1661-6596
Volume :
15
Issue :
5
Database :
Complementary Index
Journal :
International Journal of Molecular Sciences
Publication Type :
Academic Journal
Accession number :
96249994
Full Text :
https://doi.org/10.3390/ijms15058491