Predicting HIV drug resistance using weighted machine learning method at target protein sequence-level.
- Source :
- Molecular diversity [Mol Divers] 2021 Aug; Vol. 25 (3), pp. 1541-1551. Date of Electronic Publication: 2021 Jul 09.
- Publication Year :
- 2021
Abstract
- Acquired immune deficiency syndrome (AIDS) is a fatal disease caused by human immunodeficiency virus (HIV). Although 23 different drugs are available, the treatment of AIDS remains challenging because the virus mutates very quickly, which can lead to drug resistance. Predicting drug resistance before treatment is therefore crucial for individualized treatment. Here, based on HIV target protein sequence information, we analyzed resistance to 21 drugs caused by mutated residues using machine learning (ML) methods. To transform target sequences into numeric vectors, seven physicochemical properties were used, which represent the interacting characteristics of target proteins well. Principal component analysis (PCA) was then used to reduce the feature dimensionality. Random forest (RF) and support vector machine (SVM) models with three different kernel functions (linear, polynomial and radial basis function (RBF)) were employed. Comparison showed that the RBF-based SVM performs comparably to the RF model. We then added weight information to the RBF-based SVM using four weight evaluation methods: RF, eXtreme Gradient Boosting (XGB), CfsSubsetEval and ReliefFAttributeEval. The RF-weighted RBF-based SVM performed best: 13 of the 21 drug models achieve correlation coefficients (R²) above 0.8, and 3 of them exceed 0.9. Finally, position-specific importance analysis indicates that most mutated residues with high RF weight scores are closely related to drug resistance, consistent with previous reports. Overall, we expect this method to serve as a supplementary tool for predicting HIV drug resistance for newly discovered mutations.
Here, based on HIV target protein sequence information, we analyzed resistance to 21 drugs caused by mutated residues using machine learning (ML) methods by fusing the weight information of different mutation positions.
(© 2021. The Author(s), under exclusive licence to Springer Nature Switzerland AG.)
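The abstract does not say how the weight information enters the RBF kernel. One common way to do it, shown here purely as an illustrative assumption and not as the authors' implementation, is to scale each feature's squared difference by its importance weight, giving k(x, y) = exp(-gamma * sum_i w_i (x_i - y_i)^2). The function names, the `gamma` value and the importance scores below are all hypothetical.

```python
import math


def normalize_weights(raw_scores):
    """Scale non-negative importance scores so that they sum to 1."""
    total = sum(raw_scores)
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return [s / total for s in raw_scores]


def weighted_rbf_kernel(x, y, weights, gamma=1.0):
    """RBF kernel with per-feature weights.

    Computes exp(-gamma * sum_i w_i * (x_i - y_i)^2). With all weights
    equal to 1 this reduces to the ordinary RBF kernel.
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    sq_dist = sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights))
    return math.exp(-gamma * sq_dist)


# Hypothetical importance scores for three features (NOT values from the paper).
weights = normalize_weights([6.0, 3.0, 1.0])

a = [0.0, 0.0, 0.0]
b = [1.0, 0.0, 0.0]  # differs from a in the most heavily weighted feature
c = [0.0, 0.0, 1.0]  # differs from a in the least heavily weighted feature

k_ab = weighted_rbf_kernel(a, b, weights)
k_ac = weighted_rbf_kernel(a, c, weights)
```

Under this assumed scheme, a difference in a heavily weighted feature lowers the kernel similarity more than the same difference in a lightly weighted one, so `k_ab` is smaller than `k_ac`. This is the intuition behind letting important mutation positions dominate the model.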
- Subjects :
- Algorithms
Amino Acid Sequence
Databases, Factual
Dose-Response Relationship, Drug
Humans
Mutation
Reproducibility of Results
Support Vector Machine
Viral Proteins genetics
Anti-HIV Agents chemistry
Anti-HIV Agents pharmacology
Drug Resistance, Viral
HIV drug effects
Machine Learning
Models, Theoretical
Viral Proteins chemistry
Details
- Language :
- English
- ISSN :
- 1573-501X
- Volume :
- 25
- Issue :
- 3
- Database :
- MEDLINE
- Journal :
- Molecular diversity
- Publication Type :
- Academic Journal
- Accession number :
- 34241771
- Full Text :
- https://doi.org/10.1007/s11030-021-10262-y