iDBP-PBMD: A machine learning model for detection of DNA-binding proteins by extending compression techniques into evolutionary profile.

Authors :
Banjar, Ameen
Ali, Farman
Alghushairy, Omar
Daud, Ali
Source :
Chemometrics & Intelligent Laboratory Systems. Dec 2022, Vol. 231.
Publication Year :
2022

Abstract

DNA-binding proteins (DBPs) perform many crucial functions, including DNA transcription, recombination, and replication. DBPs are strongly associated with AIDS/HIV, cancer, and asthma, while other DBPs serve as active ingredients in the design of anti-inflammatory drugs, steroids, and antibiotics. Many approaches have been introduced for DBP detection, but a predictor with high precision is still highly desirable. This research encodes different views of the features using several PSSM-based feature descriptors: EDF-PSSM (evolutionary difference formula-based position-specific scoring matrix), FPSSM (filtered position-specific scoring matrix), PSSM-MAB (position-specific scoring matrix-based multiple average blocks), KSB-PSSM (k-separated bigram position-specific scoring matrix), PSSM-AAC (position-specific scoring matrix-amino acid composition), RPSSM (reduced position-specific scoring matrix), and PSSM-DPC (position-specific scoring matrix-dipeptide composition). Noisy and redundant features were eliminated by two compression techniques, DWT (discrete wavelet transform) and DCT (discrete cosine transform). The resulting features were concatenated and provided to XGBoost (eXtreme Gradient Boosting) and ERT (Extremely Randomized Trees) classifiers for model training. The XGBoost-based model (iDBP-PBMD) achieved the highest accuracy, with improvements of 2.72% and 1.24% on the training and testing datasets, respectively. The results confirmed the superiority of the current study over previous approaches.

• Designed a novel predictor, iDBP-PBMD, for discrimination of DNA-binding proteins.
• Local and global discriminative information was extracted by different PSSM-based feature methods.
• XGBoost was used for model training and prediction.
• The novel predictor achieved the highest success rate.

[ABSTRACT FROM AUTHOR]
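The feature-encoding stage described in the abstract (PSSM-based descriptors, DCT/DWT compression, concatenation) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the PSSM is already available as an L x 20 NumPy array, uses the commonly cited definitions of PSSM-AAC and PSSM-DPC only (the other five descriptors are omitted), and applies the compression to the concatenated descriptor vector, which the abstract does not specify. The function names, the Haar wavelet, and the `dct_keep` and `dwt_levels` parameters are hypothetical choices made for illustration.

```python
import numpy as np


def pssm_aac(pssm):
    """PSSM-AAC (common definition): mean of each of the 20 PSSM columns."""
    return pssm.mean(axis=0)


def pssm_dpc(pssm):
    """PSSM-DPC (common definition): for every pair of columns (i, j), the
    mean product of scores at consecutive positions k and k+1 (400 values)."""
    n_pairs = pssm.shape[0] - 1
    return (pssm[:-1].T @ pssm[1:] / n_pairs).ravel()


def dct_compress(x, keep):
    """Orthonormal DCT-II of a 1-D vector; keep the first `keep`
    (low-frequency) coefficients and discard the rest."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return ((scale[:, None] * basis) @ x)[:keep]


def haar_dwt_approx(x, levels):
    """Approximation coefficients of a multi-level Haar DWT.
    Assumption: odd-length inputs are padded by repeating the last value."""
    a = np.asarray(x, dtype=float)
    for _ in range(levels):
        if a.size % 2:
            a = np.append(a, a[-1])
        a = (a[0::2] + a[1::2]) / np.sqrt(2.0)
    return a


def encode(pssm, dct_keep=100, dwt_levels=2):
    """Concatenate the descriptors, compress with DCT and DWT, and
    concatenate the two compressed views into one feature vector."""
    raw = np.concatenate([pssm_aac(pssm), pssm_dpc(pssm)])  # 20 + 400 values
    return np.concatenate(
        [dct_compress(raw, dct_keep), haar_dwt_approx(raw, dwt_levels)]
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    toy_pssm = rng.normal(size=(50, 20))  # synthetic stand-in for a real PSSM
    print(encode(toy_pssm).shape)         # (205,)
```

Vectors produced this way, one per protein, would then be stacked into a matrix and passed to the classifiers named in the abstract, for example `xgboost.XGBClassifier` or scikit-learn's `ExtraTreesClassifier`. The classifier settings used in the study are not given in this record.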

Details

Language :
English
ISSN :
0169-7439
Volume :
231
Database :
Academic Search Index
Journal :
Chemometrics & Intelligent Laboratory Systems
Publication Type :
Academic Journal
Accession number :
160581529
Full Text :
https://doi.org/10.1016/j.chemolab.2022.104697