iDBP-PBMD: A machine learning model for detection of DNA-binding proteins by extending compression techniques into evolutionary profile.
- Source :
- Chemometrics & Intelligent Laboratory Systems. Dec 2022, Vol. 231, pN.PAG-N.PAG. 1p.
- Publication Year :
- 2022
Abstract
- DNA-binding proteins (DBPs) perform many crucial functions, such as DNA transcription, recombination, and replication. DBPs are strongly associated with AIDS/HIV, cancer, and asthma, while other DBPs serve as active ingredients in the design of anti-inflammatory drugs, steroids, and antibiotics. Many approaches have been introduced for DBP detection, but a predictor with high precision is still highly desirable. This research encodes different views of features with several PSSM-based feature descriptors: EDF-PSSM (evolutionary difference formula-based position-specific scoring matrix), FPSSM (filtered position-specific scoring matrix), PSSM-MAB (position-specific scoring matrix-based multiple average blocks), KSB-PSSM (k-separated bigram position-specific scoring matrix), PSSM-AAC (position-specific scoring matrix-amino acid composition), RPSSM (reduced position-specific scoring matrix), and PSSM-DPC (position-specific scoring matrix-dipeptide composition). Noisy and redundant features were eliminated by two compression techniques, DWT (discrete wavelet transform) and DCT (discrete cosine transform). The resulting features were concatenated and provided to XGBoost (eXtreme Gradient Boosting) and ERT (Extremely Randomized Trees) classifiers for training the model. The XGBoost-based model (iDBP-PBMD) achieved the highest accuracies, exceeding previous approaches by 2.72% and 1.24% on the training and testing datasets, respectively. The results confirmed the superiority of the current study over previous approaches.
  • Designed a novel predictor, iDBP-PBMD, for discrimination of DNA-binding proteins.
  • Local and global discriminative information was captured by different PSSM-based feature methods.
  • XGBoost was used for model training and prediction.
  • The novel predictor achieved the highest success rate.
  [ABSTRACT FROM AUTHOR]
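The abstract names its feature descriptors and compression techniques but gives no formulas. The following is a minimal sketch of two of those steps, assuming the commonly used definitions: PSSM-AAC as the per-column mean of the PSSM, PSSM-DPC as the mean product of scores at consecutive positions, compression by keeping the leading DCT-II coefficients, and a single-level Haar transform as the simplest DWT. The function names, the Haar wavelet, the `keep` parameter, and the toy 4-column matrix (a real PSSM has 20 columns) are illustrative assumptions, not details taken from the paper. The classifier stage is omitted.

```python
import math


def pssm_aac(pssm):
    """PSSM-AAC (assumed definition): mean of each PSSM column over the sequence."""
    length = len(pssm)
    width = len(pssm[0])
    return [sum(row[j] for row in pssm) / length for j in range(width)]


def pssm_dpc(pssm):
    """PSSM-DPC (assumed definition): for every column pair (a, b), the mean of
    pssm[i][a] * pssm[i + 1][b] over consecutive positions i."""
    length = len(pssm)
    width = len(pssm[0])
    return [
        sum(pssm[i][a] * pssm[i + 1][b] for i in range(length - 1)) / (length - 1)
        for a in range(width)
        for b in range(width)
    ]


def dct2(signal):
    """Unnormalized DCT-II of a 1-D list of numbers."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(signal))
        for k in range(n)
    ]


def haar_dwt(signal):
    """Single-level Haar DWT; returns (approximation, detail) coefficients.
    Assumes an even-length signal."""
    root2 = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    approx = [(a + b) / root2 for a, b in pairs]
    detail = [(a - b) / root2 for a, b in pairs]
    return approx, detail


def compress_with_dct(features, keep):
    """Keep only the first `keep` (low-frequency) DCT coefficients."""
    return dct2(features)[:keep]


# Toy PSSM: 3 residues x 4 columns (hypothetical values for illustration only).
toy_pssm = [
    [1, 0, 2, -1],
    [3, 2, 0, 1],
    [2, 4, 1, 0],
]

# Concatenate the two descriptors, then compress each way.
features = pssm_aac(toy_pssm) + pssm_dpc(toy_pssm)
dct_features = compress_with_dct(features, keep=8)
dwt_features, _ = haar_dwt(features)
```

In a full pipeline, the compressed vectors from each descriptor would be concatenated and passed to a classifier such as XGBoost or Extremely Randomized Trees, as the abstract describes.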
Details
- Language :
- English
- ISSN :
- 0169-7439
- Volume :
- 231
- Database :
- Academic Search Index
- Journal :
- Chemometrics & Intelligent Laboratory Systems
- Publication Type :
- Academic Journal
- Accession number :
- 160581529
- Full Text :
- https://doi.org/10.1016/j.chemolab.2022.104697