1. An Improved Process for Generating Uniform PSSMs and Its Application in Protein Subcellular Localization via Various Global Dimension Reduction Techniques
- Author
Shunfang Wang, Wenjia Li, Yu Fei, Zicheng Cao, Dongshu Xu, and Huanyu Guo
- Subjects
Dimensional reduction, feature expression, linear discriminant analysis, protein subcellular localization, segmented amino acid composition in PSSM, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
This paper proposes an improved protein feature expression, segmented amino acid composition in the position-specific scoring matrix (PSSM-SAA), for subcellular localization prediction. Because the high-dimensional PSSM-SAA vector already carries sufficient local information, four global dimension reduction algorithms are applied: linear discriminant analysis (LDA), median LDA (MDA), generalized Fisher discriminant analysis (GDA), and median-mean line-based discriminant analysis (MMLDA). PSSM-SAA is also compared with three important expressions: PSSM-S, DipCPSSM, and PsePSSM. Numerical experiments measuring the overall success rate (OSR) show that PSSM-SAA performs much better than PSSM-S and DipCPSSM and slightly better than or equal to PsePSSM, regardless of which dimension reduction algorithm is used. After comparing the four dimension reduction techniques, LDA is recommended for PSSM-SAA. Other popular evaluation indexes also confirm the effectiveness of PSSM-SAA with LDA. The suggested model is then compared with state-of-the-art predictors to further evaluate its validity. Finally, a new user-friendly local software tool for implementing PSSM-SAA is provided at https://www.github.com/caozicheng/PSSMSAA-Builder.
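The record does not give the exact definition of PSSM-SAA, so the following is only a minimal sketch of the general idea the name suggests. It assumes the PSSM is an L x 20 matrix, that the sequence is cut into a fixed number of contiguous segments, and that each segment is summarized by the mean of its 20 columns. The function name `segmented_pssm_composition` and the parameter `n_segments` are hypothetical and are not taken from the paper or from PSSMSAA-Builder.

```python
import numpy as np

def segmented_pssm_composition(pssm, n_segments=3):
    """Hypothetical helper: turn an L x 20 PSSM into a fixed-length vector.

    Assumption (not from the paper): each contiguous segment is
    summarized by the column-wise mean of its rows.
    """
    pssm = np.asarray(pssm, dtype=float)
    if pssm.ndim != 2 or pssm.shape[1] != 20:
        raise ValueError("expected an L x 20 PSSM")
    if pssm.shape[0] < n_segments:
        raise ValueError("sequence is shorter than the number of segments")
    # array_split handles lengths that are not divisible by n_segments
    segments = np.array_split(pssm, n_segments, axis=0)
    # One 20-dimensional mean per segment, concatenated into 20 * n_segments values
    return np.concatenate([seg.mean(axis=0) for seg in segments])

# Toy PSSM for a 50-residue sequence (random scores, illustration only)
rng = np.random.default_rng(0)
toy_pssm = rng.integers(-5, 6, size=(50, 20))
features = segmented_pssm_composition(toy_pssm, n_segments=3)
print(features.shape)  # (60,)
```

Every protein then maps to a vector of the same length regardless of sequence length, which is the kind of uniform, high-dimensional input that a global reduction method such as LDA would subsequently compress.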
- Published
- 2019