1. SMALF: miRNA-disease associations prediction based on stacked autoencoder and XGBoost
- Author
Jiaxuan Zhang, Wenjuan Nie, Dayun Liu, Lei Deng, and Yibiao Huang
- Subjects
Computer science, QH301-705.5, Feature vector, Computer applications to medicine. Medical informatics, R858-859.7, Breast Neoplasms, Disease, Computational biology, Latent feature, Biochemistry, 03 medical and health sciences, 0302 clinical medicine, Molecular level, Semantic similarity, Structural Biology, microRNA, Humans, Biology (General), miRNA-disease associations, Molecular Biology, 030304 developmental biology, 0303 health sciences, Applied Mathematics, Research, Computational Biology, Autoencoder, Stacked autoencoder, Computer Science Applications, MicroRNAs, Feature (computer vision), 030220 oncology & carcinogenesis, DNA microarray, Algorithms, XGBoost
- Abstract
Background: Identifying miRNA-disease associations helps us understand disease mechanisms at the molecular level. However, identifying them through biological experiments alone is usually undirected, time-consuming, and small in scale. Hence, developing computational methods to predict unknown miRNA-disease associations is becoming increasingly important.
Results: In this work, we develop a computational framework called SMALF to predict unknown miRNA-disease associations. SMALF first uses a stacked autoencoder to learn miRNA latent features and disease latent features from the original miRNA-disease association matrix. SMALF then builds a feature vector representing each miRNA-disease pair by integrating miRNA functional similarity, miRNA latent features, disease semantic similarity, and disease latent features. Finally, XGBoost is used to predict unknown miRNA-disease associations. We perform cross-validation experiments; compared with other state-of-the-art methods, SMALF achieved the best AUC value. We also conduct three case studies, on hepatocellular carcinoma, colon cancer, and breast cancer. The results show that 10, 10, and 9 of the top ten predicted miRNAs, respectively, are verified in MNDR v3.0 or miRCancer.
Conclusion: The comprehensive experimental results demonstrate that SMALF is effective in identifying unknown miRNA-disease associations.
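The feature-integration step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the four components are combined, so plain concatenation is an assumption here, as are the function name `pair_feature_vector`, the use of similarity-matrix rows as per-entity features, and the toy matrices. The latent features would come from the stacked autoencoder, which is not shown.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def pair_feature_vector(
    mirna_idx: int,
    disease_idx: int,
    mirna_sim: Matrix,       # miRNA functional similarity (one row per miRNA)
    mirna_latent: Matrix,    # miRNA latent features (assumed autoencoder output)
    disease_sim: Matrix,     # disease semantic similarity (one row per disease)
    disease_latent: Matrix,  # disease latent features (assumed autoencoder output)
) -> List[float]:
    """Build one feature vector for a miRNA-disease pair.

    Assumption: the four components are simply concatenated in the order
    listed in the abstract. The paper may combine them differently.
    """
    return (
        list(mirna_sim[mirna_idx])
        + list(mirna_latent[mirna_idx])
        + list(disease_sim[disease_idx])
        + list(disease_latent[disease_idx])
    )


# Toy data (hypothetical values): 2 miRNAs, 3 diseases.
mirna_sim = [[1.0, 0.2], [0.2, 1.0]]
mirna_latent = [[0.5], [0.1]]
disease_sim = [[1.0, 0.3, 0.0], [0.3, 1.0, 0.4], [0.0, 0.4, 1.0]]
disease_latent = [[0.7, 0.9], [0.2, 0.4], [0.6, 0.8]]

# Feature vector for the pair (miRNA 0, disease 2).
vec = pair_feature_vector(0, 2, mirna_sim, mirna_latent, disease_sim, disease_latent)
print(len(vec), vec)
```

Vectors built this way for known and unlabelled pairs would then serve as the input rows for a gradient-boosted classifier such as XGBoost, whose predicted scores rank candidate associations.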
- Published
- 2021