1. Data mining-based model and risk prediction of colorectal cancer by using secondary health data: A systematic review
- Author
Lei Tao, Jiawei Bai, Hailun Liang, Lei Yang, Leiyu Shi, Wuyang Yang, Jiafu Ji, Da Zheng, and Ning Wang
- Subjects
Cancer Research, Colorectal cancer, business.industry, Big data, Decision tree, MEDLINE, Bayesian network, colorectal cancer, disease detection, data mining, Cochrane Library, medicine.disease, computer.software_genre, Checklist, Oncology, Interquartile range, Systematic review, medicine, Original Article, Data mining, business, computer
- Abstract
Objective: Prevention and early detection of colorectal cancer (CRC) can increase the chances of successful treatment and reduce the disease burden. Various data mining technologies have been used to strengthen the early detection of CRC in primary care, but evidence synthesis on the effectiveness of these models is scant. This systematic review synthesizes studies that examine the effect of data mining on improving risk prediction of CRC. Methods: The PRISMA framework guided the conduct of this study. We obtained papers via PubMed, Cochrane Library, EMBASE and Google Scholar. Quality appraisal was performed using Downs and Black’s quality checklist. To evaluate the performance of the included models, the values of specificity and sensitivity were compared, the values of area under the curve (AUC) were plotted, and the median of the overall AUC of the included studies was computed. Results: A total of 316 studies were reviewed in full text, and seven articles were included. The included studies implemented techniques including artificial neural networks, Bayesian networks and decision trees. Six articles reported the overall model accuracy. Overall, the median AUC was 0.8243 [interquartile range (IQR): 0.8050−0.8886]. In the two articles that reported comparisons with traditional models, the data mining method performed better than the traditional models, with a best AUC improvement of 10.7%. Conclusions: The adoption of data mining technologies for CRC detection is at an early stage. The limited number of included articles and the heterogeneity of those studies imply that more rigorous research is needed to further investigate the techniques’ effects.
- Published
- 2020