Combining clustering and classification ensembles: A novel pipeline to identify breast cancer profiles
- Source :
- Agrawal, U, Soria, D, Wagner, C, Garibaldi, J, Ellis, I O, Bartlett, J M S, Cameron, D, Rakha, E A & Green, A R 2019, 'Combining Clustering and Classification Ensembles: A Novel Pipeline to Identify Breast Cancer Profiles', Artificial Intelligence in Medicine. https://doi.org/10.1016/j.artmed.2019.05.002
- Publication Year :
- 2019
- Abstract :
- Breast cancer is one of the most common causes of cancer death in women and is a very complex disease with varied molecular alterations. To assist breast cancer prognosis, classifying patients into biological groups is of great significance for treatment strategies. Recent studies have used an ensemble of multiple clustering algorithms to elucidate the most characteristic biological groups of breast cancer. However, combining various clustering methods left a number of patients unclustered. A framework is therefore still needed that can assign as many unclustered (i.e. biologically diverse) patients as possible to one of the identified groups in order to improve classification. In this paper we develop a novel classification framework that introduces a new ensemble classification stage after the ensemble clustering stage to target the unclustered patients. The result is a step-by-step pipeline that couples ensemble clustering with ensemble classification to identify core groups and the data distribution within them, and to improve the final classification results by targeting the unclustered data. The proposed pipeline is applied to a novel real-world breast cancer dataset, and its robustness and stability are then examined by testing it on standard datasets. The results show that the presented framework yields improved classification. Finally, the results have been verified using statistical tests, visualisation techniques, cluster quality assessment and interpretation from clinical experts.
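The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the cluster labels from the different clustering runs have already been aligned, treats a patient as "core" only when every run agrees, and uses three simple stand-in classifiers (nearest centroid, 1-NN and 3-NN) with a majority vote. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import dist
from statistics import fmean


def consensus_labels(labelings):
    """Agreed label per sample, or None where the clusterings disagree."""
    return [votes[0] if len(set(votes)) == 1 else None
            for votes in zip(*labelings)]


def nearest_centroid(train_x, train_y, x):
    """Label of the class whose mean point is closest to x."""
    groups = {}
    for point, label in zip(train_x, train_y):
        groups.setdefault(label, []).append(point)
    centroids = {label: [fmean(col) for col in zip(*pts)]
                 for label, pts in groups.items()}
    return min(centroids, key=lambda label: dist(centroids[label], x))


def knn(train_x, train_y, x, k):
    """Majority label among the k training points closest to x."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def assign_unclustered(points, labelings):
    """Stage 1: find core samples. Stage 2: classify the remainder by vote."""
    core = consensus_labels(labelings)
    train_x = [p for p, c in zip(points, core) if c is not None]
    train_y = [c for c in core if c is not None]
    final = []
    for p, c in zip(points, core):
        if c is None:
            votes = [nearest_centroid(train_x, train_y, p),
                     knn(train_x, train_y, p, 1),
                     knn(train_x, train_y, p, 3)]
            # On a three-way tie, Counter keeps the first vote cast.
            c = Counter(votes).most_common(1)[0][0]
        final.append(c)
    return core, final


# Toy data (hypothetical): seven samples, three aligned clustering runs.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
          (1.0, 1.2)]
labelings = [
    ["A", "A", "A", "B", "B", "B", "A"],
    ["A", "A", "A", "B", "B", "B", "B"],
    ["A", "A", "A", "B", "B", "B", "A"],
]
core, final = assign_unclustered(points, labelings)
print(core)
print(final)
```

In this toy case the last sample is left unclustered by the first stage because one run disagrees, and the classifier vote in the second stage then places it in one of the core groups. A real study would replace the stand-in classifiers and the strict agreement rule with the methods described in the paper.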
- Subjects :
- Computer science
Pipeline (computing)
Stability (learning theory)
Medicine (miscellaneous)
Datasets as Topic
Breast Neoplasms
Refining cluster results
Machine learning
03 medical and health sciences
0302 clinical medicine
Breast cancer
Artificial Intelligence
Robustness (computer science)
Pipeline
Ensemble Classification
Cluster Analysis
Humans
Class level fusion
030304 developmental biology
Statistical hypothesis testing
0303 health sciences
R858
Identification (information)
Visualisation techniques
Ensemble Clustering
Female
Artificial intelligence
Neural Networks, Computer
030217 neurology & neurosurgery
Algorithms
Details
- ISSN :
- 1873-2860 and 0933-3657
- Volume :
- 97
- Database :
- OpenAIRE
- Journal :
- Artificial Intelligence in Medicine
- Accession number :
- edsair.doi.dedup.....802cebfb22b404e9439d9e2287f809ed
- Full Text :
- https://doi.org/10.1016/j.artmed.2019.05.002