1. Contrastive Multiple Instance Learning: An Unsupervised Framework for Learning Slide-Level Representations of Whole Slide Histopathology Images without Labels.
- Author
-
Tavolara, Thomas E., Gurcan, Metin N., and Niazi, M. Khalid Khan
- Subjects
- *
LUNG cancer , *REGRESSION analysis , *CATHETER ablation , *LEARNING strategies , *CONCEPTUAL structures , *DIAGNOSTIC imaging , *CELL proliferation , *HISTOLOGY - Abstract
Simple Summary: Recent AI methods in the automated analysis of histopathological imaging data associated with cancer have trended towards less supervision by humans. Yet, there are circumstances when humans cannot lend a hand to AI. Hence, we present an unsupervised method to learn meaningful features from histopathological imaging data. We applied our method to non-small cell lung cancer subtyping as a classification prototype and breast cancer proliferation scoring as a regression prototype. Our AI method achieves high accuracy and correlation, respectively. Additional experiments aimed at reducing the amount of available data demonstrated that the learned features are robust. Overall, our AI method approaches the analysis of histopathological imaging data in a novel manner, where meaningful features can be learned from without the need for any supervision my humans. The proposed method stands to benefit the field, as it theoretically enables researchers to benefit from completely raw histopathology imaging data. Recent methods in computational pathology have trended towards semi- and weakly-supervised methods requiring only slide-level labels. Yet, even slide-level labels may be absent or irrelevant to the application of interest, such as in clinical trials. Hence, we present a fully unsupervised method to learn meaningful, compact representations of WSIs. Our method initially trains a tile-wise encoder using SimCLR, from which subsets of tile-wise embeddings are extracted and fused via an attention-based multiple-instance learning framework to yield slide-level representations. The resulting set of intra-slide-level and inter-slide-level embeddings are attracted and repelled via contrastive loss, respectively. This resulted in slide-level representations with self-supervision. We applied our method to two tasks— (1) non-small cell lung cancer subtyping (NSCLC) as a classification prototype and (2) breast cancer proliferation scoring (TUPAC16) as a regression prototype—and achieved an AUC of 0.8641 ± 0.0115 and correlation (R2) of 0.5740 ± 0.0970, respectively. Ablation experiments demonstrate that the resulting unsupervised slide-level feature space can be fine-tuned with small datasets for both tasks. Overall, our method approaches computational pathology in a novel manner, where meaningful features can be learned from whole-slide images without the need for annotations of slide-level labels. The proposed method stands to benefit computational pathology, as it theoretically enables researchers to benefit from completely unlabeled whole-slide images. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF