1. Abstract PD11-02: Subtyping invasive carcinomas and high-risk lesions for machine learning based breast pathology
- Author
Matthew G Hanna, Patricia Raciti, Alican Bozkurt, Ran Godrich, Julian Viret, Donghun Lee, Philippe Mathieu, Matthew Lee, Eugene Vorontsov, Tomer Sabo, Felipe C Geyer, Jorge S Reis-Filho, Leo Grady, Thomas Fuchs, and Christopher Kanan
- Subjects
Cancer Research, Oncology
- Abstract
Background. The female mammary gland can develop a myriad of epithelial proliferative lesions, including high-risk lesions and in situ and invasive carcinomas. Identification of these pre-neoplastic and neoplastic conditions in biopsy specimens is crucial for proper patient management and may pose diagnostic challenges for pathologists. Recent research has shown that machine learning algorithms applied to whole slide images (WSIs) can accurately detect and grade various cancers; herein, we devise and test a system that classifies the most common pre-neoplastic and neoplastic conditions of the female breast from WSIs.

Design. De-identified slides were retrieved from the MSK database and scanned on Leica AT2 whole slide scanners (20x; 0.5 µm/pixel). Clinical diagnostic metadata were extracted from the pathology reports. Using a multi-label multiple-instance learning (ML-MIL) approach, an SE-ResNet50 convolutional neural network (CNN) was trained to classify atypical lobular hyperplasia (ALH), atypical ductal hyperplasia (ADH), lobular carcinoma in situ (LCIS), ductal carcinoma in situ (DCIS), invasive lobular carcinoma (ILC), and invasive ductal carcinoma (IDC). In addition, the model was trained on morphological subtypes including apocrine, mucinous, solid papillary, micropapillary, and tubular carcinoma. The system takes a WSI as input and outputs a slide-level class and a heatmap indicating the presence of the trained classes. A validation dataset separate from the training set was used to assess the performance of the trained model.

Results. The CNN was trained on 9,751 surgical specimens (biopsy, 6,289; excision, 3,462) comprising 40,637 slides. The system was validated on 3,183 breast specimens (biopsy, 1,934; excision, 1,249) comprising 11,447 digital slides that were not included in the training of the CNN model. Validation performance in terms of area under the receiver operating characteristic curve (AUROC) for each class is shown in Table 1.

Conclusion.
The trained CNN performed well in identifying the presence of ADH, ALH, DCIS, IDC, ILC, and LCIS, as well as apocrine, micropapillary, mucinous, solid papillary, and tubular carcinomas, with validation AUROCs ranging from 0.903 to 0.994 (Table 1). Further studies expanding the classes to include all clinically relevant lesions and morphologies are underway. In addition, the same approach can be used to detect microinvasions and calcifications in breast tissue.

Table 1. Area Under Receiver Operating Characteristic Curve for Breast Lesion Classes

| Class | Num. Positive (specimens) | Num. Negative (specimens) | AUROC |
|---|---|---|---|
| ADH | 247 | 1864 | 0.903 |
| ALH | 209 | 1886 | 0.950 |
| LCIS | 175 | 2008 | 0.958 |
| DCIS | 819 | 2092 | 0.956 |
| IDC | 521 | 2662 | 0.956 |
| ILC | 71 | 3112 | 0.934 |
| Apocrine | 24 | 3159 | 0.931 |
| Micropapillary | 172 | 3011 | 0.927 |
| Mucinous | 12 | 3171 | 0.994 |
| Solid Papillary | 15 | 3168 | 0.908 |
| Tubular carcinoma | 8 | 3175 | 0.990 |

Citation Format: Matthew G Hanna, Patricia Raciti, Alican Bozkurt, Ran Godrich, Julian Viret, Donghun Lee, Philippe Mathieu, Matthew Lee, Eugene Vorontsov, Tomer Sabo, Felipe C Geyer, Jorge S Reis-Filho, Leo Grady, Thomas Fuchs, Christopher Kanan. Subtyping invasive carcinomas and high-risk lesions for machine learning based breast pathology [abstract]. In: Proceedings of the 2021 San Antonio Breast Cancer Symposium; 2021 Dec 7-10; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2022;82(4 Suppl):Abstract nr PD11-02.
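The per-class AUROC evaluation described in the abstract can be illustrated with a small sketch. The Python below is not the authors' implementation. It assumes that tile-level class probabilities are max-pooled into one score per class (the abstract does not state which aggregation was used) and computes AUROC per class as the fraction of positive/negative pairs ranked correctly, with ties counted as one half. The names `CLASSES`, `slide_scores`, `auroc`, and `per_class_auroc`, and all toy numbers, are hypothetical.

```python
from typing import Dict, List, Sequence

# Hypothetical two-class subset of the lesion classes named in the abstract.
CLASSES = ["ADH", "IDC"]


def slide_scores(tile_probs: Sequence[Sequence[float]]) -> List[float]:
    """Max-pool per-class tile probabilities into one score per class.

    Assumption: max pooling stands in for the unspecified MIL aggregation.
    """
    if not tile_probs:
        raise ValueError("a slide needs at least one tile")
    return [max(column) for column in zip(*tile_probs)]


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def per_class_auroc(
    slides: Sequence[Sequence[Sequence[float]]],
    labels: Sequence[Sequence[int]],
) -> Dict[str, float]:
    """Score each class independently, as in a multi-label setting."""
    scores = [slide_scores(tiles) for tiles in slides]
    return {
        name: auroc([s[k] for s in scores], [y[k] for y in labels])
        for k, name in enumerate(CLASSES)
    }


# Toy data: four slides, two tiles each, one [p_ADH, p_IDC] pair per tile.
slides = [
    [[0.10, 0.90], [0.20, 0.30]],
    [[0.80, 0.10], [0.30, 0.20]],
    [[0.40, 0.60], [0.10, 0.10]],
    [[0.05, 0.70], [0.30, 0.10]],
]
labels = [[0, 1], [1, 0], [1, 1], [0, 0]]

print(per_class_auroc(slides, labels))  # {'ADH': 1.0, 'IDC': 0.75}
```

Because each class is scored on its own positive and negative sets, a slide may be positive for several classes at once, which is what distinguishes the multi-label setting from single-label classification.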
- Published
- 2022