Fully Automatic Deep Learning in Bi-institutional Prostate Magnetic Resonance Imaging: Effects of Cohort Size and Heterogeneity
- Source :
- Investigative radiology. 56(12)
- Publication Year :
- 2021
Abstract
- BACKGROUND: The potential of deep learning to support radiologist prostate magnetic resonance imaging (MRI) interpretation has been demonstrated.
PURPOSE: The aim of this study was to evaluate the effects of increased and diversified training data (TD) on deep learning performance for detection and segmentation of clinically significant prostate cancer-suspicious lesions.
MATERIALS AND METHODS: In this retrospective study, biparametric (T2-weighted and diffusion-weighted) prostate MRI acquired with multiple 1.5-T and 3.0-T MRI scanners in consecutive men was used for training and testing of prostate segmentation and lesion detection networks. Ground truth was the combination of targeted and extended systematic MRI-transrectal ultrasound fusion biopsies, with significant prostate cancer defined as International Society of Urological Pathology grade group greater than or equal to 2. U-Nets were internally validated on full, reduced, and PROSTATEx-enhanced training sets and subsequently externally validated on the institutional test set and the PROSTATEx test set. U-Net segmentation was calibrated to clinically desired levels in cross-validation, and test performance was subsequently compared using sensitivities, specificities, predictive values, and Dice coefficient.
RESULTS: One thousand four hundred eighty-eight institutional examinations (median age, 64 years; interquartile range, 58-70 years) were temporally split into training (2014-2017, 806 examinations, supplemented by 204 PROSTATEx examinations) and test (2018-2020, 682 examinations) sets. In the test set, Prostate Imaging-Reporting and Data System (PI-RADS) cutoffs greater than or equal to 3 and greater than or equal to 4 on a per-patient basis had sensitivity of 97% (241/249) and 90% (223/249) at specificity of 19% (82/433) and 56% (242/433), respectively. The full U-Net had corresponding sensitivity of 97% (241/249) and 88% (219/249) with specificity of 20% (86/433) and 59% (254/433), not statistically different from PI-RADS (P > 0.3 for all comparisons). U-Net trained using a reduced set of 171 consecutive examinations achieved inferior performance (P < 0.001). PROSTATEx training enhancement did not improve performance. Dice coefficients were 0.90 for prostate and 0.42/0.53 for MRI lesion segmentation at PI-RADS category 3/4 equivalents.
CONCLUSIONS: In a large institutional test set, U-Net confirms similar performance to clinical PI-RADS assessment and benefits from more TD, with neither institutional nor PROSTATEx performance improved by adding multiscanner or bi-institutional TD.
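The per-patient percentages in the abstract can be recomputed from the fractions it reports. The following is a minimal Python sketch of that arithmetic, not the authors' code: the names `Counts`, `from_fractions`, `REPORTED`, and `dice` are hypothetical and introduced only for illustration. The predictive values it prints follow arithmetically from the reported counts but are not stated in the abstract. The `dice` helper uses the standard definition 2|A∩B| / (|A| + |B|) on flat binary masks; returning 1.0 for two empty masks is an assumed convention.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Counts:
    """Per-patient confusion-matrix counts."""
    tp: int  # significant cancer, flagged positive
    fn: int  # significant cancer, missed
    tn: int  # no significant cancer, flagged negative
    fp: int  # no significant cancer, flagged positive


def from_fractions(sens_num: int, sens_den: int,
                   spec_num: int, spec_den: int) -> Counts:
    """Build counts from 'sensitivity a/b' and 'specificity c/d' fractions."""
    return Counts(tp=sens_num, fn=sens_den - sens_num,
                  tn=spec_num, fp=spec_den - spec_num)


def sensitivity(c: Counts) -> float:
    return c.tp / (c.tp + c.fn)


def specificity(c: Counts) -> float:
    return c.tn / (c.tn + c.fp)


def ppv(c: Counts) -> float:
    """Positive predictive value."""
    return c.tp / (c.tp + c.fp)


def npv(c: Counts) -> float:
    """Negative predictive value."""
    return c.tn / (c.tn + c.fn)


def dice(a: Sequence[int], b: Sequence[int]) -> float:
    """Dice coefficient of two equally sized flat binary masks (0/1 values)."""
    if len(a) != len(b):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    # Assumed convention: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2 * overlap / total


# Fractions exactly as reported in the abstract (test set, 682 examinations).
REPORTED = {
    "PI-RADS >= 3": from_fractions(241, 249, 82, 433),
    "PI-RADS >= 4": from_fractions(223, 249, 242, 433),
    "U-Net (PI-RADS 3 equivalent)": from_fractions(241, 249, 86, 433),
    "U-Net (PI-RADS 4 equivalent)": from_fractions(219, 249, 254, 433),
}

for name, c in REPORTED.items():
    print(f"{name}: sensitivity={sensitivity(c):.0%}, "
          f"specificity={specificity(c):.0%}, "
          f"PPV={ppv(c):.0%}, NPV={npv(c):.0%}")
```

Each entry sums to the 682 test examinations (249 with and 433 without significant cancer), which serves as a quick consistency check on the reported fractions.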
- Subjects :
- Male
Magnetic Resonance Spectroscopy
Prostate cancer
Deep Learning
Sørensen–Dice coefficient
Interquartile range
Prostate
Medicine
Humans
Radiology, Nuclear Medicine and imaging
Retrospective Studies
medicine.diagnostic_test
business.industry
Ultrasound
Prostatic Neoplasms
Retrospective cohort study
Magnetic resonance imaging
General Medicine
Middle Aged
medicine.disease
Magnetic Resonance Imaging
medicine.anatomical_structure
Test set
business
Nuclear medicine
Details
- ISSN :
- 15360210
- Volume :
- 56
- Issue :
- 12
- Database :
- OpenAIRE
- Journal :
- Investigative radiology
- Accession number :
- edsair.doi.dedup.....f7165be25201a4d3062917eaaff03701