1. DDLSNet: A Novel Deep Learning-Based System for Grading Funduscopic Images for Glaucomatous Damage
- Author
Rasheed, Haroon Adam, Davis, Tyler, Morales, Esteban, Fei, Zhe, Grassi, Lourdes, De Gainza, Agustina, Nouri-Mahdavi, Kouros, and Caprioli, Joseph
- Subjects
Biomedical and Clinical Sciences, Ophthalmology and Optometry, Bioengineering, Prevention, Clinical Research, DDLS, Glaucoma, Neural network, Segmentation, ARW, absent rim width, CI, confidence interval, DDLS, disc damage likelihood scale, DiscNet, disc size classification model, MAE, mean average error, ODP, optic disc photograph, RimIoU, rim intersection over union, RimNet, rim segmentation model, mRDR, minimum rim-to-disc ratio
- Abstract
Purpose: To report an image analysis pipeline, DDLSNet, consisting of a rim segmentation (RimNet) branch and a disc size classification (DiscNet) branch to automate estimation of the disc damage likelihood scale (DDLS).
Design: Retrospective observational study.
Participants: RimNet and DiscNet were developed with 1208 and 11 536 optic disc photographs (ODPs), respectively. DDLSNet performance was evaluated on 120 ODPs from the RimNet test set, for which the DDLS scores were graded by clinicians. Reproducibility was evaluated on a group of 781 eyes, each with 2 ODPs taken within 4 years of each other.
Methods: Disc damage likelihood scale calculation requires an estimate of optic disc size, provided by DiscNet (VGG19 network), and the minimum rim-to-disc ratio (mRDR) or absent rim width (ARW), provided by RimNet (InceptionV3/LinkNet segmentation model). To build RimNet's dataset, glaucoma specialists marked optic disc rim and cup boundaries on ODPs. The "ground truth" mRDR or ARW was calculated. For DiscNet's dataset, corresponding OCT images provided "ground truth" disc size. For both RimNet and DiscNet, ODPs were split 80/10/10 into training, validation, and test sets, respectively. DDLSNet estimation was tested against manual grading of DDLS by clinicians, with the average score used as "ground truth." Reproducibility of DDLSNet grading was evaluated by repeating DDLS estimation on a dataset of nonprogressing paired ODPs taken at separate times.
Main outcome measures: The main outcome measure was a weighted kappa score between clinicians and the DDLSNet pipeline, with agreement defined as a difference of ± 1 DDLS score.
Results: RimNet achieved an mRDR mean absolute error (MAE) of 0.04 (± 0.03) and an ARW MAE of 48.9 (± 35.9) degrees when compared to clinician segmentations. DiscNet achieved 73% (95% confidence interval [CI]: 70%, 75%) classification accuracy. DDLSNet achieved an average weighted kappa agreement of 0.54 (95% CI: 0.40, 0.68) compared to clinicians. Average interclinician agreement was 0.52 (95% CI: 0.49, 0.56). Reproducibility testing demonstrated that 96% of ODP pairs had a difference of ≤ 1 DDLS score.
Conclusions: DDLSNet achieved moderate agreement with clinicians for DDLS grading. This novel approach illustrates the feasibility of automated ODP grading for assessing glaucoma severity. Further improvements may be achieved by increasing the sample size of incomplete rims, expanding the hyperparameter search, and increasing the agreement of clinicians grading ODPs.
- Published
- 2023
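
The abstract's main outcome measure is a weighted kappa between clinician and DDLSNet scores. As a rough illustration only, the Python sketch below computes Cohen's weighted kappa for two lists of integer scores. The function names (`weighted_kappa`, `tolerance_weight`), the linear default weights, and the ± 1 tolerance weighting are assumptions made for this example; the abstract does not state which weighting scheme the authors used, and this is not their implementation.

```python
from collections import Counter


def tolerance_weight(tol=1):
    """Hypothetical weighting: no penalty when scores differ by <= tol."""
    return lambda i, j: 0.0 if abs(i - j) <= tol else 1.0


def weighted_kappa(rater_a, rater_b, categories, weight=None):
    """Cohen's weighted kappa for two raters over ordered integer categories.

    `weight(i, j)` is a disagreement penalty in [0, 1]; 0 means full agreement.
    """
    categories = list(categories)
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    if len(categories) < 2:
        raise ValueError("need at least two categories")
    if weight is None:
        # Assumed default: linear weights, penalty grows with score distance.
        span = max(categories) - min(categories)
        weight = lambda i, j: abs(i - j) / span

    n = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts of score pairs
    marg_a = Counter(rater_a)                  # marginal counts, rater A
    marg_b = Counter(rater_b)                  # marginal counts, rater B

    # Weighted disagreement actually observed.
    obs_dis = sum(weight(i, j) * c / n for (i, j), c in observed.items())
    # Weighted disagreement expected by chance from the marginals.
    exp_dis = sum(
        weight(i, j) * marg_a[i] * marg_b[j] / (n * n)
        for i in categories
        for j in categories
    )
    if exp_dis == 0:
        raise ValueError("kappa is undefined when expected disagreement is zero")
    return 1.0 - obs_dis / exp_dis
```

For example, `weighted_kappa(clinician_scores, model_scores, range(1, 11))` would apply linear weights, assuming integer DDLS scores from 1 to 10; passing `weight=tolerance_weight(1)` instead treats any pair of scores within ± 1 as agreement.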