A Comparison of Artificial Intelligence and Human Diabetic Retinal Image Interpretation in an Urban Health System
- Author
- Lorrie Cheng, Xiaoning Lu, Daohai Yu, Julia Grachevskaya, Yi Zhang, Nikita Mokhashi, and Jeffrey D. Henderer
- Subjects
Telemedicine, Computer science, Endocrinology, Diabetes and Metabolism, Biomedical Engineering, Bioengineering, Sensitivity and Specificity, Retina, Artificial Intelligence, Diabetes Mellitus, Photography, Internal Medicine, Humans, Mass Screening, Clinical Applications of Diabetes Technology, Aged, Diabetic Retinopathy, Retinal image, Urban Health
- Abstract
Introduction: Artificial intelligence (AI) diabetic retinopathy (DR) software has the potential to decrease time spent by clinicians on image interpretation and expand the scope of DR screening. We performed a retrospective review to compare Eyenuk’s EyeArt software (Woodland Hills, CA) to Temple Ophthalmology optometry grading using the International Classification of Diabetic Retinopathy scale.

Methods: Two hundred and sixty consecutive diabetic patients from the Temple Faculty Practice Internal Medicine clinic underwent 2-field retinal imaging. Classifications of the images by the software and the optometrist were analyzed using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and McNemar’s test. Ungradable images were analyzed to identify relationships with HbA1c, age, and ethnicity. Disagreements and a sample of 20% of agreements were adjudicated by a retina specialist.

Results: On patient-level comparison, sensitivity for the software was 100%, while specificity was 77.78%. PPV was 19.15%, and NPV was 100%. The 38 disagreements between the software and the optometrist occurred when the optometrist classified a patient’s images as non-referable while the software classified them as referable. Of these disagreements, a retina specialist agreed with the optometrist 57.9% of the time (22/38). Of the agreements, the retina specialist agreed with both the program and the optometrist 96.7% of the time (28/29). There was a significant difference in the number of ungradable photos in older patients (≥60) vs younger patients (<60).

Conclusions: The AI program showed high sensitivity with acceptable specificity for a screening algorithm. The high NPV indicates that the software is unlikely to miss DR but may refer patients unnecessarily.
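To show how the reported screening metrics and McNemar’s test fit together, the minimal Python sketch below computes sensitivity, specificity, PPV, NPV, and an exact McNemar’s test from a patient-level 2x2 table, treating the optometrist grading as the reference standard. The cell counts are assumptions back-calculated from the reported percentages (and the 38 software-referable/optometrist-non-referable disagreements) for illustration only; they are not taken from the paper’s tables.

```python
# Illustrative sketch, not the study's analysis code.
# Rows of the 2x2 table: software result; columns: optometrist (reference) result.
from statsmodels.stats.contingency_tables import mcnemar

tp = 9    # software referable,     optometrist referable   (assumed count)
fp = 38   # software referable,     optometrist non-referable (the 38 disagreements)
fn = 0    # software non-referable, optometrist referable
tn = 133  # software non-referable, optometrist non-referable (assumed count)

sensitivity = tp / (tp + fn)   # proportion of referable DR the software flags
specificity = tn / (tn + fp)   # proportion of non-referable cases the software clears
ppv = tp / (tp + fp)           # chance a software referral is truly referable
npv = tn / (tn + fn)           # chance a software "non-referable" call is correct

print(f"Sensitivity: {sensitivity:.2%}")   # 100.00%
print(f"Specificity: {specificity:.2%}")   # 77.78%
print(f"PPV:         {ppv:.2%}")           # 19.15%
print(f"NPV:         {npv:.2%}")           # 100.00%

# McNemar's test compares the two graders' paired referral decisions; it uses only
# the discordant cells (fp and fn). The exact binomial form suits small counts.
table = [[tp, fp],   # software referable:     [opt referable, opt non-referable]
         [fn, tn]]   # software non-referable: [opt referable, opt non-referable]
result = mcnemar(table, exact=True)
print(f"McNemar p-value: {result.pvalue:.3g}")
```

With these illustrative counts, the discordant cells (38 vs 0) drive the McNemar result, mirroring the abstract’s finding that all disagreements were software-referable cases the optometrist graded as non-referable.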
- Published
- 2021