Hong, Cheng, Chernyak, Victoria, Choi, Jin-Young, Lee, Sonia, Potu, Chetan, Delgado, Timoteo, Wolfson, Tanya, Gamst, Anthony, Birnbaum, Jason, Kampalath, Rony, Lall, Chandana, Lee, James, Owen, Joseph, Aguirre, Diego, Mendiratta-Lala, Mishal, Davenport, Matthew, Masch, William, Roudenko, Alexandra, Lewis, Sara, Kierans, Andrea, Hecht, Elizabeth, Bashir, Mustafa, Brancatelli, Giuseppe, Douek, Michael, Ohliger, Michael, Tang, An, Cerny, Milena, Fung, Alice, Costa, Eduardo, Corwin, Michael, McGahan, John, Kalb, Bobby, Elsayes, Khaled, Surabhi, Venkateswar, Blair, Katherine, Marks, Robert, Horvat, Natally, Best, Shaun, Ash, Ryan, Ganesan, Karthik, Kagay, Christopher, Kambadakone, Avinash, Wang, Jin, Cruite, Irene, Bijan, Bijan, Goodwin, Mark, Moura Cunha, Guilherme, Tamayo-Murillo, Dorathy, Fowler, Kathryn, and Sirlin, Claude
Background Various limitations have impacted research evaluating reader agreement for Liver Imaging Reporting and Data System (LI-RADS). Purpose To assess reader agreement of LI-RADS in an international multicenter multireader setting using scrollable images. Materials and Methods This retrospective study used deidentified clinical multiphase CT and MRI examinations and reports with at least one untreated observation from six institutions in three countries; only qualifying examinations were submitted. Examination dates were October 2017 to August 2018 at the coordinating center. One untreated observation per examination was randomly selected using observation identifiers, and its clinically assigned features were extracted from the report. The corresponding LI-RADS version 2018 category was computed as a rescored clinical read. Each examination was randomly assigned to two of 43 research readers who independently scored the observation. Agreement for an ordinal modified four-category LI-RADS scale (LR-1, definitely benign; LR-2, probably benign; LR-3, intermediate probability of malignancy; LR-4, probably hepatocellular carcinoma [HCC]; LR-5, definitely HCC; LR-M, probably malignant but not HCC specific; and LR-TIV, tumor in vein) was computed using intraclass correlation coefficients (ICCs). Agreement was also computed for dichotomized malignancy (LR-4, LR-5, LR-M, and LR-TIV), LR-5, and LR-M. Agreement was compared between research-versus-research reads and research-versus-clinical reads. Results The study population consisted of 484 patients (mean age, 62 years ± 10 [SD]; 156 women; 93 CT examinations, 391 MRI examinations). ICCs for ordinal LI-RADS, dichotomized malignancy, LR-5, and LR-M were 0.68 (95% CI: 0.61, 0.73), 0.63 (95% CI: 0.55, 0.70), 0.58 (95% CI: 0.50, 0.66), and 0.46 (95% CI: 0.31, 0.61), respectively. Research-versus-research reader agreement was higher than research-versus-clinical reader agreement for modified four-category LI-RADS (ICC, 0.68 vs 0.62, respectively; P = .03) and for dichotomized malignancy (ICC, 0.63 vs 0.53, respectively; P = .005), but not for LR-5 (P = .14) or LR-M (P = .94). Conclusion There was moderate agreement for LI-RADS version 2018 overall. For some comparisons, research-versus-research reader agreement was higher than research-versus-clinical reader agreement, indicating differences between the clinical and research environments that warrant further study. © RSNA, 2023 Supplemental material is available for this article. See also the editorials by Johnson and Galgano and Smith in this issue.