151. The determination of appropriate coefficient indices for inter-rater reliability: Using classroom observation instruments as fidelity measures in large-scale randomized research.
- Author
- Tong, Fuhui; Tang, Shifang; Irby, Beverly J.; Lara-Alecio, Rafael; and Guerrero, Cindy
- Subjects
- EDUCATIONAL programs; LIKERT scale; OBSERVATION (Educational method)
- Abstract
• Misuse of coefficient indices for inter-rater reliability of observation instruments in the literature. • Existence of the kappa paradox, illustrated with empirical data from randomized research. • Gwet's AC1 as a robust coefficient index for a multi-dimension, multi-response nominal scale. • Intraclass correlation as a robust index for Likert scales with ordinal data. • A fully crossed rating design with eight raters rating all randomly selected video clips. In this study, we address an under-studied challenge related to the accurate use and reporting of coefficient indices for inter-rater reliability (IRR) of classroom observation instruments used to measure fidelity of implementation (FOI) in educational programs serving the needs of English learners. Through empirical data obtained from a randomized controlled trial, we confirmed the existence of the kappa paradox. We further determined that Gwet's AC1 was a robust index for a multi-dimension, multi-response protocol with nominal data and that intraclass correlation was most appropriate for a Likert rating scale with ordinal data and multiple raters. We highlight the misuse of indices in the literature and call for careful consideration of robust indices, which is critical to ensuring the accuracy of observational data and FOI. [ABSTRACT FROM AUTHOR]
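The kappa paradox the abstract refers to can be sketched numerically. The following is an illustrative example (not taken from the article's data), using hypothetical counts for two raters on a binary item: with highly skewed marginals, Cohen's kappa can be near zero or negative even though the raters agree on 90% of items, while Gwet's AC1 remains high.

```python
# Hypothetical 2x2 agreement table (not from the article):
# a = both raters 'yes', b = rater1 yes / rater2 no,
# c = rater1 no / rater2 yes, d = both raters 'no'.

def cohens_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 agreement table."""
    n = a + b + c + d
    p_o = (a + d) / n                    # observed agreement
    r1_yes, r2_yes = (a + b) / n, (a + c) / n
    # chance agreement from each rater's own marginals
    p_e = r1_yes * r2_yes + (1 - r1_yes) * (1 - r2_yes)
    return (p_o - p_e) / (1 - p_e)

def gwets_ac1(a, b, c, d):
    """Gwet's AC1 for the same 2x2 table (binary case)."""
    n = a + b + c + d
    p_o = (a + d) / n
    pi = ((a + b) / n + (a + c) / n) / 2  # mean 'yes' prevalence
    p_e = 2 * pi * (1 - pi)               # AC1 chance-agreement term
    return (p_o - p_e) / (1 - p_e)

# Skewed marginals: 90 joint 'yes', 5 + 5 disagreements, 0 joint 'no'.
# Observed agreement is 0.90, yet kappa ≈ -0.053 while AC1 ≈ 0.890.
kappa = cohens_kappa(90, 5, 5, 0)
ac1 = gwets_ac1(90, 5, 5, 0)
```

The divergence arises because kappa's chance-agreement term is inflated when both raters use one category almost exclusively, whereas AC1's correction depends on prevalence in a way that stays stable under skew.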
- Published
- 2020