Shafaghi Afshin, Dehbozorgian Marzie, Omidvar Bita, Akbari Roghaye, Sadat Mohamad Ali, Vakili Hasan, Kasmaee Vahid, Javadzade Hamid, Pishbin Elham, Molaee Nezar, Abadi Ali Arhami, Abbasi Hamidreza, Kojuri Javad, Moghadami Mohsen, Amini Mitra, Jafari Mohammad, Monajemi Alireza, Arabshahi Kamran, Adibi Peyman, and Charlin Bernard
Abstract Background Clinical reasoning plays a major role in the ability of doctors to make a diagnosis and reach treatment decisions. This paper describes the use of four clinical reasoning tests in the second National Medical Science Olympiad in Iran: key features (KF), script concordance (SCT), clinical reasoning problems (CRP) and comprehensive integrative puzzles (CIP). The purpose of the study was to design a multi-instrument, multiple-role approach to assessing clinical reasoning, grounded in a theoretical framework, and to investigate the combined use of these tests in the Olympiad. Within this framework, KF was used to measure data gathering, CRP to measure hypothesis formation, and SCT and CIP to measure hypothesis evaluation. Methods A bank of clinical reasoning test items was developed for emergency medicine by a scientific expert committee representing all the medical schools in the country. These items were pretested by a reference group and the results were analyzed to select items that could be omitted. Then 135 top-ranked medical students from 45 medical universities in Iran participated in the clinical domain of the Olympiad. The reliability of each test was calculated by Cronbach's alpha. Item difficulty and the correlation between each item and the total score were measured. The correlation between the students' final grade and each of the clinical reasoning tests was calculated, as was the correlation between final grades and another measure of knowledge, i.e., the students' grade point average. Results The combined reliability for all four clinical reasoning tests was 0.91. Of the four clinical reasoning tests we compared, reliability was highest for CIP (0.91). The reliability was 0.83 for KF, 0.78 for SCT and 0.71 for CRP. Most of the tests had an acceptable item difficulty level between 0.2 and 0.8. The correlation between the score for each item and the total test score for each of the four tests was positive.
The correlations between scores for each test and total score were highest for KF and CIP. The correlation between scores for each test and grade point average was low to intermediate for all four of the tests. Conclusion The combination of these four clinical reasoning tests is a reliable evaluation tool that can be implemented to assess clinical reasoning skills in talented undergraduate medical students; however, these data may not be generalizable to the whole medical student population. The CIP and KF tests showed the greatest potential to measure clinical reasoning skills. Grade point averages did not necessarily predict performance in the clinical domain of the national competitive examination for medical school students.
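The item statistics named in the abstract (Cronbach's alpha, item difficulty, and item-total correlation) can be illustrated with a minimal sketch. This is not the authors' analysis code. The function names are hypothetical, the score matrix is invented toy data rather than Olympiad results, and it assumes dichotomously scored items, sample variances, and an uncorrected item-total correlation (the item is included in the total).

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a students-by-items score matrix (list of rows)."""
    k = len(scores[0])  # number of items
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def item_difficulty(scores, max_score=1.0):
    """Mean score on each item as a fraction of the maximum item score."""
    k = len(scores[0])
    return [mean(row[i] for row in scores) / max_score for i in range(k)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ssx = sum((a - mx) ** 2 for a in x)
    ssy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(ssx * ssy)


def item_total_correlations(scores):
    """Correlation of each item with the total score (item included in total)."""
    totals = [sum(row) for row in scores]
    k = len(scores[0])
    return [pearson([row[i] for row in scores], totals) for i in range(k)]


# Invented toy data: 6 students (rows) x 4 items (columns), 1 = correct.
toy_scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 0],
]

alpha = cronbach_alpha(toy_scores)
difficulty = item_difficulty(toy_scores)
item_total = item_total_correlations(toy_scores)

# Flag items whose difficulty lies in the 0.2 to 0.8 range cited in the abstract.
acceptable = [0.2 <= d <= 0.8 for d in difficulty]

print("alpha:", round(alpha, 3))
print("difficulty:", [round(d, 3) for d in difficulty])
print("item-total r:", [round(r, 3) for r in item_total])
print("acceptable difficulty:", acceptable)
```

Under these assumptions, an item with a difficulty outside 0.2 to 0.8 or a non-positive item-total correlation would be a candidate for review. For partial-credit formats such as SCT, `max_score` would need to reflect the maximum credit per item.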