Descriptor: "Educational Measurement methods" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Educational Measurement methods"' showing total 11,368 results

Start Over Descriptor "Educational Measurement methods"

11,368 results on '"Educational Measurement methods"'

1. Comparing the performance of ChatGPT-3.5-Turbo, ChatGPT-4, and Google Bard with Iranian students in pre-internship comprehensive exams.

Author: Zare S, Vafaeian S, Amini M, Farhadi K, Vali M, and Golestani A
Subjects: Iran, Humans, Language, Educational Measurement methods, Internship and Residency, Students, Medical
Abstract: This study aims to measure the performance of different AI-language models in three sets of pre-internship medical exams and to compare their performance with Iranian medical students. Three sets of Persian pre-internship exams were used, along with their English translation (six sets in total). In late September 2023, we sent requests to ChatGPT-3.5-Turbo-0613, GPT-4-0613, and Google Bard in both Persian and English languages (excluding questions with any visual content) with each query in a new session and reviewed their responses. GPT models produced responses at varying levels of randomness. In both Persian and English tests, GPT-4 ranked first and obtained the highest score in all exams and different levels of randomness. While Google Bard scored below average on the Persian exams (still in an acceptable range), ChatGPT-3.5 failed all exams. There was a significant difference between the Large Language Models (LLMs) in Persian exams. While GPT-4 yielded the best scores on the English exams, the distinction between all LLMs and students was not statistically significant. The GPT-4 model outperformed students and other LLMs in medical exams, highlighting its potential application in the medical field. However, more research is needed to fully understand and address the limitations of using these models., Competing Interests: Declarations Competing interests The authors declare no competing interests., (© 2024. The Author(s).)
Published: 2024
Full Text: View/download PDF

2. The Effect of Virtual Laboratories on the Academic Achievement of Undergraduate Chemistry Students: Quasi-Experimental Study.

Author: Bazie H, Lemma B, Workneh A, and Estifanos A
Subjects: Humans, Male, Universities, Female, Young Adult, Educational Measurement methods, Virtual Reality, Students psychology, Students statistics & numerical data, Chemistry education, Laboratories, Academic Success
Abstract: Background: Experimentation is crucial in chemistry education as it links practical experience with theoretical concepts. However, practical chemistry courses typically rely on real laboratory experiments and often face challenges such as limited resources, equipment shortages, and logistical constraints in university settings. To address these challenges, computer-based laboratories have been introduced as a potential solution, offering electronic simulations that replicate real laboratory experiences., Objective: This study examines the effect of virtual laboratories on the academic achievement of undergraduate chemistry students and evaluates their potential as a viable alternative or complement to traditional laboratory-based instruction., Methods: A quasi-experimental design was implemented to examine the cause-and-effect relationship between instructional methods and student outcomes. The study involved 60 fourth-year BSc chemistry students from Dilla University, divided into 3 groups: a real laboratory group (n=20), which performed real laboratory experiments; a virtual group (n=20), which used virtual laboratory simulations; and a lecture group (n=20), which received lecture-based instruction. Quantitative data were collected through tests administered before and after the intervention to assess academic performance. The data analysis used descriptive and inferential statistics, such as means and SDs, 1-way ANOVA, the Tukey honestly significant difference test, and independent-sample t tests (2-tailed), with a P value of .05 set for determining statistical significance., Results: Before the intervention, the results indicated no significant differences in academic achievement among the 3 groups (P=.99). However, after the intervention, notable differences were observed in student performance across the methods. The real laboratory group had the highest mean posttest score (mean 62.6, SD 10.7), followed by the virtual laboratory group (mean 55.5, SD 6.8) and the lecture-only group, which had the lowest mean score (mean 43.7, SD 11.5). ANOVA results confirmed significant differences between the groups (F 2,57 =18.429; P<.001). The Tukey post hoc test further revealed that the real laboratory group significantly outperformed the lecture-only group (mean difference 18.88; P<.001), while the virtual laboratory group also performed significantly better than the lecture-only group (mean difference 11.7; P=.001). However, no statistically significant difference was found between the real laboratory and virtual laboratory groups (mean difference 7.12; P=.07). In addition, gender did not significantly influence performance in the virtual laboratory group (P=.21), with no substantial difference in posttest scores between male and female students., Conclusions: These findings suggest that computer-based laboratories are a viable and effective alternative when real laboratories are unavailable, enhancing learning outcomes when compared with traditional lecture-based methods. Therefore, universities should consider integrating computer-based laboratories into their practical chemistry curricula to provide students with interactive and engaging learning experiences, especially when physical laboratories are inaccessible., (©Hiwot Bazie, Bekele Lemma, Anteneh Workneh, Ashebir Estifanos. Originally published in JMIR Formative Research (https://formative.jmir.org), 15.11.2024.)
Published: 2024
Full Text: View/download PDF

3. Nursing students' experiences of professional competence evaluation by Objective Structured Clinical examination method: a qualitative content analysis study.

Author: Alizadeh M, Behshid M, Cheraghi R, and Dehghani G
Subjects: Humans, Clinical Competence standards, Female, Education, Nursing, Baccalaureate, Male, Professional Competence standards, Young Adult, Students, Nursing psychology, Qualitative Research, Educational Measurement methods
Abstract: Background: Clinical education is a significant part of medical education, and paying attention to clinical evaluation is important. One of the challenges in the teaching-learning process is evaluating students' performance. Learners, as the main stakeholders of the educational system, may have different experiences of evaluation quality. Awareness of these experiences is effective in improving the quality of clinical evaluation. Therefore, this study was conducted to "explore the experiences of nursing students for evaluation of professional competence by using the objective structured clinical examination (OSCE)"., Method: This study was conducted with a qualitative descriptive research approach and conventional content analysis method in 2022-2024. The participants included 12 undergraduate nursing students at Maragheh University of Medical Sciences, who were selected by purposeful sampling, and their experiences were collected using semi-structured and in-depth interviews until reaching data saturation., Results: The data analysis of the interviews led to the extraction of 268 primary codes, 7 subcategories, 2 categories, and 1 theme: "Credibility and stability". The category "Exam's accuracy in measuring competence " included 3 sub-categories: "Challenges in objective adaptation ", " Communication and organizational challenges for exam preparation " and " Inadequate simulation of stations and exam environment ", and the category " Exam power for repeatability" included 4 sub-categories: " Characteristics of the students ", " Lack of evaluators' skills and mastery ", " Inefficiency of the evaluation tool " and " Disturbance in executive affairs"., Conclusions: OSCE can be used in self-evaluation, creating motivation and strengthening different dimensions of students' learning, as well as discovering weaknesses and strengths for planning by managers and faculties. According to the results of this study, many factors such as management before and during the exam, characteristics of the evaluators, prevailing educational conditions in the faculty, and the method of clinical training are effective in achieving the "reliability and sustainability" required in the OSCE., Competing Interests: Declarations Ethics approval and consent to participate This article is the result of thesis number 68795 of the master’s course in medical education, which was approved by the research center for medical education of Tabriz University of Medical Sciences after obtaining ethical approval (IR.TBZMED.REC.1400.999). Written informed consent was provided by all participants. The participants were informed that they could leave the study at any step of the study and were informed that they could leave the study at any step of the study and allowed to record audio and take notes during the interview. They were assured that the recorded information, the observance, and analysis of the text of the interviews would be confidential. It was also announced that after the data analysis, the recorded voices would be omitted. The participants were reminded that if necessary, they may be referred again to complete the interviews, and if they wish, they can be informed of the general results of the study. Consent for publication Not applicable. Competing interests The authors declare no competing interests., (© 2024. The Author(s).)
Published: 2024
Full Text: View/download PDF

4. Leveraging large language models to construct feedback from medical multiple-choice Questions.

Author: Tomova M, Roselló Atanet I, Sehy V, Sieg M, März M, and Mäder P
Subjects: Humans, Language, Surveys and Questionnaires, Students, Medical, Feedback, Education, Medical methods, Formative Feedback, Educational Measurement methods
Abstract: Exams like the formative Progress Test Medizin can enhance their effectiveness by offering feedback beyond numerical scores. Content-based feedback, which encompasses relevant information from exam questions, can be valuable for students by offering them insight into their performance on the current exam, as well as serving as study aids and tools for revision. Our goal was to utilize Large Language Models (LLMs) in preparing content-based feedback for the Progress Test Medizin and evaluate their effectiveness in this task. We utilize two popular LLMs and conduct a comparative assessment by performing textual similarity on the generated outputs. Furthermore, we study via a survey how medical practitioners and medical educators assess the capabilities of LLMs and perceive the usage of LLMs for the task of generating content-based feedback for PTM exams. Our findings show that both examined LLMs performed similarly. Both have their own advantages and disadvantages. Our survey results indicate that one LLM produces slightly better outputs; however, this comes at a cost since it is a paid service, while the other is free to use. Overall, medical practitioners and educators who participated in the survey find the generated feedback relevant and useful, and they are open to using LLMs for such tasks in the future. We conclude that while the content-based feedback generated by the LLM may not be perfect, it nevertheless can be considered a valuable addition to the numerical feedback currently provided., Competing Interests: Declarations Competing interests The authors declare no competing interests., (© 2024. The Author(s).)
Published: 2024
Full Text: View/download PDF

5. International medical graduates' experiences of clinical competency assessment in postgraduate and licensing examinations: A scoping review protocol.

Author: Hynes H, Wiese A, McCarthy N, Sweeney C, Foley T, and Bennett D
Subjects: Humans, Educational Measurement methods, Educational Measurement standards, Licensure, Medical standards, Education, Medical, Graduate standards, Clinical Competence standards, Foreign Medical Graduates
Abstract: An international medical graduate (IMG) is a doctor who has received their basic medical qualification from a medical school located in a different country from that in which they practice or intend to practice. IMGs are known to face difficulties in their working lives, including differential attainment in assessment. The objective of this review is to map key concepts and types of evidence in academic and gray literature relating to international medical graduates' experiences of clinical competency assessment and to identify knowledge gaps on this topic by systematically searching, selecting, and synthesizing existing knowledge. All studies will relate to IMGs. The concept of interest will be IMGs' experiences of assessment. The context will be postgraduate, licensing or credentialing medical assessments of clinical competence. This review will be conducted in accordance with the Joanna Briggs Institute (JBI) methodology for scoping reviews. Seven electronic databases will be searched for literature published between 2009 and 2024: the Australian Education Index, British Education Index, ERIC, PubMed, PsycINFO, Scopus, and SocINDEX. Gray literature will be searched using Google, Google Scholar, and published reports from postgraduate training bodies and medical licensing organizations. Documents will be independently screened, selected, and extracted by two researchers using a piloted data-extraction tool. Data will be analyzed and presented in tables and in a narrative format. Trial resgistration: Scoping review registration: Open Science Framework: https://osf.io/8gdm7., Competing Interests: The authors have declared that no competing interests exist., (Copyright: © 2024 Hynes et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
Published: 2024
Full Text: View/download PDF

6. Development and validation of situational judgement test for assessment of behavioural competencies required for effective medical practice in Nigeria.

Author: Obi US, Chukwuma A, Agu I, Eigbiremolen G, and Mbachu C
Subjects: Humans, Nigeria, Male, Female, Educational Measurement methods, Adult, Reproducibility of Results, Clinical Competence standards, Judgment
Abstract: Background: The personal qualities of health workers determine the way health services are provided to clients. Some key personal qualities (also called behavioural competencies) of physicians that contribute to quality healthcare delivery include ethical responsibility, empathy, patient-centeredness, diligence, good judgment, respectful, teamwork, team leadership/conflict management, ability to take correction and tolerance. In this study, we developed and validated clinical scenarios (dilemmas) for assessing priority behavioural competencies for medical practice in Nigeria., Methods: Drawing on prioritized competencies generated via a scoping review and nominal group technique (NGT) exercises in a previous study in the study series, Faculty members from the University of Nigeria Teaching Hospital were consulted to develop or adapt clinical scenarios that could be used to assess these competencies in a physician. The clinical scenarios and options were framed as situational judgement tests (SJTs) and these tests were administered to a random sample of 192 undergraduate and 111 postgraduate medical doctors in a tertiary hospital in Enugu State. Using Kane's validity argument framework, we assessed scoring and generalization inferences of situational judgment tests (SJT) based on the developed scenarios., Result: Scoring inference - difficulty and discrimination index - shows that most of the SJT items are good test items and can differentiate between high and low performers. The corrected point biserial correlations show positive correlation for most of the items. Generalization inference shows the items represent the domains of interest and are internally consistent. However, few items that show poor difficulty and discrimination index were subjected to re-evaluation and possible elimination., Conclusion: This study has produced a set of valid clinical scenarios that can be used to evaluate specific behavioural competencies among trainee medical doctors. It demonstrates that SJTs can be used to assess behavioural competencies for medical practice. However, further research is needed to establish the applicability of SJT beyond the immediate context, such as the medical school, in which it is developed., Competing Interests: Declarations Ethics approval and consent to participate Ethical clearance was sought and obtained from the Health Research Ethics Committee of University of Nigeria Teaching Hospital (UNTH), Ituku-Ozalla. Written informed consent was obtained from all eligible participants having presented them with the purpose of research, their rights and measures to protect them and their data. Consent for publication Not applicable. Competing interests The authors declare no competing interests., (© 2024. The Author(s).)
Published: 2024
Full Text: View/download PDF

7. Professionalism assessment of students in clinical education through Situational Judgment Test (SJT).

Author: Hosseinpour A and Keshmiri F
Subjects: Humans, Cross-Sectional Studies, Female, Male, Reproducibility of Results, Young Adult, Students, Medical, Clinical Competence standards, Adult, Students, Health Occupations psychology, Judgment, Professionalism standards, Educational Measurement methods
Abstract: Background: The study aims to assess the situational judgment capability of students in various professions, including medicine, surgical nursing, anesthesia nursing, and emergency medical technology, using a validated and adapted Situational Judgment Test (SJT)., Methods: The cross-sectional study was conducted at Qom University of Medical Sciences in 2023-2024. The study consisted of two steps: (1) adaptation and validity assessment of the SJT in various health professions, and (2) evaluation of students' situational judgment capability using the adapted SJT. Participants included 207 students from surgical nursing, anesthesia nursing, medicine, and emergency medical technology departments, selected through stratified random sampling. The SJT included 10 constructed-response scenarios developed by Alkhuzaee and colleagues in 2022. The validation and adaptation process utilized consensus methods, and the SJT was modified to a selected response format based on a knowledge-based approach. Scores ranged from 0 to 30. Cohen's method and norm-referencing standard deviation were used to calculate a cut-off score. Data were analyzed using descriptive statistics (frequency, mean, standard deviation) and inferential statistics (independent t-test, ANOVA, and ANCOVA)., Results: The face and content validity of the adapted SJT with a selected-response format was confirmed by expert agreement. The reliability of the SJT was approved, with a Cronbach's Alpha of 0.81. The mean score of students' situational judgment in professionalism was 11.90 (SD 4.54). Students' professions significantly impacted their situational judgment scores (F (3.201) = 3.67, p = 0.01, Partial Eta Squared = 0.052)., Conclusion: The adaption and validation of SJT with a selected response format were confirmed by experts who evaluated the situational judgment capabilities of students in different clinical professions. The reliability of the SJT was acceptable-Cohen's method as a standard-setting approach assisted in identifying students needing support and training. The moderate level of situational judgment capacity confirmed the development of the educational program for the improvement of non-cognitive and non-technical skills such as judgment and decision-making in the field of professionalism., Competing Interests: Declarations Ethics approval and consent to participate This study was reviewed by the ethics committee of Shahid Sadoughi University of Medical Sciences and approved. (ID: IR.SSU.REC.1402.063). All participants were provided with information on the study and gave consent. The informed written consent forms were obtained from all participants. The work was conducted following the Declaration of Helsinki. Consent for publication ‘Not applicable’. Competing interests The authors declare no competing interests., (© 2024. The Author(s).)
Published: 2024
Full Text: View/download PDF

8. Consistency between inter-institutional panels using a three-level Angoff-standard setting in licensure tests of foreign-trained dentists in Sweden: A cohort study.

Author: Dalum J, Paulsson L, Christidis N, Andersson Franko M, Karlgren K, Leanderson C, and Sandborgh-Englund G
Subjects: Sweden, Humans, Cohort Studies, Educational Measurement methods, Educational Measurement standards, Licensure, Dental, Male, Foreign Professional Personnel, Licensure, Female, Dentists, Clinical Competence standards
Abstract: Licensure exams play a crucial role in ensuring the competence of individuals entering a profession, thereby safeguarding the public and maintaining the quality and integrity of the profession. In Sweden, dentists educated outside the European Union seeking to practise dentistry must undergoa re-certification process. The re-certification process includes a theoretical examination where pass marks are set using a three-level Angoff method. This study aimed to determine the consistency of the Angoff ratings using independent panels at two Swedish universities. Two cohorts of panellists were included in the study: one reference and one external. The reference panel was responsible for rating the upcoming theoretical examinations in the proficiency test, which were used to set the pass mark. The external panel, recruited from a dental school at a university in another region in Sweden, provided ratings after the examinations. Three examinations during 2019-2020 were included in this study (267 items in total). There was a strong correlation (ρ ≥ 0.70, p < .001) between the ratings of the two independent panels, with no significant differences in item ratings across the full exams, dental disciplines, and professional qualifications analysed. This suggests that the three-level Angoff method reliably produces similar standards for assessing the competence of the minimally qualified dentist across different institutions. The expectations of the minimally qualified but still acceptable dentist were comparable between the two independent panels across the three theoretical examinations explored. The alignment between the panels indicates valid, reliable standards across institutions, despite the independent syllabi of the two study programmes. However, while there is an alignment, differences in ratings remain. Consequently, involving multiple institutions in future standard-setting processes could help ensure that the standards reflect a broader range of educational practices, supporting the credibility of licensure examinations., Competing Interests: The authors have declared that no competing interests exist., (Copyright: © 2024 Dalum et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
Published: 2024
Full Text: View/download PDF

9. Research on virtual reality-based assessment framework and application path in medical education.

Author: Wang Y, Li Y, Chen C, Zhang W, Wang Y, Sha K, and Wang S
Subjects: Humans, Female, Learning, Male, Adult, Educational Measurement methods, Virtual Reality, Education, Medical methods
Abstract: While virtual reality(VR) technology enhances learning, it also places new demands on medical learning evaluation. Verifying the occurrence of learning is a primary issue. To design and implement practical and feasible VR-based learning evaluation based on the immersive learning evaluation framework, the Substitution-Augmentation-Modification-Redefinition (SAMR) model, a VR-based learning evaluation framework, was constructed. This framework included competency, learning objectives, assessment tasks, evaluation data, criteria, and feedback. A comprehensive application pathway was developed, utilizing technological integration frameworks. This pathway includes the selection and implementation processes to offer teachers theoretical direction on evaluating medical learning using VR. Finally, this study performed a learning evaluation utilizing VR. The findings revealed that using VR for evaluation can create a deeply engaging and interactive environment. Participants reported feeling a strong sense of being present in the virtual environment and expressed high acceptance and satisfaction with the VR evaluation process. Furthermore, they believed that VR evaluation offers a comprehensive and practical means of assessing cognitive abilities and receiving feedback. These findings establish that VR evaluation optimise learning assessment and showcase the feasibility of the assessment framework and application path., Competing Interests: The authors have declared that no competing interests exist., (Copyright: © 2024 Wang et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.)
Published: 2024
Full Text: View/download PDF

10. ChatGPT-4 Omni Performance in USMLE Disciplines and Clinical Skills: Comparative Analysis.

Author: Bicknell BT, Butler D, Whalen S, Ricks J, Dixon CJ, Clark AB, Spaedy O, Skelton A, Edupuganti N, Dzubinski L, Tate H, Dyess G, Lindeman B, and Lehmann LS
Subjects: Humans, United States, Clinical Clerkship, Clinical Competence standards, Educational Measurement methods, Licensure, Medical standards
Abstract: Background: Recent studies, including those by the National Board of Medical Examiners, have highlighted the remarkable capabilities of recent large language models (LLMs) such as ChatGPT in passing the United States Medical Licensing Examination (USMLE). However, there is a gap in detailed analysis of LLM performance in specific medical content areas, thus limiting an assessment of their potential utility in medical education., Objective: This study aimed to assess and compare the accuracy of successive ChatGPT versions (GPT-3.5, GPT-4, and GPT-4 Omni) in USMLE disciplines, clinical clerkships, and the clinical skills of diagnostics and management., Methods: This study used 750 clinical vignette-based multiple-choice questions to characterize the performance of successive ChatGPT versions (ChatGPT 3.5 [GPT-3.5], ChatGPT 4 [GPT-4], and ChatGPT 4 Omni [GPT-4o]) across USMLE disciplines, clinical clerkships, and in clinical skills (diagnostics and management). Accuracy was assessed using a standardized protocol, with statistical analyses conducted to compare the models' performances., Results: GPT-4o achieved the highest accuracy across 750 multiple-choice questions at 90.4%, outperforming GPT-4 and GPT-3.5, which scored 81.1% and 60.0%, respectively. GPT-4o's highest performances were in social sciences (95.5%), behavioral and neuroscience (94.2%), and pharmacology (93.2%). In clinical skills, GPT-4o's diagnostic accuracy was 92.7% and management accuracy was 88.8%, significantly higher than its predecessors. Notably, both GPT-4o and GPT-4 significantly outperformed the medical student average accuracy of 59.3% (95% CI 58.3-60.3)., Conclusions: GPT-4o's performance in USMLE disciplines, clinical clerkships, and clinical skills indicates substantial improvements over its predecessors, suggesting significant potential for the use of this technology as an educational aid for medical students. These findings underscore the need for careful consideration when integrating LLMs into medical education, emphasizing the importance of structured curricula to guide their appropriate use and the need for ongoing critical analyses to ensure their reliability and effectiveness., (© Brenton T Bicknell, Danner Butler, Sydney Whalen, James Ricks, Cory J Dixon, Abigail B Clark, Olivia Spaedy, Adam Skelton, Neel Edupuganti, Lance Dzubinski, Hudson Tate, Garrett Dyess, Brenessa Lindeman, Lisa Soleymani Lehmann. Originally published in JMIR Medical Education (https://mededu.jmir.org).)
Published: 2024
Full Text: View/download PDF

11. Performance of Chatgpt in ophthalmology exam; human versus AI.

Author: Balci AS, Yazar Z, Ozturk BT, and Altan C
Subjects: Humans, Cross-Sectional Studies, Education, Medical, Graduate methods, Ophthalmology education, Internship and Residency, Educational Measurement methods, Clinical Competence
Abstract: Purpose: This cross-sectional study focuses on evaluating the success rate of ChatGPT in answering questions from the 'Resident Training Development Exam' and comparing these results with the performance of the ophthalmology residents., Methods: The 75 exam questions, across nine sections and three difficulty levels, were presented to ChatGPT. The responses and explanations were recorded. The readability and complexity of the explanations were analyzed and The Flesch Reading Ease (FRE) score (0-100) was recorded using the program named Readable. Residents were categorized into four groups based on their seniority. The overall and seniority-specific success rates of the residents were compared separately with ChatGPT., Results: Out of 69 questions, ChatGPT answered 37 correctly (53.62%). The highest success was in Lens and Cataract (77.77%), and the lowest in Pediatric Ophthalmology and Strabismus (0.00%). Of 789 residents, overall accuracy was 50.37%. Seniority-specific accuracy rates were 43.49%, 51.30%, 54.91%, and 60.05% for 1st to 4th-year residents. ChatGPT ranked 292nd among residents. Difficulty-wise, 11 questions were easy, 44 moderate, and 14 difficult. ChatGPT's accuracy for each level was 63.63%, 54.54%, and 42.85%, respectively. The average FRE score of responses generated by ChatGPT was found to be 27.56 ± 12.40., Conclusion: ChatGPT correctly answered 53.6% of questions in an exam for residents. ChatGPT has a lower success rate on average than a 3rd year resident. The readability of responses provided by ChatGPT is low, and they are difficult to understand. As difficulty increases, ChatGPT's success decreases. Predictably, these results will change with more information loaded into ChatGPT., (© 2024. The Author(s), under exclusive licence to Springer Nature B.V.)
Published: 2024
Full Text: View/download PDF

12. Students' performance in clinical class II composite restorations: a case study using analytic rubrics.

Author: Daghrery A, Alwadai GS, Alamoudi NA, Alqahtani SA, Alshehri FH, Al Wadei MH, Abogazalah NN, Pereira GKR, and Al Moaleem MM
Subjects: Humans, Male, Female, Sex Factors, Students, Dental, Dental Restoration, Permanent, Education, Dental standards, Composite Resins, Educational Measurement methods, Clinical Competence standards
Abstract: Background: The analytical rubric serves as a permanent reference for guidelines on clinical performance for undergraduate dental students. This study aims to assess the rubric system used to evaluate clinical class II composite restorations performed by undergraduate dental students and to explore the impact of gender on overall student performance across two academic years. Additionally, we investigated the relationship between cumulative grade point averages (CGPAs) and students' clinical performance., Methods: An analytical rubric for the assessment of clinical class II composite restoration in the academic years of 2022/2023 and 2023/2024 was used by two evaluators. These two evaluators were trained to use the rubric before doing the evaluations. The scores were based on a 4-point scale for the evaluation of five major parameters for pre-operative procedures (10 points), cavity preparation (20 points), restoration procedures (20 points), and time management (4 points). At the same time, chairside oral exam parameter was 15 points based on a 5-point scale. Descriptive statistics were calculated for the different analytical rubric parameters, and the independent t-test was used to compare the scores between the student groups and the evaluators. Other tests, such as the Kappa test and Pearson's correlation coefficient, were used to measure the association among CGPA, evaluators, and gender participants., Results: The overall score out of 69 slightly increased for females/males (61.28/59.42) and (61.18/59.49) in the 2022/2023 and 2023/2024 academic years, respectively, but the differences were not statistically significant. In the 2022/2023 academic year, female students scored significantly higher than male students in pre-operative procedures, as evaluated by both evaluators (p = 0.001), and in time management, as assessed by both evaluators (p = 0.031). The Kappa test demonstrated a moderate to substantial level of agreement between the two evaluators in both academic years. Strong and significant correlations were noted between students' CGPA and some tested parameters (p = 0.000)., Conclusion: The overall performance was very good and high among both genders, but it was marginally higher among females than among males. This study found some differences in performance between male and female students and variability in the evaluations by the two raters ranging from moderate to substantial agreement and similar performances for students with different CGPA., (© 2024. The Author(s).)
Published: 2024
Full Text: View/download PDF

13. ChatGPT as a teaching tool: Preparing pathology residents for board examination with AI-generated digestive system pathology tests.

Author: Laohawetwanit T, Apornvirat S, and Kantasiripitak C
Subjects: Humans, Artificial Intelligence, Pathology education, Digestive System pathology, Clinical Competence, Pathology, Clinical education, Internship and Residency, Educational Measurement methods
Abstract: Objectives: To evaluate the effectiveness of ChatGPT 4 in generating multiple-choice questions (MCQs) with explanations for pathology board examinations, specifically for digestive system pathology., Methods: The customized ChatGPT 4 model was developed for MCQ and explanation generation. Expert pathologists evaluated content accuracy and relevance. These MCQs were then administered to pathology residents, followed by an analysis focusing on question difficulty, accuracy, item discrimination, and internal consistency., Results: The customized ChatGPT 4 generated 80 MCQs covering various gastrointestinal and hepatobiliary topics. While the MCQs demonstrated moderate to high agreement in evaluation parameters such as content accuracy, clinical relevance, and overall quality, there were issues in cognitive level and distractor quality. The explanations were generally acceptable. Involving 9 residents with a median experience of 1 year, the average score was 57.4 (71.8%). Pairwise comparisons revealed a significant difference in performance between each year group (P < .01). The test analysis showed moderate difficulty, effective item discrimination (index = 0.15), and good internal consistency (Cronbach's α = 0.74)., Conclusions: ChatGPT 4 demonstrated significant potential as a supplementary educational tool in medical education, especially in generating MCQs with explanations similar to those seen in board examinations. While artificial intelligence-generated content was of high quality, it necessitated refinement and expert review., (© The Author(s) 2024. Published by Oxford University Press on behalf of American Society for Clinical Pathology. All rights reserved. For commercial re-use, please contact reprints@oup.com for reprints and translation rights for reprints. All other permissions can be obtained through our RightsLink service via the Permissions link on the article page on our site—for further information please contact journals.permissions@oup.com.)
Published: 2024
Full Text: View/download PDF

14. Musculoskeletal Clinical Online Cases With a Focus on Anatomy for Preclinical Learners.

Author: Robertson K, McNulty MA, Natoli RM, Stout J, and Ulrich G
Subjects: Humans, Anatomy education, Internet, Education, Distance methods, Education, Medical, Undergraduate methods, Educational Measurement methods, Musculoskeletal Diseases, Curriculum, Students, Medical statistics & numerical data
Abstract: Introduction: While musculoskeletal disorders are leading causes of medical visits, musculoskeletal education is underrepresented in US medical curricula. Previous studies have demonstrated that undergraduate medical students often fail to demonstrate competency surrounding musculoskeletal disorders. More educational content is needed to support musculoskeletal knowledge in learners., Methods: We developed an online, case-based musculoskeletal module for second-year medical students alongside their standard course material and presented clinical cases with multiple-choice question quizzes regarding the presentation, diagnosis, and anatomic correlation of musculoskeletal conditions. Cases, under 10 minutes each, targeted common, medically important areas of musculoskeletal health., Results: Grades in the required musculoskeletal course were significantly and positively correlated with online module quiz performance. 258 (73%) of 354 students completed at least one quiz, and students completed an average of 14 out of 15 quizzes. Learners who completed more than 50% of the quizzes performed significantly better in the course than those who completed fewer quizzes; this was true for a formative internal course exam ( p = .035), an NBME customized assessment ( p = .008), and the course overall ( p = .021). Additional analyses of students' perceptions revealed that students valued the self-directed online learning environment. The high completion rate (73%) for the online module also signaled student value in the content and format., Discussion: This module represents educational material that has been demonstrated to improve medical student musculoskeletal learning. Additionally, the module could be expanded to address inadequacies in orthopedic education among other students, such as allied health learners., (© 2024 Robertson et al.)
Published: 2024
Full Text: View/download PDF

15. How Certification Exams Reflect Current Practice.

Author: Myers TL, DeGarmo S, and Horahan M
Subjects: Humans, Adult, Male, Female, Middle Aged, Educational Measurement standards, Educational Measurement methods, United States, Specialties, Nursing standards, Specialties, Nursing education, Certification standards, Clinical Competence standards, Education, Nursing, Continuing standards
Abstract: Certification exams are often the least understood aspect of certification processes. This article delves into the development of certification exams, highlighting their crucial role in assessing the knowledge and skills required for effective job performance. It explores how exams are meticulously crafted to align with specific job tasks, knowledge areas, and skill sets within various nursing specialties. Through collaborative efforts between practicing nurses and exam developers, these exams are designed to uphold validity, reliability, and fairness standards. Ultimately, certification exams are pivotal in evaluating nurses' competencies in vital aspects of clinical practice, facilitated by ongoing partnerships between nurses and certifying bodies. [ J Contin Educ Nurs. 2024;55(11):513-516.] .
Published: 2024
Full Text: View/download PDF

16. You Get What You Reward: A Qualitative Study Exploring Medical Student Engagement in 2 Different Assessment Systems.

Author: Jauregui J, McClintock AH, Schrepel C, Fainstad T, Bierer SB, and Heeneman S
Subjects: Humans, Cross-Sectional Studies, Clinical Clerkship, Male, Female, Reward, Grounded Theory, Clinical Competence, Students, Medical psychology, Qualitative Research, Educational Measurement methods, Education, Medical, Undergraduate methods
Abstract: Purpose: Educational impact is dependent on student engagement. Assessment design can provide a scaffold for student engagement to determine the focus of student efforts. Little is known about how medical students engage with assessment. Therefore, we asked the following research question: How do medical students engage with the process of assessment and their assessment data in 2 clinical assessment systems?, Method: This multi-institutional, cross-sectional constructivist grounded theory study of fourth-year undergraduate medical students at the University of Washington and Cleveland Clinic Lerner College of Medicine assessed 2 different assessment systems: traditional tiered grading, in which clerkship grades were summative, and programmatic assessment, in which students received low-stake, narrative feedback across clerkships with progress based on aggregated performance data in student portfolios. All fourth-year students were invited to participate in one-on-one semistructured interviews guided by student engagement theory between September 2022 and January 2023. Verbatim transcripts underwent iterative, qualitative analysis., Results: Twenty-two medical students were interviewed, 13 from a traditional grading assessment system and 9 from a programmatic assessment system. Three major ways in which assessment systems affected how students engaged with their assessments were categorized into the affective, cognitive, and behavioral domains of engagement: as a sociocultural statement of value, as the cognitive load associated with the assessment system and practices themselves, and as the locus of power and control in learning and authentic practice., Conclusions: Medical students' beliefs about assessment goals, cognitive burden of assessment, and relationships with others significantly affected their engagement with their assessments. In assessment systems that reward grading and an archetypal way of being, students report engaging by prioritizing image over learning. In programmatic assessment systems, students describe more fully and authentically engaging in their assessment for and as learning. Systems of assessment communicate what is rewarded, and you get what you reward., (Copyright © 2024 the Association of American Medical Colleges.)
Published: 2024
Full Text: View/download PDF

17. Constructing tests for skill assessment with competence-based test development.

Author: Anselmi P, Heller J, Stefanutti L, and Robusto E
Subjects: Humans, Psychometrics methods, Models, Statistical, Reproducibility of Results, Educational Measurement methods, Educational Measurement statistics & numerical data
Abstract: Competence-based test development is a recent and innovative method for the construction of tests that are as informative as possible about the competence state (the set of skills an individual has available) underlying the observed item responses. It finds application in different contexts, including the development of tests from scratch, and the improvement or shortening of existing tests. Given a fixed collection of competence states existing in a population of individuals and a fixed collection of competencies (each of which being the subset of skills that allow for solving an item), the competency deletion procedure results in tests that differ from each other in the competencies but are all equally informative about individuals' competence states. This work introduces a streamlined version of the competency deletion procedure that considers information necessary for test construction only, illustrates a straightforward way to incorporate test developer preferences about competencies into the test construction process, and evaluates the performance of the resulting tests in uncovering the competence states from the observed item responses., (© 2024 British Psychological Society.)
Published: 2024
Full Text: View/download PDF

18. Teaching and Assessing Pharmacy Students in Medication-Use Process Stewardship.

Author: Lebovitz L, Ives AL, and Brownlee SP
Subjects: Humans, Educational Measurement methods, Teaching, Pharmacists, Clinical Competence standards, Education, Pharmacy methods, Students, Pharmacy, Curriculum
Abstract: Objectives: The medication use system is a complex process of medication prescribing, order processing, dispensing, administration, and effects monitoring. The objectives of this review are to describe the available literature and identify resources for educating and assessing pharmacy students in COEPA (Curriculum Outcomes and Entrustable Professional Activities) 2.6 Medication-use Process Stewardship., Findings: In 2013, the Center for the Advancement of Pharmacy Education (CAPE) published educational outcomes, which included Medication Use Systems Management (CAPE 2.2, Manager). In 2022, the educational outcomes for pharmacy education were updated and integrated with entrustable professional activities as COEPA. During this evolution, the revised Medication-use Process Stewardship (COEPA 2.6, Steward) de-emphasized process management while focusing more on stewardship of the Pharmacists' Patient Care Process, person-centered care, optimizing patient outcomes, and the environmental impact of medication-use systems. A literature review identified 41 articles relevant to pharmacy education and assessment of medication-use concepts. Most available literature is aligned with CAPE 2013 domain 2.2 Manager, not COEPA 2.6 Steward. Many articles reported innovations in teaching and assessment, such as simulation in prescription verification and objective structured clinical examinations. Few articles reported on prescription verification and dispensing in noncommunity settings., Summary: The change from management to stewardship in COEPA 2.6 has significant curricular implications, with the emphasis moving from process- to person-centered care. However, continued integration of process-centered activities throughout the curriculum is essential to fully prepare graduates for entry-level practice. Future research is needed to identify approaches for teaching and assessing stewardship and the environmental impact of medication use systems., Competing Interests: Declaration of Competing Interest None declared., (Copyright © 2024 American Association of Colleges of Pharmacy. Published by Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

19. Suitability of GPT-4o as an evaluator of cardiopulmonary resuscitation skills examinations.

Author: Wang L, Mao Y, Wang L, Sun Y, Song J, and Zhang Y
Subjects: Humans, Reproducibility of Results, Male, Female, Video Recording, Cardiopulmonary Resuscitation education, Cardiopulmonary Resuscitation methods, Cardiopulmonary Resuscitation standards, Clinical Competence, Educational Measurement methods
Abstract: Aim: To assess the accuracy and reliability of GPT-4o for scoring examinees' performance on cardiopulmonary resuscitation (CPR) skills tests., Methods: This study included six experts certified to supervise the national medical licensing examination (three junior and three senior) who reviewed the CPR skills test videos across 103 examinees. All videos reviewed by the experts were subjected to automated assessment by GPT-4o. Both the experts and GPT-4o scored the videos across four sections: patient assessment, chest compressions, rescue breathing, and repeated operations. The experts subsequently rated GPT-4o's reliability on a 5-point Likert scale (1, completely unreliable; 5, completely reliable). GPT-4o's accuracy was evaluated using the intraclass correlation coefficient (for the first three sections) and Fleiss' Kappa (for the last section) to assess the agreement between its scores vs. those of the experts., Results: The mean accuracy scores for the patient assessment, chest compressions, rescue breathing, and repeated operation sections were 0.65, 0.58, 0.60, and 0.31, respectively, when comparing the GPT-4o's vs. junior experts' scores and 0.75, 0.65, 0.72, and 0.41, respectively, when comparing the GPT-4o's vs. senior experts' scores. For reliability, the median Likert scale scores were 4.00 (interquartile range [IQR] = 3.66-4.33, mean [standard deviation] = 3.95 [0.55]) and 4.33 (4.00-4.67, 4.29 [0.50]) for the junior and senior experts, respectively., Conclusions: GPT-4o demonstrated a level of accuracy that was similar to that of senior experts in examining CPR skills examination videos. The results demonstrate the potential for deploying this large language model in medical examination settings., Competing Interests: Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper., (Copyright © 2024 The Author(s). Published by Elsevier B.V. All rights reserved.)
Published: 2024
Full Text: View/download PDF

20. Redesigned Entrustable Professional Activity (EPA) Assessments Reduce Grade Inflation in the Experiential Setting.

Author: Fuller K, Pinelli NR, and Persky AM
Subjects: Humans, Retrospective Studies, Competency-Based Education methods, Problem-Based Learning, Educational Measurement methods, Educational Measurement standards, Students, Pharmacy, Education, Pharmacy methods, Clinical Competence standards, Preceptorship
Abstract: Objective: This study aims to evaluate the impact of redesigning an entrustable professional activities (EPAs) assessment tool on the accuracy of student performance assessment within pharmacy education., Methods: The study used retrospective programmatic data for students on clinical rotations over a 3-year period and compared entrustment levels assigned by preceptors with suggested entrustment levels. This tool was redesigned to separate formative EPA feedback from final grade determination. Data were analyzed using chi-squared tests to identify trends in students ABOVE, AT, or BELOW the suggested entrustment levels. Additionally, to account for intercohort variability, the relationship between students ABOVE the suggested level of entrustment and postgraduate metrics was examined., Results: After the implementation of the revised tool, there was a significant decrease (-3%) in the percentage of students scoring ABOVE the suggested entrustment levels and an increase in the percentage of students scoring AT (+1%) or BELOW (+2%) the suggested entrustment levels. Changes were also observed in individual patient care settings, with a decrease in grade inflation and an increase in accurate assessments. North American Pharmacist Licensure Examination (NAPLEX) pass rates, residency match rates, and grade point average did not correlate with entrustment levels., Conclusion: The redesigned EPA assessment tool demonstrated a decrease in grade inflation resulting in more accurate assessments. The tool's focus on holistic grading and narrative descriptors contributed to better alignment between preceptor assessment and school-suggested achievement levels. This study suggests that EPA assessments in pharmacy education could benefit from a stronger emphasis on formative feedback and the use of holistic assessment methods for final grade determinations. The findings underscore the potential advantages of considering a separation between EPA scoring and final grades, prompting the Academy to explore their assessment practices to better reflect student performance in clinical experiences., Competing Interests: Declaration of Competing Interest None declared., (Copyright © 2024 American Association of Colleges of Pharmacy. Published by Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

21. Large Language Models and the North American Pharmacist Licensure Examination (NAPLEX) Practice Questions.

Author: Ehlert A, Ehlert B, Cao B, and Morbitzer K
Subjects: Humans, Language, North America, Licensure, Pharmacy, Educational Measurement standards, Educational Measurement methods, Pharmacists standards, Education, Pharmacy methods
Abstract: Objective: This study aims to test the accuracy of large language models (LLMs) in answering standardized pharmacy examination practice questions., Methods: The performance of 3 LLMs (generative pretrained transformer [GPT]-3.5, GPT-4, and Chatsonic) was evaluated on 2 independent North American Pharmacist Licensure Examination practice question sets sourced from McGraw Hill and RxPrep. These question sets were further classified into binary question categories of adverse drug reaction (ADR) questions, scenario questions, treatment questions, and select-all questions. Python was used to run χ 2 tests to compare model and question-type accuracy., Results: Of the 3 LLMs tested, GPT-4 achieved the highest accuracy, with 87% accuracy on the McGraw Hill question set and 83.5% accuracy on the RxPrep question set. In comparison, GPT-3.5 had 68.0% and 60.0% accuracy on those question sets, respectively, and Chatsonic had 60.5% and 62.5% accuracy on those question sets, respectively. All models performed worse on select-all questions compared with non-select-all questions (GPT-3: 42.3% vs 66.2%; GPT-4: 73.1 vs 87.2%; Chatsonic: 36.5% vs 71.6%). GPT-4 had statistically higher accuracy in answering ADR questions (96.1%) compared with non-ADR questions (83.9%)., Conclusion: Our study found that GPT-4 outperformed GPT-3.5 and Chatsonic in answering North American Pharmacist Licensure Examination pharmacy licensure examination practice questions, particularly excelling in answering questions related to ADRs. These results suggest that advanced LLMs such as GPT-4 could be used for applications in pharmacy education., Competing Interests: Declaration of Competing Interest A.E. reports a relationship with National Institute of Aging that includes: funding grants. The remaining authors declare no competing interests., (Copyright © 2024 American Association of Colleges of Pharmacy. Published by Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

22. Validation of Checklists and Evaluation of Clinical Skills in Cases of Abdominal Pain With Simulation in Formative, Objective, Structured Clinical Examination With Audiovisual Content in Third-Year Medical Students' Surgical Clerkship.

Author: Ruiz-Manzanera JJ, Almela-Baeza J, Aliaga A, Ádanez G, Alconchel F, Rodríguez JM, Sánchez-Bueno F, Ramírez P, and Febrero B
Subjects: Humans, General Surgery education, Education, Medical, Undergraduate methods, Simulation Training methods, Female, Educational Measurement methods, Male, Students, Medical statistics & numerical data, Patient Simulation, Checklist, Clinical Competence, Clinical Clerkship, Abdominal Pain diagnosis
Abstract: Objectives: The objective of this study was to develop and validate 6 checklists for evaluating abdominal pain in clinical simulation scenarios; to assess student competencies in managing 6 clinical cases using OSCE, based on faculty evaluations; and to analyze discrepancies between faculty and student evaluations., Design: A practical workshop was designed to address 6 clinical scenarios of abdominal surgical conditions. Four scenarios employed medium fidelity simulators, while 2 scenarios employed standardized patient methodology. Prior to the workshop, students received theoretical audiovisual material. At the conclusion of the workshop, students were evaluated using checklists that assessed communication, privacy, anamnesis, and technical skills. Ten workshops were conducted over 3 years, using the OSCE (Objective Structured Clinical Examination) format for evaluation., Setting: In the statistical analysis, t-Student tests or ANOVA were employed to ascertain whether there were any significant differences between the groups. In the process of validating checklists for clinical scenarios, 6 experts were asked to evaluate each item on a scale of 1 to 9. To assess the degree of agreement among experts, the intraclass correlation coefficient (ICC) was employed., Participants: The study involved a total of 670 third-year medical students from the University of Murcia (UMU), Spain, who participated in the subject "Medical-Surgical Skills.", Results: High levels of appropriateness were observed for the checklist items, with mean scores above 7.5 points, as well as high levels of inter-expert agreement. Students obtained a mean score of 8 points in the evaluation of each clinical scenario. No significant differences were found between faculty and student scores (p < 0.05)., Conclusions: The learning method focused on clinical scenarios of abdominal surgical diseases effectively enhanced the clinical skills of third-year medical students. It used pre-existing audiovisual materials, hands-on workshops with medium-fidelity simulators, and standardized patients. Consistent evaluations from students and faculty confirmed the efficacy of these strategies., (Copyright © 2024 The Author(s). Published by Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

23. Evaluating the Validity of National Multiassessment System in Postgraduate Surgical Training: A Retrospective Cohort Study.

Author: O'Keeffe DA, Traynor O, Tekian A, and Park YS
Subjects: Retrospective Studies, Humans, Reproducibility of Results, Female, Cohort Studies, Male, United States, Education, Medical, Graduate methods, General Surgery education, Internship and Residency, Educational Measurement methods, Clinical Competence
Abstract: Objective: The objective of this study was to evaluate the validity evidence supporting the use and interpretation of a multifaceted assessment system in the early years of surgical training., Design: This was a national retrospective cohort study analyzing the validity and reliability of an assessment process for surgical residents over a 2-year period. Data from all elements of the assessment process was evaluated using Messick's unified validity framework. Assessments were categorized as Workplace-based, Structured assessment performed in the academic center and Multiple Mini Interview., Setting: Our Institution is a health sciences university and the body responsible for the training and certification of all surgeons in our national program. Residents on the Core Surgical Training program undergo multiple assessments over the first 2 years of postgraduate training, both in the workplace and the academic training center, which inform their progression into higher surgical training in their chosen specialty., Participants: Data was collected from 2 cohorts of the entire population of postgraduate trainees nationally (N = 114)., Results: Best practice standards for educational testing aligned with the results supporting the use of this assessment process. Findings indicate a robust assessment system, demonstrating validity evidence in content, response process, internal structure, relations to other variables, and consequences. Composite score reliability of the assessment was 0.89 which demonstrates a highly reliable process. Correlation between workplace-based assessments and standardized tests performed in the simulation setting was also very high (0.93)., Conclusions: The Core Training Assessment System (CTAS) provides a psychometrically rigorous system to measure trainee competence during the initial years of training. Residency programs of all sizes can replicate the methods described here to demonstrate the validity of their assessment processes, thereby being able to stand over decisions on surgical competency., (Copyright © 2024 The Authors. Published by Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

24. Encouragement vs. liability: How prompt engineering influences ChatGPT-4's radiology exam performance.

Author: Nguyen D, MacKenzie A, and Kim YH
Subjects: Humans, Educational Measurement methods, Clinical Competence, United States, Education, Medical, Graduate, Radiology education, Internship and Residency
Abstract: Large Language Models (LLM) like ChatGPT-4 hold significant promise in medical application, especially in the field of radiology. While previous studies have shown the promise of ChatGTP-4 in textual-based scenarios, its performance on image-based response remains suboptimal. This study investigates the impact of prompt engineering on ChatGPT-4's accuracy on the 2022 American College of Radiology In Training Test Questions for Diagnostic Radiology Residents that include textual and visual-based questions. Four personas were created, each with unique prompts, and evaluated using ChatGPT-4. Results indicate that encouraging prompts and those disclaiming responsibility led to higher overall accuracy (number of questions answered correctly) compared to other personas. Personas that threaten the LLM with legal action or mounting clinical responsibility were not only found to score less, but also refrain of answering questions at a higher rate. These findings highlight the importance of prompt context in optimizing LLM responses and the need for further research to integrate AI responsibly into medical practice., Competing Interests: Declaration of competing interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper., (Copyright © 2024 Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

25. The Effect of Numeric Versus Pass/Fail USMLE Step 1 Scores in the Integrated Plastic Surgery 2023-2024 Match Cycle: A Single Institution Study.

Author: Godbe KN, Sinik LM, Nazir N, Egan K, Mathes D, Butterworth J, and Farmer R
Subjects: Humans, United States, Licensure, Medical, Female, Male, Education, Medical, Graduate methods, Adult, Surgery, Plastic education, Internship and Residency, Educational Measurement methods
Abstract: Objective: The USMLE Step 1 exam, an important metric in the integrated plastic surgery match, transitioned to pass/fail scoring in January 2022. No previous studies have investigated the impact of this new scoring system on the process of ranking applicants in the integrated plastic surgery match., Design: 330 Plastic Surgery Common Applications (PSCAs) were submitted to a single academic center in the 2023-2024 match cycle. Applicants were sorted into tiers via a holistic review process, and quantifiable data, including USMLE Step 1 scores, were then compared between tiers., Setting: Our Institution's Integrated Plastic Surgery Residency Program., Participants: Integrated Plastic Surgery applicants in the 2023-2024 match cycle., Results: 317 of 330 PSCAs were analyzed in this study, excluding applicants who did an elective rotation at our institution. Applicants were sorted into 3 tiers: high (n = 100), middle (n = 118) and low (n = 99), with a significant difference in match rate per tier, respectively (88.0%, 58.5%, 30.3%, p < 0.0001). The majority of USMLE Step 1 scores were reported as pass/fail (186/317, 58.7%). There was a significant difference (p < 0.0001) between the average USMLE Step 1 score between the high (mean 250.5, SD 10.4), middle (mean 241, SD 14.6), and low tiers (mean 235.5, SD 16.5). More applicants in the low tier (50%) and high tier (40%) reported numeric USMLE Step 1 scores than those in the middle tier (35%, p = 0.0734). Stepwise logistic regression revealed USMLE Step 1 score to be an independent predictor of tier placement between the high and middle tier (p = 0.0030) and high and low tier (p = 0.0001). Lastly, 3 applicants reported their USMLE Step 1 score as 'pass' instead of their given numeric score., Conclusions: Comparing applicants with numeric USMLE Step 1 scores to those with pass/fail scores can have a significant impact on the ranking of those applicants and should be carefully considered during the plastic surgery match process., (Copyright © 2024 The Authors. Published by Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

26. Artificial Intelligence in Orthopaedics: Performance of ChatGPT on Text and Image Questions on a Complete AAOS Orthopaedic In-Training Examination (OITE).

Author: Hayes DS, Foster BK, Makar G, Manzar S, Ozdag Y, Shultz M, Klena JC, and Grandizio LC
Subjects: Humans, Education, Medical, Graduate methods, Clinical Competence, Artificial Intelligence, Orthopedics education, Educational Measurement methods, Internship and Residency methods
Abstract: Objective: Artificial intelligence (AI) is capable of answering complex medical examination questions, offering the potential to revolutionize medical education and healthcare delivery. In this study we aimed to assess ChatGPT, a model that has demonstrated exceptional performance on standardized exams. Specifically, our focus was on evaluating ChatGPT's performance on the complete 2019 Orthopaedic In-Training Examination (OITE), including questions with an image component. Furthermore, we explored difference in performance when questions varied by text only or text with an associated image, including whether the image was described using AI or a trained orthopaedist., Design and Setting: Questions from the 2019 OITE were input into ChatGPT version 4.0 (GPT-4) using 3 response variants. As the capacity to input or interpret images is not publicly available in ChatGPT at the time of this study, questions with an image component were described and added to the OITE question using descriptions generated by Microsoft Azure AI Vision Studio or authors of the study., Results: ChatGPT performed equally on OITE questions with or without imaging components, with an average correct answer choice of 49% and 48% across all 3 input methods. Performance dropped by 6% when using image descriptions generated by AI. When using single answer multiple-choice input methods, ChatGPT performed nearly double the rate of random guessing, answering 49% of questions correctly. The performance of ChatGPT was worse than all resident classes on the 2019 exam, scoring 4% lower than PGY-1 residents., Discussion: ChatGT performed below all resident classes on the 2019 OITE. Performance on text only questions and questions with images was nearly equal if the image was described by a trained orthopaedic specialist but decreased when using an AI generated description. Recognizing the performance abilities of AI software may provide insight into the current and future applications of this technology into medical education., (Copyright © 2024 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

27. Surgery 360° Assessment Tool: Informing the Clinical Competency Committee with Nonfaculty Feedback on the "Soft Skills".

Author: Paterson CK, Anis O, Carter H, Augustine R, Josloff R, Kirton O, and Noonan KM
Subjects: Humans, Educational Measurement methods, Feedback, Education, Medical, Graduate methods, Internship and Residency, Clinical Competence, General Surgery education
Published: 2024
Full Text: View/download PDF

28. Extrapolative Validity Evidence of the Anastomosis Objective Structured Assessment of Technical Skill (A-OSATS) for Robotic Ileocolic Anastomosis.

Author: Brocke TK, Fox C, Clanahan JM, Klos CL, Chapman WC Jr, Wise PE, Awad MM, and Ohman KA
Subjects: Animals, Swine, Humans, Educational Measurement methods, General Surgery education, Education, Medical, Graduate methods, Simulation Training methods, Reproducibility of Results, Male, Female, Anastomosis, Surgical education, Clinical Competence, Internship and Residency methods, Ileum surgery, Colon surgery, Robotic Surgical Procedures education
Abstract: Objective: To collect validity evidence for the use of the Anastomosis Objective Structured Assessment of Technical Skills (A-OSATS) instrument, which has been developed to evaluate performance of a minimally invasive side-to-side bowel anastomosis with hand-sewn common enterotomy., Design: Residents performed a robotic ileocolic anastomosis simulation on an ex vivo porcine model. Faculty scored each resident with the A-OSATS and performed a provocative leak test on the completed anastomoses. Residents were reassessed on the sewing sub-score 1 month later. Data were compared with parametric and nonparametric analysis., Setting: Single academic general surgery residency PARTICIPANTS: PGY-4 and -5 general surgery residents (n = 17) RESULTS: PGY-5s performed better than PGY-4s in repeat A-OSATS sewing sub-score (mean 55/55 ± 0 vs 43 ± 4.9, p < 0.001) and time to complete (minutes, mean 14.5 ± 4.9 vs 21.2 ± 3.9, p = 0.01). There was a strong correlation between A-OSATS score and time (r = -0.67, p = 0.005). For the initial assessment, there was no significant difference in mean A-OSATS score between anastomoses that leaked and those that did not leak (137.3 ± 14.5 vs 150.1 ± 11.2, p = 0.098), but on repeat assessment, intact anastomoses had a higher mean A-OSATS sewing sub-score than those that leaked (52.2 ± 4.7 vs 39 ± 3.5, p = 0.007). There was no significant difference between initial A-OSATS score and repeat score (p = 0.14)., Conclusions: We provide extrapolative validity evidence for the A-OSATS instrument by comparing A-OSATS score to time to sew, provocative leak test, and discrimination between PGY-4s and PGY-5s. Generalizability validity evidence is provided by test-retest reliability. Further refinement is needed for the A-OSATS tool to be used for high-stakes entrustment decisions in resident-performed robotic ileocolic anastomoses., (Copyright © 2024 The Authors. Published by Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

29. Prompt engineering to increase GPT3.5's performance on the Plastic Surgery In-Service Exams.

Author: Nahass GR, Chin SW, Scharf IM, Kazmouz S, Kaplan N, Chiu R, Yang K, Bou Zeid N, Corcoran J, and Alkureishi LWT
Subjects: Humans, Internship and Residency, Surgery, Plastic, Educational Measurement methods, Clinical Competence
Abstract: This study assesses ChatGPT's (GPT-3.5) performance on the 2021 ASPS Plastic Surgery In-Service Examination using prompt modifications and Retrieval Augmented Generation (RAG). ChatGPT was instructed to act as a "resident," "attending," or "medical student," and RAG utilized a curated vector database for context. Results showed no significant improvement, with the "resident" prompt yielding the highest accuracy at 54%, and RAG failing to enhance performance, with accuracy remaining at 54.3%. Despite appropriate reasoning when correct, ChatGPT's overall performance fell in the 10th percentile, indicating the need for fine-tuning and more sophisticated approaches to improve AI's utility in complex medical tasks., (Copyright © 2024 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All rights reserved.)
Published: 2024
Full Text: View/download PDF

30. Patient Feedback Applied in Undergraduate Dental Education for Individual Student Development and Assessment: A Scoping Review.

Author: Bateman HL and McCracken GI
Subjects: Humans, Feedback, Patient Participation, Educational Measurement methods, Education, Dental methods, Students, Dental psychology
Abstract: Introduction: Active involvement of patients in healthcare professional education is well established, taking a variety of forms. There is a steer towards patient feedback informing the development of dental students and while there is recognition of its potential value to individual students, challenges exist related to collection and use. What is unclear is, within a dental education setting, the extent and use of patient feedback to individual students. A scoping review was conducted to assess and map the volume and characteristics of the research/literature in this area., Methods: Systematic searches of bibliographic databases Ovid MEDLINE(R), Scopus, ERIC and Embase were conducted, and wider literature (Google Scholar) was searched. Screening was conducted based on eligibility criteria and a customised data charting form was used., Results: The electronic and citation tracking searches identified 1021 studies. After duplicates were removed, 778 studies were screened by title and abstract, and 718 studies were found to be irrelevant to the current review. Sixty full-text studies were assessed for eligibility, 46 studies were excluded, and 14 studies were included for data charting., Conclusion: This review has identified that patient feedback has been captured through both simulated and real patient encounters. There was a bias towards feedback generated through simulated patient encounters. Feedback was reported to support the development of a range of skills, most frequently communication and patient management. Challenges that were identified by researchers related to staff/student engagement and available resources., (© 2024 The Author(s). European Journal of Dental Education published by John Wiley & Sons Ltd.)
Published: 2024
Full Text: View/download PDF

31. Using Performance Improvement Methods to Evaluate Processes for Writing Multiple-Choice Test Questions in the Postlicensure Clinical Environment: A Case Study.

Author: Davidson JE, Kalinowski A, Makhija H, Schneid SD, and Mandel J
Subjects: Humans, Male, Adult, Female, Middle Aged, Writing standards, Clinical Competence standards, Educational Measurement methods, Educational Measurement standards, Education, Nursing, Continuing
Abstract: Background: This article is the last of a four-part series to guide educators on the construction and evaluation of multiple-choice test items in the post-licensure environment. Previous articles in this series described the problem and the mechanics of test item construction and evaluation., Method: A replicable strategy for evaluating the organizational process for constructing multiple-choice test questions is provided. Steps taken to create change are described; work tools are provided., Results: Guidance and training are needed to create multiple-choice test questions. Many educators have not had training in item construction. Educators welcomed training. Personalized mentorship resulted in improvement. Asynchronous learning alone was helpful and well received and improved self-perceived knowledge, yet fell short of achieving competence., Conclusion: Voluntary training may not be adequate to assure enculturation of best practices without accountability standards and monitoring. Future research is indicated to assess the situation and provide national standards for adoption within health care organizations. [ J Contin Educ Nurs. 2024;55(11):535-542.] .
Published: 2024
Full Text: View/download PDF

32. Enhancing telehealth Objective Structured Clinical Examination fidelity with integrated Electronic Health Record simulation.

Author: Malhotra K, Beltran CP, Robak MJ, and Genes N
Subjects: Humans, Clinical Competence standards, Educational Measurement methods, Educational Measurement standards, Electronic Health Records, Telemedicine
Published: 2024
Full Text: View/download PDF

33. Longitudinal 'oral board exams' to maintain educational continuity.

Author: Buhl LK and Vengalil M
Subjects: Humans, Specialty Boards, Clinical Competence, Educational Measurement methods
Published: 2024
Full Text: View/download PDF

34. Alexa, write my exam: ChatGPT for MCQ creation.

Author: Schneid SD, Armour C, Evans S, and Brandl K
Subjects: Humans, Writing, Education, Medical, Educational Measurement methods
Published: 2024
Full Text: View/download PDF

35. Radiographical diagnostic competences of dental students using various feedback methods and integrating an artificial intelligence application-A randomized clinical trial.

Author: Rampf S, Gehrig H, Möltner A, Fischer MR, Schwendicke F, and Huth KC
Subjects: Humans, Periapical Periodontitis diagnostic imaging, Formative Feedback, Female, Male, Feedback, Radiography, Bitewing, Educational Measurement methods, Education, Dental methods, Clinical Competence, Students, Dental, Dental Caries diagnostic imaging, Dental Caries diagnosis, Artificial Intelligence, Radiography, Dental methods
Abstract: Introduction: Radiographic diagnostic competences are a primary focus of dental education. This study assessed two feedback methods to enhance learning outcomes and explored the feasibility of artificial intelligence (AI) to support education., Materials and Methods: Fourth-year dental students had access to 16 virtual radiological example cases for 8 weeks. They were randomly assigned to either elaborated feedback (eF) or knowledge of results feedback (KOR) based on expert consensus. Students´ diagnostic competences were tested on bitewing/periapical radiographs for detection of caries, apical periodontitis, accuracy for all radiological findings and image quality. We additionally assessed the accuracy of an AI system (dentalXrai Pro 3.0), where applicable. Data were analysed descriptively and using ROC analysis (accuracy, sensitivity, specificity, AUC). Groups were compared with Welch's t-test., Results: Among 55 students, the eF group by large performed significantly better than the KOR group in detecting enamel caries (accuracy 0.840 ± 0.041, p = .196; sensitivity 0.638 ± 0.204, p = .037; specificity 0.859 ± 0.050, p = .410; ROC AUC 0.748 ± 0.094, p = .020), apical periodontitis (accuracy 0.813 ± 0.095, p = .011; sensitivity 0.476 ± 0.230, p = .003; specificity 0.914 ± 0.108, p = .292; ROC AUC 0.695 ± 0.123, p = .001) and in assessing the image quality of periapical images (p = .031). No significant differences were observed for the other outcomes. The AI showed almost perfect diagnostic performance (enamel caries: accuracy 0.964, sensitivity 0.857, specificity 0.074; dentin caries: accuracy 0.988, sensitivity 0.941, specificity 1.0; overall: accuracy 0.976, sensitivity 0.958, specificity 0.983)., Conclusion: Elaborated feedback can improve student's radiographic diagnostic competences, particularly in detecting enamel caries and apical periodontitis. Using an AI may constitute an alternative to expert labelling of radiographs., (© 2024 The Author(s). European Journal of Dental Education published by John Wiley & Sons Ltd.)
Published: 2024
Full Text: View/download PDF

36. A two-step item bank calibration strategy based on 1-bit matrix completion for small-scale computerized adaptive testing.

Author: Shen Y, Wang S, and Xiao H
Subjects: Humans, Calibration, Models, Statistical, Sample Size, Data Interpretation, Statistical, Educational Measurement statistics & numerical data, Educational Measurement methods, Computer Simulation, Psychometrics statistics & numerical data
Abstract: Computerized adaptive testing (CAT) is a widely embraced approach for delivering personalized educational assessments, tailoring each test to the real-time performance of individual examinees. Despite its potential advantages, CAT�s application in small-scale assessments has been limited due to the complexities associated with calibrating the item bank using sparse response data and small sample sizes. This study addresses these challenges by developing a two-step item bank calibration strategy that leverages the 1-bit matrix completion method in conjunction with two distinct incomplete pretesting designs. We introduce two novel 1-bit matrix completion-based imputation methods specifically designed to tackle the issues associated with item calibration in the presence of sparse response data and limited sample sizes. To demonstrate the effectiveness of these approaches, we conduct a comparative assessment against several established item parameter estimation methods capable of handling missing data. This evaluation is carried out through two sets of simulation studies, each featuring different pretesting designs, item bank structures, and sample sizes. Furthermore, we illustrate the practical application of the methods investigated, using empirical data collected from small-scale assessments., (© 2024 The Authors. British Journal of Mathematical and Statistical Psychology published by John Wiley & Sons Ltd on behalf of British Psychological Society.)
Published: 2024
Full Text: View/download PDF

37. Evaluation of jigsaw collaborative learning strategy on students' learning of clinical pharmacokinetics of special populations.

Author: Chng HT, Ng HY, Teo Z, Liew SD, and Gan MJS
Subjects: Humans, Surveys and Questionnaires, Pharmacokinetics, Education, Pharmacy methods, Education, Pharmacy standards, Education, Pharmacy statistics & numerical data, Cooperative Behavior, Learning, Male, Curriculum standards, Curriculum trends, Female, Problem-Based Learning methods, Problem-Based Learning standards, Teaching standards, Teaching statistics & numerical data, Educational Measurement methods, Educational Measurement statistics & numerical data, Students, Pharmacy statistics & numerical data, Students, Pharmacy psychology
Abstract: Objectives: To evaluate the learning gain and students' perceptions towards Jigsaw collaborative learning in comparison with lectures in learning about pharmacokinetic changes in special populations., Methods: Undergraduates learn about A-D-M-E of specific populations via Jigsaw collaborative learning and didactic lectures. Pre- and post-lesson quizzes were conducted to evaluate the effectiveness of the teaching method in terms of knowledge gain. Surveys comprising Likert scale statements and open-ended questions were conducted to elucidate students' perception towards the teaching methods., Results: From a class of 192 students, 118 (62%) and 110 (57%) students completed the pre- and post-lecture quizzes, respectively, while 176 (92%) and 168 (88%) students completed the pre- and post-Home Group discussion of Jigsaw quizzes, respectively. There was an improvement of 22.2% and 14.3% in median percentage quiz scores for the lecture and Jigsaw method respectively. Most students agreed that they have learned (54-60%) and collaborated (78-89%) through the Jigsaw method and rated Jigsaw as useful for their learning (54%). Open-ended survey responses offered a mixed conclusion where the didactic lecture was perceived to be as, or more effective than the Jigsaw method., Conclusion: Learning gains were observed through the Jigsaw collaborative learning method which relied solely on peer-teaching, despite students perceiving it to be not as effective as lecture. The method provided opportunities for active and peer-learning. Further studies are needed to evaluate the long-term effects of this teaching method., Competing Interests: Declaration of competing interest The authors declare that there is no conflict of interest., (Copyright © 2024 Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

38. Comparison of peer, self, and faculty objective structured clinical examination evaluations in a PharmD nonprescription therapeutics course.

Author: Bowers RD, Baker CN, Becker KK, Hamilton JN, and Trotta K
Subjects: Humans, Students, Pharmacy statistics & numerical data, Students, Pharmacy psychology, Peer Group, Reproducibility of Results, Clinical Competence standards, Clinical Competence statistics & numerical data, Education, Pharmacy methods, Education, Pharmacy standards, Education, Pharmacy statistics & numerical data, Faculty, Pharmacy statistics & numerical data, Male, Self-Assessment, Female, Educational Measurement methods, Educational Measurement statistics & numerical data, Educational Measurement standards, Curriculum standards, Curriculum trends
Abstract: Purpose: Objective structured clinical examinations (OSCE) are a valuable assessment within healthcare education, as they provide the opportunity for students to demonstrate clinical competency, but can be resource intensive to provide faculty graders. The purpose of this study was to determine how overall OSCE scores compared between faculty, peer, and self-evaluations within a Doctor of Pharmacy (PharmD) curriculum., Methods: This study was conducted during the required nonprescription therapeutics course. Seventy-seven first-year PharmD students were included in the study, with 6 faculty members grading 10-15 students each. Students were evaluated by 3 graders: self, peer, and faculty. All evaluators utilized the same rubric. The primary endpoint of the study was to compare the overall scores between groups. Secondary endpoints included interrater reliability and quantification of feedback type based on the evaluator group., Results: The maximum possible score for the OSCE was 50 points; the mean scores for self, peer, and faculty evaluations were 43.3, 43.5, and 41.7 points, respectively. No statistically significant difference was found between the self and peer raters. However, statistical significance was found in the comparison of self versus faculty (p = 0.005) and in peer versus faculty (p < 0.001). When these scores were correlated to a letter grade (A, B, C or less), higher grades had greater similarity among raters compared to lower scores. Despite differences in scoring, the interrater reliability, or W score, on overall letter grade was 0.79, which is considered strong agreement., Conclusions: This study successfully demonstrated how peer and self-evaluation of an OSCE provides a comparable alternative to traditional faculty grading, especially in higher performing students. However, due to differences in overall grades, this strategy should be reserved for low-stakes assessments and basic skill evaluations., Competing Interests: Declaration of competing interest We declare no conflicts of interest or financial interests that the authors or members of their immediate families have in any product or service discussed in the manuscript, including grants (pending or received), employment, gifts, stock holdings or options, honoraria, consultancies, expert testimony, patents, and royalties., (Copyright © 2024. Published by Elsevier Inc.)
Published: 2024
Full Text: View/download PDF

39. When words are your scalpel, what and how information is exchanged may be differently salient to assessors.

Author: Li M, Kurahashi AM, Kawaguchi S, Siemens I, Sirianni G, and Myers J
Subjects: Humans, Grounded Theory, Clinical Competence standards, Educational Measurement methods
Abstract: Purpose: Variable assessments of learner performances can occur when different assessors determine different elements to be differently important or salient. How assessors determine the importance of performance elements has historically been thought to occur idiosyncratically and thus be amenable to assessor training interventions. More recently, a main source of variation found among assessors was two underlying factors that were differently emphasised: medical expertise and interpersonal skills. This gave legitimacy to the theory that different interpretations of the same performance may represent multiple truths. A faculty development activity introducing assessors to entrustable professional activities in which they estimated a learner's level of readiness for entrustment provided an opportunity to qualitatively explore assessor variation in the context of an interaction and in a setting in which interpersonal skills are highly valued., Methods: Using a constructivist grounded theory approach, we explored variation in assessment processes among a group of palliative medicine assessors who completed a simulated direct observation and assessment of the same learner interaction., Results: Despite identifying similar learner strengths and areas for improvement, the estimated level of readiness for entrustment varied substantially among assessors. Those who estimated the learner as not yet ready for entrustment seemed to prioritise what information was exchanged and viewed missed information as performance gaps. Those who estimated the learner as ready for entrustment seemed to prioritise how information was exchanged and viewed the same missed information as personal style differences or appropriate clinical judgement. When presented with a summary, assessors expressed surprise and concern about the variation., Conclusion: A main source of variation among our assessors was the differential salience of performance elements that align with medical expertise and interpersonal skills. These data support the theory that when assessing an interaction, differential salience for these two factors may be an important and perhaps inevitable source of assessor variation., (© 2024 The Author(s). Medical Education published by Association for the Study of Medical Education and John Wiley & Sons Ltd.)
Published: 2024
Full Text: View/download PDF

40. Finding significant indicators of PharmD academic performance to impact future students.

Author: Albuquerque EL, Acosta WR, and Lawson KA
Subjects: Humans, Surveys and Questionnaires, Texas, Male, Female, Adult, Education, Pharmacy methods, Education, Pharmacy standards, Education, Pharmacy statistics & numerical data, Education, Pharmacy trends, Educational Measurement methods, Educational Measurement statistics & numerical data, Curriculum trends, Curriculum standards, Education, Pharmacy, Graduate methods, Education, Pharmacy, Graduate standards, Education, Pharmacy, Graduate statistics & numerical data, Education, Pharmacy, Graduate trends, Students, Pharmacy statistics & numerical data, Students, Pharmacy psychology, Academic Performance statistics & numerical data, Academic Performance standards
Abstract: Objective: The purpose of this study is to identify which factors, both objective and subjective, from a student pharmacist's background are significantly related to academic performance in the professional PharmD program., Methods: Texas student pharmacists in their first three professional years during the 2022-2023 academic year were invited to participate in a 41-item survey to gather data on their undergraduate background, work experience, grit, and academic resilience. The survey responses were paired with the student pharmacist's cumulative grade point average (GPA) to assess the relationships between the variables and academic performance using Mann-Whitney U tests, Kruskal-Wallis tests, and Spearman's correlations., Results: Two hundred and fifty-one student pharmacists currently enrolled in a PharmD program in Texas responded to the survey invitation. Spearman's rho correlations showed weak positive and significant relationships between GPA and Grit scores as well as GPA and Resilience scores. Additionally, there is a moderate positive and significant relationship between student pharmacists' Grit and Resilience scores., Conclusion: The results suggest that assessing for Grit or Resilience as part of the admission process could aid in identifying future student pharmacists who would experience pharmacy school academic success. Integrating tools that develop Grit and Resilience in the PharmD curriculum could improve student pharmacists' academic performance., Competing Interests: Declaration of competing interest The authors have no financial disclosures or conflicts of interest to report for this study., (Copyright © 2024 Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

41. Blended learning compared to traditional learning for the acquisition of competencies in oral surgery by dental students: A randomized controlled trial.

Author: Blond N, Chaux AG, Hascoët E, Lesclous P, and Cloitre A
Subjects: Humans, Female, Male, Educational Measurement methods, Learning, Animals, Swine, Education, Dental methods, Clinical Competence, Students, Dental psychology, Surgery, Oral education
Abstract: Objective: To determine whether blended learning results in better educational outcomes compared to traditional learning in the acquisition of oral surgery technical skills for 4th-year undergraduate dental students., Materials and Methods: Seventy-three students participated in this two-arm parallel randomized controlled trial. Only students in the blended learning group had access to the online preparation platform for oral surgery practical work (PW) on a pig's jaw and to the debriefing. Kirkpatrick's four-level model was used to assess the educational outcomes directly after (levels 1 and 2) and 6 months later, after the start of the students' clinical activity (levels 3 and 4)., Results: For level 1, higher global satisfaction scores were found for students in the blended learning compared to the traditional learning group (p = .002). For level 2, blended learning resulted in an increase in knowledge score (p < .01), comparable to that observed in the traditional learning group. For level 3, students in the blended group made more progress in 6 months than those in the traditional group in terms of feeling able to assess and perform anaesthesia (p = .040) and surgical tooth extraction (p = .043). No difference in level 4 was found for the 6-month clinical surgical activity between groups, but students in the blended group felt more able to assess and perform the surgical management of a failed extraction requiring bone removal (p = .044)., Conclusion: Blended learning for oral surgery PW had a positive impact on three of the four Kirkpatrick levels (level 1, 3 and 4). Efforts should focus on the procedures that are perceived as the most difficult., (© 2024 The Author(s). European Journal of Dental Education published by John Wiley & Sons Ltd.)
Published: 2024
Full Text: View/download PDF

42. Impact of the modified curricula on periodontal instrumentation skills development during the COVID-19 pandemic from 2020 to 2023.

Author: Oh SL, Mishler O, Syme S, Jones D, and Saito H
Subjects: Humans, Male, Female, Pandemics, Simulation Training, Educational Measurement methods, COVID-19 prevention & control, COVID-19 epidemiology, Curriculum, Education, Dental methods, Clinical Competence, Periodontics education
Abstract: Objectives: This study aimed to evaluate the impact of curriculum modifications on periodontal instrumentation skills development among classes of 2021, 2022, and 2023 during the COVID-19 pandemic., Methods: The pandemic began and affected the three classes at different stages of their studies. Onsite simulation-based learning (SBL) was employed for the classes of 2021 and 2022; remote SBL was adopted for the class of 2023. Modified clinical education, due to social distancing guidelines, impacted the class of 2021 significantly and the class of 2022 to a lesser extent. A multiple linear regression model was built to examine the association between the fourth-year patient-based scaling competency examination scores and selected predictors., Results: The classes of 2021 and 2023 showed consistent performances over time, while the class of 2022 exhibited significant variation exhibiting the lowest performance at the second-year practical examination. While the clinical experience of the class of 2021 was significantly less than that of the classes of 2022 and 2023, the fourth-year competency examination scores did not differ across the three classes. The clinic points (p = 0.014) significantly affected the fourth-year competency examination score while student gender (p = 0.18), the first-year (p = 0.736), and second-year (p = 0.198) practical examination scores showed no correlations., Conclusion: Based on student performance in the fourth-year scaling competency examination, the curriculum modifications due to the COVID pandemic did not affect student learning outcomes. Clinical experience was the most influential determinant of skill development in periodontal instrumentation., (© 2024 American Dental Education Association.)
Published: 2024
Full Text: View/download PDF

43. Implementation of self-care scenario simulations in a skills-based first year doctor of pharmacy course for student application of the Pharmacists' Patient Care Process.

Author: Marshall LL, Hayslett RL, Brockington PS, and Momary K
Subjects: Humans, Surveys and Questionnaires, Educational Measurement methods, Educational Measurement statistics & numerical data, Patient Care methods, Patient Care standards, Patient Care psychology, Curriculum trends, Curriculum standards, Education, Pharmacy methods, Education, Pharmacy statistics & numerical data, Education, Pharmacy standards, Female, Male, Patient Simulation, Clinical Competence standards, Clinical Competence statistics & numerical data, Pharmacists psychology, Pharmacists statistics & numerical data, Adult, Students, Pharmacy statistics & numerical data, Students, Pharmacy psychology, Self Care methods, Self Care statistics & numerical data, Self Care psychology
Abstract: Background and Purpose: The objective of this project was to assess the impact of self-care scenario simulations on first year doctor of pharmacy student performance and self-perceived confidence in applying the Pharmacists' Patient Care Process (PPCP) during self-care encounters., Educational Activity and Settings: Self-care scenarios were developed and used during low fidelity simulations in laboratory sessions in a skills-based course. Students met individually with faculty facilitators role-playing patients to apply the PPCP in four simulations. Facilitators graded student performance; a comparison was made between performance on the first and fourth simulation. Students completed a pre- and post-course survey regarding their self-perceived confidence in performance and knowledge in applying the PPCP in self-care encounters., Findings: One hundred and eight (100%) of enrolled students voluntarily agreed to participate in this IRB-approved study. The median percentage of student scores on the fourth simulation, 90.7%, was higher compared to the median percentage of student scores on the first simulation, 82.4%, P < 0.001 with a raw difference of 8.3 percentage points, for participants with scores for both simulations, 106 (98%). For the self-perceived PPCP confidence survey, 100 (92.5%) participants completed both pre- and post-course surveys. Self-perceived confidence on 12 of the 15 survey items where students ranked their confidence in performance and knowledge in self-care encounters increased post- versus pre-course., Summary: Simulations served as a useful tool in improving student performance in applying the PPCP in self-care encounters in a first year doctor of pharmacy course. Student self-perceived confidence in performance and knowledge in self-care encounters also increased., Competing Interests: Declaration of competing interest The authors have no conflicts of interests or disclosures to declare., (Copyright © 2024 Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

44. Nurse Practitioner Certification Examination Development: From Reflecting Clinical Practice to Ensuring Lifelong Learning.

Author: Myers T, Chappell K, Godwin C, Krissel J, Ramirez J, and Smith J
Subjects: Humans, Clinical Competence standards, Educational Measurement standards, Educational Measurement methods, Accreditation standards, Education, Nursing, Continuing, Nurse Practitioners education, Nurse Practitioners standards, Certification standards, Psychiatric Nursing education, Psychiatric Nursing standards
Abstract: Objective: Certifications in psychiatric-mental health nursing promote safe practice by psychiatric-mental health nurse practitioners (PMHNPs) and nurses (PMHNs) and help protect the public from harm. This protection begins with the development of an examination that meets rigorous national education, practice, and accreditation standards and reflects PMHNPs' or PMHNs' clinical practice. Achievement and maintenance of a certification is a journey that involves a commitment to lifelong learning and the improvement of the field of psychiatric-mental health nursing through involvement in the examination process., Methods: This discussion paper outlines the role nurses can play in the development of certification examinations. It describes the process of developing an effective certification examination, including the role of standards, accrediting bodies, and content experts; determining necessary tasks, knowledge, and skills; surveying practitioners to validate information; writing test questions; and ongoing analysis of examination content. The Psychiatric-Mental Health Nurse Practitioner (across the lifespan) Certification (PMHNP-BC) is presented as an example of the process., Results: This discussion paper raises awareness of how certification exams are developed, PMHNPs participate in certification development, and volunteering promotes career development., Conclusion: The PMHNP-BC examination is based on education, practice, and certification accreditation standards and reflects current clinical practice. PMHNPs can (a) point to the rigor of certification as an indication of the quality of care they deliver, (b) volunteer to participate in the examination process to ensure examination rigor, and (c) advance their careers through the development and application of a valuable skill set., Competing Interests: Declaration of Conflicting InterestsThe authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: TM, KC, and JK are employed by the American Nurses Credentialing Center, which administers certification examinations. CG, JR, and JS are members of the Commission on Board Certification for the American Nurses Credentialing Center.
Published: 2024
Full Text: View/download PDF

45. Flipped classroom orthodontic education for undergraduate dental students: A factor analysis study.

Author: Saheb SAK, Hakami Z, Bokhari AM, and Bawazeer O
Subjects: Humans, Curriculum, Problem-Based Learning, Clinical Competence, Female, Male, Educational Measurement methods, Orthodontics education, Education, Dental methods, Students, Dental psychology
Abstract: Introduction: Dental schools have a primary responsibility to devise a curriculum that enhances students' confidence and knowledge in orthodontic case analysis. This study aims to compare the confidence levels and performance of undergraduate students in orthodontic case screening, moderated by faculty in a lecture-based format against their self-analysis of the same cases 1 year later, using a case-based and flipped learning approach., Materials and Methods: This study involved 100 fifth-year students. The same group received predoctoral orthodontics training through an instructor-centered, didactic approach in their fifth year and a case-based, student-centered, flipped classroom approach in their sixth year. At the end of each semester, the students completed an orthodontic case analysis and a self-reflection survey., Results: This study found no significant differences in diagnostic capabilities for orthodontic findings between the two methods studied. However, the self-evaluation survey data revealed an increase in students' confidence levels. This was specifically in terms of carrying out independent orthodontic case diagnosis, effectively communicating with orthodontic specialists, and their comfort in approaching orthodontic cases following the flipped classroom approach. Despite increased confidence in case diagnosis, the results showed that final-year students are uncertain about creating initial treatment plans and referring cases at an early stage., Conclusion: Despite no observed improvement in students' orthodontic diagnostic abilities after another semester of student-centered learning, their confidence in diagnosing orthodontic cases was notably enhanced., (© 2024 American Dental Education Association.)
Published: 2024
Full Text: View/download PDF

46. Clinical skills examination as part of the Swedish proficiency test of dentists educated outside of the EU/EEA.

Author: Dalum J, Christidis N, Häbel H, Karlgren K, Leanderson C, and Englund GS
Subjects: Humans, Sweden, Female, Male, Adult, Cohort Studies, European Union, Middle Aged, Clinical Competence, Educational Measurement methods, Education, Dental standards, Dentists
Abstract: Introduction: The increase in the migration of dentists educated outside the EU/EEA calls for the sharing of information and evaluation of recognition processes within countries in the EU. In 2017, the Swedish National Board of Health and Welfare implemented the Proficiency test, a recognition process for dentists who have completed an education programme outside the EU/EEA. The Proficiency test consists of a theoretical and an integrated clinical skills examination, followed by a 6-month clinical practice. The clinical skills examination is a two-part examination that includes an OSCE and an operative test on a dental manikin. This paper presents data from proficiency tests between 2018 and 2022, and explores factors related to grade fail, that is, demographics, theoretical exam scores and language comprehension., Materials and Methods: In a cohort study, demographics and factors associated with grade fail were explored using test results from theoretical and clinical skills examinations (n = 181) from 2018 to 2022. Pearson correlation coefficient and linear regression analysis were used for studying correlations and associations between exam results. Univariable linear and logistic regression models were used for background variable associations with clinical skills exam outcomes., Results: Higher age was a significant risk factor for failing the clinical skills examination and the OSCE. Higher scores in the theoretical exam reduced the odds of failing the OSCE but were not associated with results in the operative test or the overall results of the clinical skills examination. Regarding the OSCE there was a statistically significant difference within all professional qualifications explored between participants who passed and participants who failed the OSCE., Conclusions: Four years of data collection reveal that age and previous theoretical exam results influence the odds of failing the clinical examination. The study results also highlight the necessity of multiple assessment formats to assess clinical and communication skills of foreign-trained dentists., (© 2024 The Author(s). European Journal of Dental Education published by John Wiley & Sons Ltd.)
Published: 2024
Full Text: View/download PDF

47. The Accuracy of Artificial Intelligence ChatGPT in Oncology Examination Questions.

Author: Chow R, Hasan S, Zheng A, Gao C, Valdes G, Yu F, Chhabra A, Raman S, Choi JI, Lin H, and Simone CB 2nd
Subjects: Humans, Medical Oncology, Artificial Intelligence, Radiation Oncology education, Educational Measurement methods
Abstract: The aim of this study is to assess the accuracy of Chat Generative Pretrained Transformer (ChatGPT) in response to oncology examination questions in the setting of one-shot learning. Consecutive national radiation oncology in-service multiple-choice examinations were collected and inputted into ChatGPT 4o and ChatGPT 3.5 to determine ChatGPT's answers. ChatGPT's answers were then compared with the answer keys to determine whether ChatGPT correctly or incorrectly answered each question and to determine if improvements in responses were seen with the newer ChatGPT version. A total of 600 consecutive questions were inputted into ChatGPT. ChatGPT 4o answered 72.2% questions correctly, whereas 3.5 answered 53.8% questions correctly. There was a significant difference in performance by question category (P < .01). ChatGPT performed poorer with respect to knowledge of landmark studies and treatment recommendations and planning. ChatGPT is a promising technology, with the latest version showing marked improvement. Although it still has limitations, with further evolution, it may be considered a reliable resource for medical training and decision making in the oncology space., (Copyright © 2024 American College of Radiology. Published by Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

48. Inconsistencies in rater-based assessments mainly affect borderline candidates: but using simple heuristics might improve pass-fail decisions.

Author: Schauber SK, Olsen AO, Werner EL, and Magelssen M
Subjects: Humans, Judgment, Clinical Competence standards, Students, Medical, Education, Medical standards, Observer Variation, Heuristics, Educational Measurement standards, Educational Measurement methods, Decision Making
Abstract: Introduction: Research in various areas indicates that expert judgment can be highly inconsistent. However, expert judgment is indispensable in many contexts. In medical education, experts often function as examiners in rater-based assessments. Here, disagreement between examiners can have far-reaching consequences. The literature suggests that inconsistencies in ratings depend on the level of performance a to-be-evaluated candidate shows. This possibility has not been addressed deliberately and with appropriate statistical methods. By adopting the theoretical lens of ecological rationality, we evaluate if easily implementable strategies can enhance decision making in real-world assessment contexts., Methods: We address two objectives. First, we investigate the dependence of rater-consistency on performance levels. We recorded videos of mock-exams and had examiners (N=10) evaluate four students' performances and compare inconsistencies in performance ratings between examiner-pairs using a bootstrapping procedure. Our second objective is to provide an approach that aids decision making by implementing simple heuristics., Results: We found that discrepancies were largely a function of the level of performance the candidates showed. Lower performances were rated more inconsistently than excellent performances. Furthermore, our analyses indicated that the use of simple heuristics might improve decisions in examiner pairs., Discussion: Inconsistencies in performance judgments continue to be a matter of concern, and we provide empirical evidence for them to be related to candidate performance. We discuss implications for research and the advantages of adopting the perspective of ecological rationality. We point to directions both for further research and for development of assessment practices., (© 2024. The Author(s).)
Published: 2024
Full Text: View/download PDF

49. Development and implementation of an online formulary exercise for fifth-year pharmacy students during an experiential hospital practice training.

Author: Iwasawa M, Kasugai K, Sugawara M, and Otori K
Subjects: Humans, Surveys and Questionnaires, Education, Pharmacy methods, Education, Pharmacy standards, Education, Pharmacy statistics & numerical data, Problem-Based Learning methods, Pharmacy Service, Hospital methods, Formularies as Topic, Curriculum trends, Curriculum standards, Educational Measurement methods, Educational Measurement statistics & numerical data, Students, Pharmacy statistics & numerical data, Students, Pharmacy psychology
Abstract: Background and Purpose: Formulary systems play a crucial role in healthcare organizations by promoting collaboration and ensuring the rational and cost-effective utilization of medications. With a rise in pharmacist involvement in hospital formulary management, this study aims to describe the components of an online formulary exercise, assess fifth-year students' perceptions of this exercise, and evaluate its effectiveness in understanding formulary management and the pharmacist's role., Educational Activity and Setting: The online formulary exercise was initiated during hospital practice training at Kitasato University Hospital since October 2021. Students underwent reading assignments and a pre-test before participating in the program. The one-day program included a pre-practice test, 1.5 h of pre-recorded video lectures, 2.5 h of two small group discussions, a 1-h individual assignment creating a proton pump inhibitor comparison chart, 30 min of group presentations, and feedback from clinical faculty. Post-program assessments comprised a test, evaluations, and surveys on difficulty, necessity, and impressions. Analysis involved descriptive methods and thematic analysis for free-form responses, and a Friedman test for test scores., Findings: The surveys conducted between July 2022 and February 2023 were compiled and analyzed. This study assessed the impact of an online formulary exercise program on 100 participants, revealing an improvement in formulary understanding (97%) and a high recommendation rate (92%). Test performance demonstrated an improvement (p < 0.05, r = 0.85), with students recognizing the importance of contributing to the reduction of healthcare costs. The program positively influenced students' formulary knowledge and readiness for pharmacist roles., Summary: This online formulary exercise provided a valuable opportunity for students to learn about formulary management. The use of survey results and test scores demonstrated the positive impact of both pre-assignments and exercise on students' comprehension of formulary, enhancing not only their understanding but also fostering a sense of responsibility as future pharmacists., Competing Interests: Declaration of competing interest The authors declare no conflict of interest. There are no financial conflicts of interest to disclose. This study did not receive any funding., (Copyright © 2023. Published by Elsevier Inc.)
Published: 2024
Full Text: View/download PDF

50. Improving preparation for pharmacy entry-to-practice OSCE using a participatory action research.

Author: Huneault C, Haeberli P, Mühle A, Laurent P, and Berger J
Subjects: Humans, Surveys and Questionnaires, Switzerland, Health Services Research, Clinical Competence standards, Clinical Competence statistics & numerical data, Curriculum trends, Curriculum standards, Students, Pharmacy statistics & numerical data, Students, Pharmacy psychology, Educational Measurement methods, Educational Measurement statistics & numerical data, Education, Pharmacy methods, Education, Pharmacy standards, Education, Pharmacy statistics & numerical data, Education, Pharmacy trends
Abstract: Introduction: In Switzerland, becoming a licensed pharmacist requires succeeding a federal entry-to-practice exam that includes an Objective Structured Clinical Examination (OSCE). Candidates from the University of Geneva (UNIGE) exhibited a higher failure rate in this part of the examination in comparison to candidates from other Swiss institutions. The institution made a specific set of pedagogical changes to a 3-week pharmacy services course that is run during their Master's second year to prepare them for their entry-to-practice OSCE. One key change was a switch from a summative in-classroom OSCE to an on-line formative OSCE., Methods: New teaching activities were introduced between 2019 2020 and 2021-2022 academic years to help students strengthen their patient-facing skills and prepare for the federal OSCE. These online activities consisted in formative OSCEs supplemented with group and individual debriefings and in 18 h clinical case simulations reproducing OSCE requirements and assessed with standardized evaluation grids. Failure rates before and after the introduction of these activities were compared, and their perceived usefulness by UNIGE candidates was collected through a questionnaire survey., Results: The UNIGE failure rate decreased from 6.8% in 2018/2019 to 3.3% in 2022 following the implementation of the new teaching activities. The difference in failure rates between UNIGE and the other institutions became less pronounced in 2022 compared to 2018/2019. The redesigned Master's course was highlighted as useful for preparation, with all new activities perceived as beneficial. Questionnaire responses brought attention to challenges faced by UNIGE candidates, including stress management, insufficient information or practical training, and experiences related to quarantine. These insights informed further development of teaching methods., Discussion: Although the results do not establish a direct link between participation in new teaching activities and increased performance, they suggest resolving the initial issue. Our findings relate to pedagogical concepts such as constructive alignment, formative assessment and examination anxiety, and generally support the benefits of online format., Conclusion: This study used a participatory action research based on mixed methods to address a challenge in pharmacy education. Online teaching activities including formative OSCEs, case simulations and debriefings were implemented. Improved performance in entry-to-practice OSCE was subsequently observed. The results highlight the potential of formative, active, and constructively aligned online activities, such as role-playing and case simulation, to enhance patient-facing skills and improve outcomes in summative assessments of these skills., Competing Interests: Declaration of competing interest No author has any conflict of interest to declare. No funding source had any involvement in conducting the research or in preparing the article., (Copyright © 2024 The Authors. Published by Elsevier Inc. All rights reserved.)
Published: 2024
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Region

Database

Publisher

11,368 results on '"Educational Measurement methods"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources