1,701 results on '"IRT"'
Search Results
2. Linking Unlinkable Tests: A Step Forward.
- Author
- Testa, Silvia and Miceli, Renato
- Abstract
Random Equating (RE) and Heuristic Approach (HA) are two linking procedures that may be used to compare the scores of individuals in two tests that measure the same latent trait, in conditions where there are no common items or individuals. In this study, RE—that may only be used when the individuals taking the two tests come from the same population—was used as a benchmark for evaluating HA, which, in contrast, does not require any distributional assumptions. The comparison was based on both simulated and empirical data. Simulations showed that HA was good at reproducing the link shift connecting the difficulty parameters of the two sets of items, performing similarly to RE under the condition of slight violation of the distributional assumption. Empirical results showed satisfactory correspondence between the estimates of item and person parameters obtained via the two procedures. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
3. How does item wording affect participants' responses in Likert scale? Evidence from IRT analysis.
- Author
- Biao Zeng, Minjeong Jeon, and Hongbo Wen
- Subjects
- DEGREES of freedom, FACTOR analysis, MODELS & modelmaking, RESEARCH personnel, COLLEGE students
- Abstract
Researchers often combine both positively and negatively worded items when constructing Likert scales. This combination, however, may introduce method effects due to the variances in item wording. Although previous studies have tried to quantify these effects by using factor analysis on scales with different content, the impact of varied item wording on participants' choices among specific options remains unexplored. To address this gap, we utilized four versions of the Undergraduate Learning Burnout (ULB) scale, each characterized by a unique valence of item wording. After collecting responses from 1,131 college students, we employed unidimensional, multidimensional, and bi-factor Graded Response Models for analysis. The results suggested that the ULB scale supports a unidimensional structure for the learning burnout trait. However, the inclusion of different valences of wording within items introduced additional method factors, explaining a considerable degree of variance. Notably, positively worded items demonstrated greater discriminative power and more effectively counteracted the biased outcomes associated with negatively worded items, especially between the "Strongly Disagree" and "Disagree" options. While there were no substantial differences in the overall learning burnout traits among respondents of different scale versions, slight variations were noted in their distributions. The integration of both positive and negative wordings reduced the reliability of the learning burnout trait measurement. Consequently, it is recommended to use exclusively positively worded items and avoid a mix in item wording during scale construction. If a combination is essential, the bi-factor IRT model might help segregate the method effects resulting from the wording valence. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
4. Patient feasibility as a novel approach for integrating IRT and LCA statistical models into patient-centric qualitative data--a pilot study.
- Author
- Klüglich, Matthias, Santy, Bert, Tanev, Mihail, Hristov, Kristian, and Mincheva, Tsveta
- Subjects
- STATISTICAL models, DATA analysis, QUALITATIVE research, CAUSAL models, GLIOMAS, PILOT projects, CONTENT analysis, LOGISTIC regression analysis, EMPIRICAL research, STRUCTURAL equation modeling, CANCER patients, DESCRIPTIVE statistics, PATIENT-centered care, SURVEYS, THEMATIC analysis, PSYCHOMETRICS, RESEARCH, CONCEPTUAL structures, HEALTH outcome assessment, DATA analysis software
- Abstract
Introduction: Clinical research increasingly recognizes the role and value of patient-centric data incorporation in trial design, aiming for more relevant, feasible, and engaging studies for participating patients. Despite recognition, research on analytical models regarding qualitative patient data analysis has been insufficient. Aim: This pilot study aims to explore and demonstrate the analytical framework of the "patient feasibility" concept--a novel approach for integrating patient-centric data into clinical trial design using psychometric latent class analysis (LCA) and interval response theory (IRT) models. Methods: A qualitative survey was designed to capture the diverse experiences and attitudes of patients in an oncological indication. Results were subjected to content analysis and categorization as a preparatory phase of the study. The analytical phase further employed LCA and hybrid IRT models to discern distinct patient subgroups and characteristics related to patient feasibility. Results: LCA identified three latent classes each with distinct characteristics pertaining to a latent trait defined as patient feasibility. Covariate analyses further highlighted subgroup behaviors. In addition, IRT analyses using the two-parameter logistic model, generalized partial credit model, and nominal response model highlighted further distinct characteristics of the studied group. The results provided insights into perceived treatment challenges, logistic challenges, and limiting factors regarding the standard of care therapy and clinical trial attitudes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. The choice between cognitive diagnosis and item response theory: A case study from medical education.
- Author
- Lim, Youn Seon and Bangeranye, Catherine
- Subjects
- *ITEM response theory, *STUDENT attitudes, *EDUCATIONAL tests & measurements, *ASSESSMENT of education, *PSYCHOMETRICS
- Abstract
Feedback is a powerful instructional tool for motivating learning. But effective feedback requires that instructors have accurate information about their students' current knowledge status and their learning progress. In modern educational measurement, two major theoretical perspectives on student ability and proficiency can be distinguished. Latent trait models identify ability as a continuous uni- or multi-dimensional construct, with unidimensional item response theoretic (IRT) models presumably the most popular type of latent trait models. They report a single ability score that allows for locating examinees relative to their peers on the latent ability dimension targeted by the test. Latent trait models have been criticized for lacking diagnostic information on students' specific skills, their strengths and weaknesses in a knowledge domain. Cognitive diagnosis (CD) models, in contrast, describe ability as a combination of discrete skills (called "attributes") that constitute (partially) ordered latent classes of proficiency. The focus of CD is on collecting information about the learning progress for immediate feedback to students in terms of skills they have mastered and those needing study. CD has been underused in education; performance assessment still mostly relies on latent-trait-based methods. The motivation for the study reported here arose from the desire to conduct a side-by-side evaluation of the two seemingly disparate psychometric frameworks, CD and IRT. Data from a biochemistry end-of-term exam were used for illustration. They were fitted with multiple CD and IRT models, among them also HO-GDINA models that permit a close approximation to several unidimensional IRT models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. Multi-Technique Approach for the Sustainable Characterisation and the Digital Documentation of Painted Surfaces in the Hypogeum Environment of the Priscilla Catacombs in Rome.
- Author
- Calicchia, Paola, Ceccarelli, Sofia, Colao, Francesco, D'Erme, Chiara, Di Tullio, Valeria, Guarneri, Massimiliano, Luvidi, Loredana, Proietti, Noemi, Spizzichino, Valeria, Zampelli, Margherita, and Zito, Rocco
- Abstract
The purpose of this paper is to identify an efficient, sustainable, and "green" approach to address the challenges of the preservation of hypogeum heritage, focusing on the problem of moisture, a recurring cause of degradation in porous materials, especially in catacombs. Conventional and novel technologies have been used to address this issue with a completely non-destructive approach. The article provides a multidisciplinary investigation making use of advanced technologies and analysis to quantify the extent and distribution of water infiltration in masonry before damage starts to be visible or irreversibly causes damage. Four different technologies, namely Portable Nuclear Magnetic Resonance (NMR), Audio Frequency–Acoustic Imaging (AF–AI), Laser-Induced Fluorescence (LIF), Infrared Thermography (IRT), and 3D Laser Scanning (RGB-ITR), were applied in the Priscilla catacombs in Rome (Italy). These imaging techniques allow the characterisation of the deterioration of painted surfaces within the delicate environment of the Greek chapel in the Priscilla catacombs. The resulting high-detailed 3D coloured model allowed for easily referencing the data collected by the other techniques aimed also at the study of the potential presence of salt efflorescence and/or microorganisms. The results supply an efficient and sustainable tool aimed at cultural heritage conservation but also at the creation of digital documentation obtained with green methodologies for a wider sharing, ensuring its preservation for future generations. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. The structure of knowledge about the concept of derivative – a study investigating a process-object framework.
- Author
- Litteck, Kristin, Rolfes, Tobias, and Heinze, Aiso
- Subjects
- *TEST validity, *PRIOR learning, *GRADING of students, *ACQUISITION of data, *STUDENTS
- Abstract
Different process-object frameworks have been used to describe the structure and sequence of acquisition of knowledge about mathematical concepts in the sense that operational (i.e. process-related) knowledge should be acquired before structural (i.e. object-related) knowledge. This approach has previously been used to investigate the structure of knowledge about the concept of derivative from a qualitative perspective. In this article, we apply a process-object framework to investigate the acquisition of knowledge about the concept of derivative quantitatively. Data was collected from N = 176 grade 10/11 students in Germany using a test instrument that has been specifically developed to measure operational and structural knowledge about the concept of derivative. In a second study, interviews were conducted with a selected subsample of the same students to ensure the validity of the test and the compatibility of our items to the theoretical model. Our results show that knowledge about the concept of derivative can be seen as psychometrically two-dimensional reflecting operational and structural aspects of knowledge. Additionally, our results indicate that a majority of students might only have acquired pseudostructural knowledge about the concept of derivative. Lastly, implications for research specifically for the role of prior knowledge are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. Item response theory for before-after designs in interprofessional education research.
- Author
- Kerry, Matthew J., Reinders, Jan J., Krijnen, Wim P., and Huber, Marion
- Subjects
- *ITEM response theory, *CLASSICAL test theory, *PROFESSIONAL identity, *EDUCATION research, *DESIGN education, *INTERPROFESSIONAL education
- Abstract
Although Item Response Theory (IRT) has been recommended for helping advance interprofessional education (IPE) research, its use remains limited. This may be partly explained by potential misconceptions regarding IRT's "limitation" to cross-sectional data. The aim of this study is to demonstrate how Item Response Theory (IRT) can be applied effectively in before-and-after designs in IPE research. Specifically, a two-week before-after design with survey methodology using the Extended Professional Identity Scale (EPIS), an interprofessional identity measure, was conducted among n = 146 mixed health-science students. Results indicated that EPIS increased significantly before-after intervention by .74 standardised mean differences, t(146) = 7.73, p < .05. The before-after IRT model also gave a test–retest reliability estimate of .60 which was considered acceptable. Comparison of the IRT model with a conventional paired t-test indicated similar effect size estimates of Cohen's d = .56 and .54, respectively. We demonstrate IRT's flexibility to before-after studies in IPE. Application of this model can yield accurate changes in target IPE constructs, and it is advantageous to classical test theory vis-à-vis baseline differences. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. An Item Response Theory Analysis of the Clinician-Administered PTSD Scale for DSM-5 Among Veterans.
- Author
- Lee, Daniel J., Crowe, Michael L., Weathers, Frank W., Bovin, Michelle J., Ellickson, Stephanie, Sloan, Denise M., Schnurr, Paula, Keane, Terence M., and Marx, Brian P.
- Subjects
- *DIAGNOSIS of post-traumatic stress disorder, *POST-traumatic stress disorder, *SELF-injurious behavior, *DISABILITIES, *RESEARCH funding, *RESEARCH methodology evaluation, *CLASSIFICATION of mental disorders, *ATTITUDES toward disabilities, *VETERANS, *RESEARCH methodology, *PSYCHOMETRICS, *AMNESIA, *PATIENTS' attitudes, *SENSITIVITY & specificity (Statistics)
- Abstract
We used item response theory (IRT) analysis to examine Clinician-Administered PTSD Scale for DSM-5 (CAPS-5) item performance using data from three large samples of veterans (total N = 808) using both binary and ordinal rating methods. Relative to binary ratings, ordinal ratings provided good coverage from well below to well above average within each symptom cluster. However, coverage varied by cluster, and item difficulties were unevenly distributed within each cluster, with numerous instances of redundancy. For both binary and ordinal scores, flashbacks, dissociative amnesia, and self-destructive behavior items showed a pattern of high difficulty but relatively poor discrimination. Results indicate that CAPS-5 ordinal ratings provide good severity coverage and that most items accurately differentiated between participants by severity. Observed uneven distribution and redundancy in item difficulty suggest there is opportunity to create an abbreviated version of the CAPS-5 for determining PTSD symptom severity, but not DSM-5 PTSD diagnosis, without sacrificing precision. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. FACTORS INFLUENCING THE USE OF QRIS IN DIGITAL TRANSACTIONS.
- Author
- Muhammad, Fadhil, Suroso, Arif Imam, and Djohar, Setiadi
- Subjects
- INNOVATION adoption, TECHNOLOGY assessment, TWO-dimensional bar codes, PAYMENT, DIGITAL technology
- Abstract
Background: QRIS as a payment method through QR codes has emerged as one of the products initiated by the implementation of the Indonesian Payment System Blueprint (SPI). Purpose: The purpose of QRIS is to expand the efficient acceptance of non-cash payments, focusing on traditional mass economic functions (such as traditional markets and MSMEs). Design/methodology/approach: The analysis in this research utilized Partial Least Square Equation Modeling (PLS-SEM). Findings/Result: The results indicate that habit and hedonic motivation variables have a significant positive influence on behavioral intention, while usage barriers have a significant negative impact on behavioral intention. Furthermore, habit and facilitating conditions variables have a significant positive impact on use behavior. Conclusion: The findings highlight that the decision to use QRIS is largely habitual, influenced by prior experiences and quick decision-making (Hedonic Motivation) at the point of transaction. The decision-making process is no longer lengthy, as QR-based payment systems were already familiar before QRIS implementation in Indonesia. Originality/value (State of the art): The approach of simultaneously measuring acceptance factors and barriers is still rarely employed in technology adoption studies. Evaluating both factors concurrently can provide a more comprehensive assessment of the technology adoption process. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
11. Can GAD-7 be used reliably to capture anxiety?: approaching evaluation of item quality using IRT.
- Author
- Omarsdottir, Hilma Ros, Vésteinsdóttir, Vaka, Asgeirsdottir, Ragnhildur Lilja, Kristjansdottir, Hafrun, and Thorsdottir, Fanney
- Abstract
Screening tools play an important role in treatment of anxiety and are used both to identify and monitor symptoms of the disorder. In research they are often used to measure efficacy of treatment. Reliability of these screening tools is therefore highly important. The most prominent screening tool for anxiety today is the GAD-7. The aim of the current study was to evaluate the GAD-7, using methods of IRT, on a clinical sample of mental health patients. The sample consisted of 226 individuals that had sought help for anxiety and/or depression symptoms. Results indicated four main issues with the scale for the clinical population: a) reliability was contingent on anxiety level, b) items 5, 6 and 7 contribute minimal information to the total measure, c) the summation score is not intuitive and d) the response categories are flawed. It is concluded that, for further use, the scale needs some revisions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
12. Validation and psychometric evaluation of the Short Warwick-Edinburgh Mental Well-Being Scale (SWEMWBS) among Czech adolescents using Item Response Theory
- Author
- Radka Hanzlová and Aleš Kudrnáč
- Subjects
- SWEMWBS, IRT, Validation, Psychometric analysis, Mental well-being, Czechia, Computer applications to medicine. Medical informatics, R858-859.7
- Abstract
Background: The topic of adolescent mental health is currently a subject of much debate due to the increasing prevalence of mental health problems among this age group. Therefore, it is crucial to have high-quality and validated mental well-being measurement tools. While such tools do exist, they are often not tailored specifically to adolescents and are not available in Czech language. The aim of this study is to validate and test the Czech version of the Short Warwick-Edinburgh Mental Well-Being Scale (SWEMWBS) on a large sample of Czech adolescents aged 15 to 18 years. Methods: The analysis is based on data from the first wave of the Czech Education Panel Survey (CZEPS) and was mainly conducted using Item Response Theory (IRT), which is the most appropriate method for this type of analysis. Specifically, the Graded Response Model (GRM) was applied to the data. This comprehensive validation study also included reliability and three types of validity (construct, convergent and criterion) testing. Results: The study found that the Czech version of the SWEMWBS for adolescents aged 15 to 18 years (N = 22,498) has good quality and psychometric properties. The data was analysed using the GRM model as it met the assumptions for the use of IRT. The estimated parameter values by GRM demonstrated good discriminant and informative power for all items, except for item 7, which showed poorer results compared to the others. However, excluding it from the scale would not enhance the overall quality of the scale. The five-category response scale functions effectively. Additionally, the results demonstrated high reliability, and all types of validity tested were also confirmed. Conclusions: The Czech version of the SWEMWBS for adolescents has been validated as a psychometrically sound, reliable and valid instrument for measuring mental well-being. It can therefore be used with confidence in future studies.
- Published
- 2024
- Full Text
- View/download PDF
13. Validation and psychometric evaluation of the Short Warwick-Edinburgh Mental Well-Being Scale (SWEMWBS) among Czech adolescents using Item Response Theory.
- Author
- Hanzlová, Radka and Kudrnáč, Aleš
- Subjects
- *ITEM response theory, *MENTAL illness, *CZECHS, *CZECH language, *PSYCHOMETRICS
- Abstract
Background: The topic of adolescent mental health is currently a subject of much debate due to the increasing prevalence of mental health problems among this age group. Therefore, it is crucial to have high-quality and validated mental well-being measurement tools. While such tools do exist, they are often not tailored specifically to adolescents and are not available in Czech language. The aim of this study is to validate and test the Czech version of the Short Warwick-Edinburgh Mental Well-Being Scale (SWEMWBS) on a large sample of Czech adolescents aged 15 to 18 years. Methods: The analysis is based on data from the first wave of the Czech Education Panel Survey (CZEPS) and was mainly conducted using Item Response Theory (IRT), which is the most appropriate method for this type of analysis. Specifically, the Graded Response Model (GRM) was applied to the data. This comprehensive validation study also included reliability and three types of validity (construct, convergent and criterion) testing. Results: The study found that the Czech version of the SWEMWBS for adolescents aged 15 to 18 years (N = 22,498) has good quality and psychometric properties. The data was analysed using the GRM model as it met the assumptions for the use of IRT. The estimated parameter values by GRM demonstrated good discriminant and informative power for all items, except for item 7, which showed poorer results compared to the others. However, excluding it from the scale would not enhance the overall quality of the scale. The five-category response scale functions effectively. Additionally, the results demonstrated high reliability, and all types of validity tested were also confirmed. Conclusions: The Czech version of the SWEMWBS for adolescents has been validated as a psychometrically sound, reliable and valid instrument for measuring mental well-being. It can therefore be used with confidence in future studies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
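Note on the Graded Response Model named in the two SWEMWBS records above: the abstract does not give the model equations, so the following is only a minimal sketch of the standard GRM category probabilities for one item. The function name, the discrimination `a`, and the threshold values are illustrative assumptions, not estimates from the study.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under the Graded Response Model.

    theta      : latent trait value of the respondent
    a          : item discrimination (assumed value, for illustration)
    thresholds : ordered category boundaries b_1 < ... < b_{K-1}
    """
    # Cumulative probabilities P(X >= k), padded with 1 (lowest) and 0 (beyond highest).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is the difference of adjacent cumulative probabilities.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]

# Five response categories need four thresholds (hypothetical values).
probs = grm_category_probs(theta=0.0, a=1.0, thresholds=[-2.0, -1.0, 1.0, 2.0])
```

With symmetric thresholds and theta at zero, the middle category is the most likely and the outer categories mirror each other.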
14. Extending the PROMIS item bank "ability to participate in social roles and activities": a psychometric evaluation using IRT.
- Author
- Williams, Guido L., Flens, Gerard, Terwee, Caroline B., de Beurs, Edwin, Spinhoven, Philip, and Paap, Muirne C. S.
- Subjects
- *DUTCH people, *PSYCHOMETRICS, *SOCIAL skills, *SOCIAL role, *SAMPLING methods
- Abstract
Objective: Our objective was to explore whether the extension of the PROMIS item bank Ability to Participate in Social Roles and Activities (APSRA) with new items would result in more effective targeting (i.e., selecting items that are appropriate for each individual's trait level), and more reliable measurements across all latent trait levels. Methods: A sample of 1,022 Dutch adults completed all 35 items of the original item bank plus 17 new items (in Dutch). The new items presented in this publication have been translated provisionally from Dutch into English for presentation purposes. We evaluated the basic IRT assumptions unidimensionality, local independence, and monotonicity. Furthermore, we examined the item parameters, and assessed differential item functioning (DIF) for sex, education, region, age, and ethnicity. In addition, we compared the test information functions, item parameters, and θ scores, for the original and extended item bank in order to assess whether the measurement range had improved. Results: We found that the extended item bank was compatible with the basic IRT assumptions and showed good reliability. Moreover, the extended item bank improved the measurement in the lower trait range, which is important for reliably assessing functioning in clinical populations (i.e., persons reporting lower levels of participation). Conclusion: We extended the PROMIS-APSRA item bank and improved its psychometric quality. Our study contributes to PROMIS measurement innovation, which allows for the addition of new items to existing item banks, without changing the interpretation of the scores and while maintaining the comparability of the scores with other PROMIS instruments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
15. Item Parameter Recovery: Sensitivity to Prior Distribution.
- Author
- DeMars, Christine E. and Satkus, Paulius
- Subjects
- *STATISTICAL models, *DATA analysis, *DIFFERENTIAL item functioning (Research bias), *PROBABILITY theory, *RESEARCH methodology evaluation, *EDUCATIONAL tests & measurements, *CONVALESCENCE, *PSYCHOMETRICS, *STATISTICS, *SENSITIVITY & specificity (Statistics)
- Abstract
Marginal maximum likelihood, a common estimation method for item response theory models, is not inherently a Bayesian procedure. However, due to estimation difficulties, Bayesian priors are often applied to the likelihood when estimating 3PL models, especially with small samples. Little focus has been placed on choosing the priors for marginal maximum estimation. In this study, using sample sizes of 1,000 or smaller, not using priors often led to extreme, implausible parameter estimates. Applying prior distributions to the c-parameters alleviated the estimation problems with samples of 500 or more; for the samples of 100, priors on both the a-parameters and c-parameters were needed. Estimates were biased when the mode of the prior did not match the true parameter value, but the degree of the bias did not depend on the strength of the prior unless it was extremely informative. The root mean squared error (RMSE) of the a-parameters and b-parameters did not depend greatly on either the mode or the strength of the prior unless it was extremely informative. The RMSE of the c-parameters, like the bias, depended on the mode of the prior for c. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
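Note on the 3PL model and priors discussed in the record above: the abstract names a-, b-, and c-parameters and a prior on c but not the prior family, so this is a minimal sketch under stated assumptions. The 3PL response function is the standard one; the Beta prior on c, the function names, and all numeric values are assumptions made for illustration only.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def log_beta_prior(c, alpha, beta):
    """Unnormalised log-density of an assumed Beta(alpha, beta) prior on c."""
    return (alpha - 1.0) * math.log(c) + (beta - 1.0) * math.log(1.0 - c)

def beta_prior_mode(alpha, beta):
    """Mode of a Beta(alpha, beta) distribution (valid for alpha, beta > 1)."""
    return (alpha - 1.0) / (alpha + beta - 2.0)

# Hypothetical item: average examinee, guessing floor of 0.2.
p = p_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)

# A Beta(5, 17) prior has its mode at 0.2; larger alpha + beta with the
# same mode would give a stronger (more informative) prior.
mode = beta_prior_mode(5.0, 17.0)
```

In a penalised estimation, a term like `log_beta_prior` would be added to the marginal log-likelihood, pulling the estimate of c toward the prior mode.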
16. Assessing health‐related quality of life using the Wound‐QoL‐17 and the Wound‐QoL‐14—Results of the cross‐sectional European HAQOL study using item response theory.
- Author
- Janke, Toni Maria, Kozon, Vlastimil, Valiukeviciene, Skaidra, Rackauskaite, Laura, Reich, Adam, Stępień, Katarzyna, Chernyshov, Pavel, Jankechova, Monika, van Montfrans, Catherine, Amesz, Stella, Barysch, Marjam, Conde Montero, Elena, Augustin, Matthias, Blome, Christine, and Braren‐von Stülpnagel, Catharina C.
- Subjects
- DIFFERENTIAL item functioning (Research bias), WORRY, RESEARCH funding, QUESTIONNAIRES, AGE distribution, POPULATION geography, LEISURE, QUALITY of life, PSYCHOMETRICS, SLEEP, PAIN, SOCIAL skills, THEORY, CHRONIC wounds & injuries, ACTIVITIES of daily living
- Abstract
For assessing health‐related quality of life in patients with chronic wounds, the Wound‐QoL questionnaire has been developed. Two different versions exist: the Wound‐QoL‐17 and the Wound‐QoL‐14. For international and cross‐cultural comparisons, it is necessary to demonstrate psychometric properties in an international study. Therefore, the aim of this study was to test both questionnaires in a European sample, using item response theory (IRT). Participants were recruited in eight European countries. Item characteristic curves (ICC), item information curves (IIC) and differential item functioning (DIF) were calculated. In both questionnaires, ICCs for most items were well‐ordered and sufficiently distinct. For items, in which adjacent response categories were not sufficiently distinct, response options were merged. IICs showed that items on sleep and on pain, on worries as well as on day‐to‐day and leisure activities had considerably high informational value. In the Wound‐QoL‐14, the item on social activities showed DIFs regarding the country and age. The same applied for the Wound‐QoL‐17, in which also the item on stairs showed DIFs regarding age. Our study showed comparable results across both versions of the Wound‐QoL. We established a new scoring method, which could be applied in international research projects. For clinical practice, the original scoring can be maintained. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. IRTrees for skipping items in PIRLS.
- Author
- Christiansen, Andrés and Janssen, Rianne
- Subjects
- ACHIEVEMENT tests, ITEM response theory
- Abstract
In international large-scale assessments, students may not be compelled to answer every test item: a student can decide to skip a seemingly difficult item or may drop out before the end of the test is reached. The way these missing responses are treated will affect the estimation of the item difficulty and student ability, and ultimately affect the country's score. In the Progress in International Reading Literacy Study (PIRLS), incorrect answer substitution is used. This means that skipped and omitted items are treated as incorrect responses. In the present study, the effect of this approach is investigated. The data of 2006, 2011, and 2016 cycles of PIRLS were analyzed using IRTree models in which a sequential tree structure is estimated to model the full response process. Item difficulty, students' ability, and country means were estimated and compared with results from a Rasch model using the standard PIRLS approach to missing values. Results showed that the IRTree model was able to disentangle the students' ability and their propensity to skip items, reducing the correlation between ability and the proportion of skipped items in comparison to the Rasch model. Nevertheless, at the country level, the aggregated scores showed no important differences between models for the pooled sample, but some differences within countries across cycles. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
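Note on the two treatments of skipped items contrasted in the record above: the abstract does not specify the exact tree used, so this is a minimal sketch assuming a generic two-node tree (first "attempted or skipped", then "correct or incorrect"). The function names and the coding of a skipped response as `None` are assumptions for illustration.

```python
def substitute_incorrect(responses):
    """Incorrect answer substitution: a skipped item (None) is scored 0."""
    return [0 if r is None else r for r in responses]

def irtree_pseudo_items(responses):
    """Recode responses into two pseudo-items for an assumed two-node tree.

    Node 1 (attempted): 1 if the item was answered, 0 if it was skipped.
    Node 2 (correct):   1/0 if answered, missing (None) if skipped, so a
                        skip carries no information about ability.
    """
    attempted = [0 if r is None else 1 for r in responses]
    correct = list(responses)  # None stays missing at the second node
    return attempted, correct

# One student's responses to four items: 1 = correct, 0 = incorrect, None = skipped.
raw = [1, None, 0, None]
scored = substitute_incorrect(raw)           # [1, 0, 0, 0]
attempted, correct = irtree_pseudo_items(raw)
```

Under substitution the two skips lower the ability score directly; under the tree coding they inform only the skipping propensity modelled at the first node.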
18. Harmonizing the CBCL and SDQ ADHD scores by using linear equating, kernel equating, item response theory and machine learning methods.
- Author
- Jović, Miljan, Haeri, Maryam Amir, Whitehouse, Andrew, and van den Berg, Stéphanie M.
- Subjects
- ITEM response theory, MACHINE learning, MACHINE theory, CHILD Behavior Checklist, ATTENTION-deficit hyperactivity disorder, DATA harmonization, PRESCHOOL children
- Abstract
Introduction: A problem that applied researchers and practitioners often face is the fact that different institutions within research consortia use different scales to evaluate the same construct which makes comparison of the results and pooling challenging. In order to meaningfully pool and compare the scores, the scales should be harmonized. The aim of this paper is to use different test equating methods to harmonize the ADHD scores from Child Behavior Checklist (CBCL) and Strengths and Difficulties Questionnaire (SDQ) and to see which method leads to the result. Methods: Sample consists of 1551 parent reports of children aged 10-11.5 years from Raine study on both CBCL and SDQ (common persons design). We used linear equating, kernel equating, Item Response Theory (IRT), and the following machine learning methods: regression (linear and ordinal), random forest (regression and classification) and Support Vector Machine (regression and classification). Efficacy of the methods is operationalized in terms of the root-mean-square error (RMSE) of differences between predicted and observed scores in cross-validation. Results and discussion: Results showed that with single group design, it is the best to use the methods that use item level information and that treat the outcome as interval measurement level (regression approach). [ABSTRACT FROM AUTHOR]
- Published
- 2024
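Two ingredients named in the harmonization abstract above, linear equating and RMSE on held-out scores, can be sketched in a few lines. The score lists and the single train/hold-out split are hypothetical; the study itself uses full cross-validation and several other methods that are not shown here.

```python
import math
import statistics

def linear_equate(x, ref_x, ref_y):
    """Map score x from scale X onto scale Y by matching means and standard deviations."""
    mu_x, mu_y = statistics.mean(ref_x), statistics.mean(ref_y)
    sd_x, sd_y = statistics.pstdev(ref_x), statistics.pstdev(ref_y)
    return mu_y + (sd_y / sd_x) * (x - mu_x)

def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed scores."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical paired scores from the same children (common persons design).
train_x = [2, 4, 6, 8, 10]   # scale X, training children
train_y = [1, 2, 3, 4, 5]    # scale Y, same children
held_out_x = [3, 7]          # scale X, held-out children
held_out_y = [2, 3]          # scale Y observed for the held-out children

predicted = [linear_equate(x, train_x, train_y) for x in held_out_x]
print(rmse(predicted, held_out_y))
```

Evaluating on children not used to estimate the means and standard deviations is what keeps the RMSE from being optimistically low.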
19. Patient feasibility as a novel approach for integrating IRT and LCA statistical models into patient-centric qualitative data—a pilot study
- Author
-
Matthias Klüglich, Bert Santy, Mihail Tanev, Kristian Hristov, and Tsveta Mincheva
- Subjects
IRT ,latent traits ,LCA ,patient-centric data ,patient feasibility ,Medicine ,Public aspects of medicine ,RA1-1270 ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
Introduction: Clinical research increasingly recognizes the role and value of patient-centric data incorporation in trial design, aiming for more relevant, feasible, and engaging studies for participating patients. Despite recognition, research on analytical models regarding qualitative patient data analysis has been insufficient. Aim: This pilot study aims to explore and demonstrate the analytical framework of the “patient feasibility” concept, a novel approach for integrating patient-centric data into clinical trial design using psychometric latent class analysis (LCA) and item response theory (IRT) models. Methods: A qualitative survey was designed to capture the diverse experiences and attitudes of patients in an oncological indication. Results were subjected to content analysis and categorization as a preparatory phase of the study. The analytical phase further employed LCA and hybrid IRT models to discern distinct patient subgroups and characteristics related to patient feasibility. Results: LCA identified three latent classes, each with distinct characteristics pertaining to a latent trait defined as patient feasibility. Covariate analyses further highlighted subgroup behaviors. In addition, IRT analyses using the two-parameter logistic model, generalized partial credit model, and nominal response model highlighted further distinct characteristics of the studied group. The results provided insights into perceived treatment challenges, logistic challenges, and limiting factors regarding the standard of care therapy and clinical trial attitudes.
- Published
- 2024
20. Item Response Theory in Sample Reweighting to Build Fairer Classifiers
- Author
-
Minatel, Diego, dos Santos, Nícolas Roque, da Silva, Vinícius Ferreira, Cúri, Mariana, de Andrade Lopes, Alneu, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Lossio-Ventura, Juan Antonio, editor, Ceh-Varela, Eduardo, editor, Vargas-Solar, Genoveva, editor, Marcacini, Ricardo, editor, Tadonki, Claude, editor, Calvo, Hiram, editor, and Alatrista-Salas, Hugo, editor
- Published
- 2024
21. The Deconstruction of Measurement Invariance (and DIF)
- Author
-
Yousfi, Safir, Wiberg, Marie, Kim, Jee-Seon, Hwang, Heungsun, editor, Wu, Hao, editor, and Sweet, Tracy, editor
- Published
- 2024
22. Designing AI-Based Non-invasive Method for Automatic Detection of Bovine Mastitis
- Author
-
Lakshitha, S. L., Sajja, Priti Srinivas, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Patel, Kanubhai K., editor, Santosh, KC, editor, and Patel, Atul, editor
- Published
- 2024
23. Neither agree nor disagree: use and misuse of the neutral response category in Likert-type scales
- Author
-
Kankaraš, Miloš and Capecchi, Stefania
- Published
- 2024
24. XGBoost To Enhance Learner Performance Prediction
- Author
-
Soukaina Hakkal and Ayoub Ait Lahcen
- Subjects
Learner performance prediction ,Empirical comparison ,IRT ,PFA ,DAS3H ,XGBoost ,Electronic computers. Computer science ,QA75.5-76.95 - Abstract
The huge amount of data generated by an Intelligent Tutoring System becomes useful when analyzed appropriately to provide significant insights about learners, especially their performance. Performance data retrieved from historical interactions is the main engine for learner performance prediction, where the likelihood of the learner answering future questions correctly is calculated. Modeling learner performance can provide significant insights into individual students to promote successful learning and maximize educational achievement. This study aims to enhance the learner performance prediction of some logistic regression-based models, namely Item Response Theory, Performance Factor Analysis, and DAS3H, using XGBoost. It includes an empirical comparison on eight real-world datasets containing performance log data collected from different online intelligent tutoring systems, including, for the first time, a new dataset from Moodle Morocco. The results demonstrate that XGBoost enhanced PFA predictive performance on seven datasets with an AUC of up to 0.88 and improved the DAS3H AUC on the ASSISTment17 dataset, while conserving almost the same predictive results for Item Response Theory on some datasets.
- Published
- 2024
25. Optimizing Item Construction in Diagnostic Mathematics Test.
- Author
-
Hartono, Wahyu, Hadi, Samsul, and Rosnawati, Raden
- Subjects
- *
JUNIOR high school students , *ITEM response theory , *RATIONAL numbers , *EDUCATIONAL evaluation , *AKAIKE information criterion - Abstract
The diagnostic mathematics test is a critical tool for measuring students' abilities to understand and apply mathematical concepts, with the design of good test items being paramount to ensure validity. This study leverages Item Response Theory (IRT) models and Differential Item Functioning (DIF) methods to refine the construction of test items, specifically focusing on rational numbers. Engaging 929 junior high school students from three public schools in Cirebon, West Java, the research utilized R software to analyze the most suitable IRT models and investigate DIF methods. The findings underscore the efficacy of the three-parameter logistic (3PL) model based on Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), -2 loglikelihood, and Standardized Root Mean Square Residual (SRMSR) values, alongside item fit, highlighting that nearly all analyzed items were suitable except one that required replacement. Additionally, the identification of items with significant DIF effects points to potential biases, suggesting avenues for enhancing test fairness and reliability. The study's broader implications extend to improving diagnostic assessment practices, informing item design in educational evaluations, and guiding future research towards creating more equitable and precise measures of mathematical understanding. This contributes to a nuanced comprehension of student abilities, offering valuable insights for educators, assessment designers, and policymakers aimed at fostering improved learning outcomes in mathematics education. [ABSTRACT FROM AUTHOR]
- Published
- 2024
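The model comparison described in the diagnostic mathematics abstract above rests on two standard pieces: the 3PL response function and the AIC/BIC penalties on the log-likelihood. The sketch below shows both. The log-likelihood values and the 20-item test length are made up purely for illustration and are not results from the study; only the sample size of 929 comes from the abstract. The study itself used R, while this sketch uses Python.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model.

    theta: ability; a: discrimination; b: difficulty; c: lower asymptote (guessing).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# MADE-UP log-likelihoods for a hypothetical 20-item test (not the study's values).
# Each entry is (log-likelihood, number of item parameters).
candidates = {"1PL": (-10450.0, 20), "2PL": (-10320.0, 40), "3PL": (-10240.0, 60)}
n_obs = 929  # sample size reported in the abstract

for name, (ll, k) in candidates.items():
    print(name, round(aic(ll, k), 1), round(bic(ll, k, n_obs), 1))

best_by_aic = min(candidates, key=lambda m: aic(*candidates[m]))
best_by_bic = min(candidates, key=lambda m: bic(*candidates[m], n_obs))
print(best_by_aic, best_by_bic)
```

BIC penalizes each extra parameter by ln(n) rather than 2, so with several hundred examinees it favors the richer model less readily than AIC does.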
26. Global trends and performances of infrared imaging technology studies on acupuncture: a bibliometric analysis.
- Author
-
Yuanyuan Feng, Yunfan Xia, Binke Fan, Shimin Li, Zuyong Zhang, and Jianqiao Fang
- Subjects
INFRARED technology ,INFRARED imaging ,BIBLIOMETRICS ,ACUPUNCTURE ,NEAR infrared spectroscopy - Abstract
Objectives: To summarize development processes and research hotspots of infrared imaging technology research on acupuncture and to provide new insights for researchers in future studies. Methods: Publications regarding infrared imaging technology in acupuncture from 2008 to 2023 were downloaded from the Web of Science Core Collection (WoSCC). VOSviewer 1.6.19, CiteSpace 6.2.R4, Scimago Graphica, and Microsoft Excel software were used for bibliometric analyses. The main analyses include collaboration analyses between countries, institutions, authors, and journals, as well as analyses on keywords and references. Results: A total of 346 publications were retrieved from 2008 to 2023. The quantity of yearly publications increased steadily, with some fluctuations over the past 15 years. "Evidence-Based Complementary and Alternative Medicine" and "American Journal of Chinese Medicine" were the top-cited journals in frequency and centrality. China has the largest number of publications, with the Shanghai University of Traditional Chinese Medicine being the most prolific institution. Among authors, Litscher Gerhard from Austria (currently at the Swiss University of Traditional Chinese Medicine, Switzerland) was the most published and most cited author. The article published by Rojas RF was the most discussed among the cited references. Common keywords included "Acupuncture," "Near infrared spectroscopy," and "Temperature," among others. The hotspots and frontier trends in this field were exploring the relationship between acupoints and temperature through infrared thermography technology (IRT), evaluating pain objectively by functional near-infrared spectroscopy (fNIRS), and exploring the effect of acupuncture on functional connectivity between brain regions. Conclusion: This study is the first to use bibliometric methods to explore the hotspots and cutting-edge issues in the application of infrared imaging technology in the field of acupuncture.
It offers a fresh perspective on infrared imaging technology research on acupuncture and gives scholars useful data to determine the field's hotspots, present state of affairs, and frontier trends. [ABSTRACT FROM AUTHOR]
- Published
- 2024
27. Psychometric properties and item response theory analysis of the Persian version of the social pain questionnaire.
- Author
-
Sepehrinia, Mahya, Farahani, Hojjatollah, Watson, Peter, and Amini, Nasim
- Subjects
PSYCHOMETRICS ,CRONBACH'S alpha ,EXPLORATORY factor analysis ,CONFIRMATORY factor analysis ,TEST validity ,ITEM response theory - Abstract
Introduction: Social pain is an emotional reaction which is triggered by social exclusion and has been extensively investigated in the literature. The Social Pain Questionnaire (SPQ) is a self-report instrument which is the only scale for measuring social pain as a dispositional factor. The current study aimed at examining the psychometric properties of the SPQ in an Iranian sample. Materials and methods: A sample of participants (N = 400) was recruited in a cross-sectional validation study. Exploratory Factor Analysis (EFA) as well as Confirmatory Factor Analysis (CFA) were conducted. The Item Response Theory (IRT) model parameters were evaluated and item response category curves were presented. Convergent and divergent validities as well as the reliability (by using Cronbach's alpha coefficient) were also assessed. Results: The SPQ's unidimensionality was affirmed (RMSEA = 0.078; CFI = 0.915; TLI = 0.99) and its internal consistency was robust (Cronbach's α = 0.94). The correlation between the SPQ and the following measures endorsed its divergent and convergent validity: Self-esteem (r = -0.424), Perceived Social Support (r = -0.161), and Interpersonal Sensitivity (r = 0.636). Finally, Item Response Theory Analysis emphasized the effectiveness of the SPQ items in discerning various levels of social pain. The theta level ranged between -1 and +1.2 and the IRT-based marginal reliability was 0.92 for the total score. Discussion: The Persian SPQ stands as a reliable and valid measure for evaluating social pain. This scale has the potential to stimulate further research in the field for both clinical and non-clinical settings. Conclusion: By employing Item Response Theory (IRT) analysis, we have transcended the theoretical psychometric evaluation of the SPQ scale and demonstrated that SPQ is a unidimensional, valid and reliable measurement tool. [ABSTRACT FROM AUTHOR]
- Published
- 2024
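The internal-consistency figure reported for the SPQ above is Cronbach's alpha, which compares the sum of the item variances with the variance of the total score. A minimal sketch follows, using the standard formula and a small hypothetical response matrix (rows are respondents, columns are items); none of the numbers come from the study.

```python
import statistics

def cronbach_alpha(scores):
    """Cronbach's alpha for a respondents-by-items matrix of numeric scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) and of the total score (row sums).
    item_vars = [statistics.variance([row[j] for row in scores]) for j in range(k)]
    total_var = statistics.variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses from five people to three Likert-type items.
responses = [
    [3, 4, 3],
    [2, 2, 3],
    [4, 5, 4],
    [1, 2, 1],
    [3, 3, 4],
]
print(round(cronbach_alpha(responses), 3))
```

Alpha approaches 1 when items rise and fall together across respondents, so the total-score variance is much larger than the sum of the separate item variances.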
28. Infrared Thermography in Assessment of Facial Temperature of Racing Sighthound-Type Dogs in Different Environmental Conditions.
- Author
-
Budny-Walczak, Anna, Wilk, Martyna, and Kupczyński, Robert
- Subjects
- *
THERMOGRAPHY , *GREYHOUNDS , *DOG racing , *THERMAL imaging cameras , *HUNTING dogs , *GREYHOUND racing , *WORKING dogs - Abstract
Simple Summary: Greyhound welfare science, with a particular emphasis on the heat stress associated with greyhound racing, which often takes place during the summer, is an area of growing interest. Hyperthermia in sporting and working dogs is one of the greatest threats to their health due to physical exercise, often in suboptimal environmental conditions. Monitoring the health of these dogs must be a priority, and for this purpose, it is worth using non-invasive methods that reduce additional stress. One such tool is infrared thermography (IRT). The aim of the study was to assess the usefulness of IRT measurements of selected regions of interest (ROI), i.e., the eyeball and the nose of whippet dogs, before and after coursing competitions taking place in various environmental conditions, thereby enabling the assessment of well-being and the level of heat stress. The research was carried out over two different periods with different thermal humidity indexes (THIs). In the first period, the THI was 59.27 (Run 1), while in the second period, the THI was 63.77 (Run 2). The experimental subjects comprised 111 sighthound-type dogs—whippets—that were photographed with a thermal imaging camera to determine their eye temperature (ET) and nose temperature (NT). The average minimum and maximum eye temperatures were statistically lower after running in both measurements. Increased minimum and maximum nose temperatures were also demonstrated after both runs. The nasal temperature values were statistically higher for Run 2, for which the THI was higher, compared to Run 1. Eyeball temperature may be a marker of thermoregulation ability, regardless of the ambient temperature. The value of ETmax decreased on average by 2.23 °C and 0.4 °C, while NTmax increased uniformly by 2 °C after both runs. A correlation was found between the IRT measurements and physiological indicators. [ABSTRACT FROM AUTHOR]
- Published
- 2024
29. Development of standard computerised adaptive test (CAT) settings for the EORTC CAT Core.
- Author
-
Petersen, Morten Aa., Vachon, Hugo, Giesinger, Johannes M., and Groenvold, Mogens
- Subjects
- *
ADAPTIVE testing , *PATIENT reported outcome measures , *CATS , *PRECISION farming , *OPTIMAL stopping (Mathematical statistics) - Abstract
Aims: Computerised adaptive test (CAT) provides individualised patient reported outcome measurement while retaining direct comparability of scores across patients and studies. Optimal CAT measurement requires an appropriate CAT-setting, the set of criteria defining the CAT including start item, item selection criterion, and stop criterion. The European Organisation for Research and Treatment of Cancer (EORTC) CAT Core allows for assessing the 14 functional and symptom domains covered by the EORTC QLQ-C30 questionnaire. The aim was to present a general approach for selecting CAT-settings and to use this to develop a portfolio of standard settings for the EORTC CAT Core optimised for different purposes and populations. Methods: Using simulations, the measurement properties of CATs of different length and precision were evaluated and compared allowing for identifying the most suitable settings. All CATs were initiated with the most informative QLQ-C30 item. For each domain two fixed-length and two fixed-precision standard CATs were selected focusing on efficiency (brief version) and precision (long), respectively. Results: The brief fixed-length CATs included 3–5 items each while the long versions included 5–8 items. The fixed-precision CATs aimed for reliability of 0.65–0.95 (brief versions) and 0.85–0.98 (long versions), respectively. Median sample size savings using the CATs compared to the QLQ-C30 scales ranged 20%-31%, although savings varied considerably across the domains. Conclusion: The EORTC CAT Core standard settings simplify selection of relevant and appropriate CATs. The CATs prioritise either brevity and efficiency or precision, but all provide increased measurement precision and hence, reduced sample size requirements compared to the QLQ-C30 scales. The CATs may be used as they are or modified to accommodate specific requirements. [ABSTRACT FROM AUTHOR]
- Published
- 2024
30. A Comparison of Response Time Threshold Scoring Procedures in Mitigating Bias From Rapid Guessing Behavior.
- Author
-
Rios, Joseph A. and Deng, Jiayi
- Subjects
- *
BEHAVIORAL assessment , *INTELLECT , *DESCRIPTIVE statistics , *ANALYSIS of variance , *REACTION time , *DATA analysis software , *DISCRIMINATION (Sociology) - Abstract
Rapid guessing (RG) is a form of non-effortful responding that is characterized by short response latencies. This construct-irrelevant behavior has been shown in previous research to bias inferences concerning measurement properties and scores. To mitigate these deleterious effects, a number of response time threshold scoring procedures have been proposed, which recode RG responses (e.g., treat them as incorrect or missing, or impute probable values) and then estimate parameters for the recoded dataset using a unidimensional or multidimensional IRT model. To date, there have been limited attempts to compare these methods under the possibility that RG may be misclassified in practice. To address this shortcoming, the present simulation study compared item and ability parameter recovery for four scoring procedures by manipulating sample size, the linear relationship between RG propensity and ability, the percentage of RG responses, and the type and rate of RG misclassifications. Results demonstrated two general trends. First, across all conditions, treating RG responses as incorrect produced the largest degree of combined systematic and random error (larger than ignoring RG). Second, the remaining scoring approaches generally provided equal accuracy in parameter recovery when RG was perfectly identified; however, the multidimensional IRT approach was susceptible to increased error as misclassification rates grew. Overall, the findings suggest that recoding RG as missing and employing a unidimensional IRT model is a promising approach. [ABSTRACT FROM AUTHOR]
- Published
- 2024
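The recoding step that the rapid-guessing abstract above compares can be sketched as follows: any response faster than a time threshold is flagged as a rapid guess and then either set to missing or scored as incorrect before an IRT model is fitted. The threshold, the data, and the function name are hypothetical; how the threshold is chosen, and the IRT estimation itself, are outside this sketch.

```python
def recode_rapid_guesses(responses, response_times, threshold, treat_as="missing"):
    """Recode responses whose response time falls below the threshold.

    treat_as="missing"   -> flagged responses become None
    treat_as="incorrect" -> flagged responses become 0
    """
    if treat_as not in ("missing", "incorrect"):
        raise ValueError("treat_as must be 'missing' or 'incorrect'")
    fill = None if treat_as == "missing" else 0
    return [fill if rt < threshold else r
            for r, rt in zip(responses, response_times)]

# Hypothetical examinee: scored responses and response times in seconds.
responses = [1, 0, 1, 1, 0]
times = [0.8, 14.2, 2.1, 30.5, 1.4]

print(recode_rapid_guesses(responses, times, threshold=3.0))
print(recode_rapid_guesses(responses, times, threshold=3.0, treat_as="incorrect"))
```

Recoding as missing leaves the flagged items out of the likelihood, whereas recoding as incorrect adds wrong answers that the examinee may not have earned, which is consistent with the abstract's finding that the latter produced the most error.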
31. Classification of Parkinson's Disease Using Machine Learning with MoCA Response Dynamics.
- Author
-
Chudzik, Artur and Przybyszewski, Andrzej W.
- Subjects
PARKINSON'S disease ,MACHINE learning ,ROUGH sets ,MONTREAL Cognitive Assessment ,ALZHEIMER'S disease ,REACTION time ,DEEP brain stimulation - Abstract
Neurodegenerative diseases (NDs), including Parkinson's and Alzheimer's disease, pose a significant challenge to global health, and early detection tools are crucial for effective intervention. The adaptation of online screening forms and machine learning methods can lead to better and wider diagnosis, potentially altering the progression of NDs. Therefore, this study examines the diagnostic efficiency of machine learning models using Montreal Cognitive Assessment test results (MoCA) to classify scores of people with Parkinson's disease (PD) and healthy subjects. For data analysis, we implemented both rule-based modeling using rough set theory (RST) and classic machine learning (ML) techniques such as logistic regression, support vector machines, and random forests. Importantly, the diagnostic accuracy of the best performing model (RST) increased from 80.0% to 93.4% and diagnostic specificity increased from 57.2% to 93.4% when the MoCA score was combined with temporal metrics such as IRT—instrumental reaction time and TTS—submission time. This highlights that online platforms are able to detect subtle signs of bradykinesia (a hallmark symptom of Parkinson's disease) and use this as a biomarker to provide more precise and specific diagnosis. Despite the constrained number of participants (15 Parkinson's disease patients and 16 healthy controls), the results suggest that incorporating time-based metrics into cognitive screening algorithms may significantly improve their diagnostic capabilities. Therefore, these findings recommend the inclusion of temporal dynamics in MoCA assessments, which may potentially improve the early detection of NDs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
32. Item response analysis with partial credit model, psychometric properties and measurement invariance of the multilanguage versions of intergenerational relationship quality scale in Iran.
- Author
-
Asadollahi, Abdolrahim, Pirzadeh, Nasim, and Abyad, Abdulrazzak
- Subjects
RELATIONSHIP quality ,PSYCHOMETRICS ,CREDIT analysis ,OLDER people ,INTERGENERATIONAL relations ,ADULTS - Abstract
This research aims to assess multilanguage versions of the IRQS-2018 for determining intergenerational relationship quality in older persons. In a psychometric investigation, the instrument was completed by 707 persons older than 60 from 6 populous ethnic groups in Iran. The Rasch partial credit model (PCM) and the classical method were employed. The PCM showed that items 3 and 7 were misfitting in all versions of the IRQS. Furthermore, consecutive response categories for all items were located in the expected order, and the 11-item version of the IRQS had greater internal consistency. Although the Rasch analysis supported the suitability of the 11-item IRQS, it should be assessed in additional investigations and in different settings, such as among older adults living in public residences. [ABSTRACT FROM AUTHOR]
- Published
- 2024
33. Infrared Thermography of Teat in French Dairy Alpine Goats: A Promising Tool to Study Animal–Machine Interaction during Milking but Not to Detect Mastitis.
- Author
-
Marnet, Pierre-Guy, Velasquez, Alejandro B., and Dzidic, Alen
- Subjects
- *
GOATS , *MASTITIS , *ANIMAL welfare , *THERMOGRAPHY , *MILKING machines , *GOAT milk - Abstract
Simple Summary: Infrared thermography (IRT) is a non-invasive technology that is of interest both for diagnosing mastitis and describing possible interactions between milking machine liners and teat tissue in cows. Very little was known about these applications in dairy goats, although this species is increasingly affected by mastitis, which is not only of infectious origin. This study aims to fill this gap by investigating the thermal responses of teats to the milking machine in goats with different levels of udder inflammation. IRT fails to detect mastitis early in goats and cannot be used for prophylactic purposes in goats. IRT measurements were influenced by milking, and the results differed between unbalanced glands and different teat shapes, indicating differences in the interaction of the machine with the teat tissue. The IRT, therefore, appears to be a good instrument for measuring the effects of the milking machine. In the future, it could help to better adapt the machine equipment and settings to the animals and improve the efficiency and well-being of the animals. There is a need to develop tools for mastitis management in goats and to measure the effects of milking machines on teats. Infrared thermography (IRT), as shown in cows, was a good candidate for early mastitis detection and focusing on milking equipment and settings implicated in potential problems. The aim of this study was to test IRT to detect udder inflammation and the effects of mechanical milking on teats in relation to inflammation status, udder balance, and teat shape in Alpine goats. IRT spectra were compared before and after milking in 551 goats from three commercial herds compared to their individual SCC (somatic cell count). We found no regression or trend between logSCC and IRT measurement or response to milking, even in highly inflamed goat udders. 
The effect of milking was significant (p < 0.05) with global temperature reduction after milking, but differences were seen between teat parts and unbalanced half udders. The highest reduction in skin temperature was observed at the teat orifice (−1.06 ± 0.05) and the lowest at the teat barrel (−0.37 ± 0.05). The teats with long barrels showed more IRT reactions, which clearly indicates poor adaptation to the liners used. In conclusion, the IRT was not able to detect mastitis, but it is a good tool to diagnose the effects of the milking machine in order to adapt milking equipment and settings to the goats and improve their welfare. [ABSTRACT FROM AUTHOR]
- Published
- 2024
34. Prediction of cognitive impairment using higher order item response theory and machine learning models.
- Author
-
Lihua Yao, Yusuke Shono, Nowinski, Cindy, Dworak, Elizabeth M., Kaat, Aaron, Chen, Shirley, Lovett, Rebecca, Ho, Emily, Curtis, Laura, Wolf, Michael, Gershon, Richard, and Benavente, Julia Yoshino
- Subjects
MACHINE learning ,ITEM response theory ,MACHINE theory ,COGNITION disorders ,COGNITIVE flexibility ,EPISODIC memory - Abstract
Timely detection of cognitive impairment (CI) is critical for the wellbeing of elderly individuals. The MyCog assessment employs two validated iPad-based measures from the NIH Toolbox® for Assessment of Neurological and Behavioral Function (NIH Toolbox). These measures assess pivotal cognitive domains: Picture Sequence Memory (PSM) for episodic memory and Dimensional Change Card Sort Test (DCCS) for cognitive flexibility. The study involved 86 patients and explored diverse machine learning models to enhance CI prediction. This encompassed traditional classifiers and neural-network-based methods. After 100 bootstrap replications, the Random Forest model stood out, delivering compelling results: precision at 0.803, recall at 0.758, accuracy at 0.902, F1 at 0.742, and specificity at 0.951. Notably, the model incorporated a composite score derived from a 2-parameter higher order item response theory (HOIRT) model that integrated DCCS and PSM assessments. The study's pivotal finding underscores the inadequacy of relying solely on a fixed composite score cutoff point. Instead, it advocates for machine learning models that incorporate HOIRT-derived scores and encompass relevant features such as age. Such an approach promises more effective predictive models for CI, thus advancing early detection and intervention among the elderly. [ABSTRACT FROM AUTHOR]
- Published
- 2024
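The five performance figures quoted in the cognitive impairment abstract above are all functions of a confusion matrix. The sketch below computes them from hypothetical counts for a single test set; the counts are invented and do not reproduce the study's values, which were averaged over 100 bootstrap replications.

```python
def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, specificity, accuracy, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # also called sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }

# Hypothetical counts: 13 impaired and 87 unimpaired people in a test set.
metrics = classification_metrics(tp=8, fp=2, tn=85, fn=5)
for name, value in metrics.items():
    print(name, round(value, 3))
```

When impaired cases are a small share of the sample, accuracy and specificity can both be high while recall stays modest, so the metrics are best read together.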
35. Numeric rating scale for pain should be used in an ordinal but not interval manner. A retrospective analysis of 346,892 patient reports of the quality improvement in postoperative pain treatment registry.
- Author
-
Stijic, Marko, Messerer, Brigitte, Meißner, Winfried, and Avian, Alexander
- Subjects
- *
POSTOPERATIVE pain treatment , *ITEM response theory , *POSTOPERATIVE pain , *AGE groups , *RETROSPECTIVE studies - Abstract
To assess postoperative pain intensity in adults, the numeric rating scale (NRS) is used. This scale has shown acceptable psychometric features, although its scale properties need further examination. We aimed to evaluate scale properties of the NRS using an item response theory (IRT) approach. Data from an international postoperative pain registry (QUIPS) was analyzed retrospectively. Overall, 346,892 adult patients (age groups: 18-20 years: 1.6%, 21-30 years: 6.7%, 31-40 years: 8.3%, 41-50 years: 13.2%, 51-60 years: 17.1%, 61-70 years: 17.3%, 71-80 years: 16.4%, 81-90 years: 3.9%, >90: 0.2%) were included. Among the patients, 55.7% were female and 38% had preoperative pain. Three pain items (movement pain, worst pain, least pain) were analyzed using 4 different IRT models: partial credit model (PCM), generalized partial credit model (GPCM), rating scale model (RSM), and graded response model (GRM). Fit indices were compared to decide the best fitting model (lower fit indices indicate a better model fit). Subgroup analyses were done for sex and age groups. After collapsing the highest and the second highest response category, the GRM outperformed other models (lowest Bayesian information criterion) in all subgroups. Overlapping categories were found in category boundary curves for worst and minimum pain and particularly for higher pain ratings. Response category widths differed depending on pain intensity. For female, male, and age groups, similar results were obtained. Response categories on the NRS are ordered but have different widths. The interval scale properties of the NRS should be questioned. In dealing with missing linearity in pain intensity ratings using the NRS, IRT methods may be helpful. [ABSTRACT FROM AUTHOR]
- Published
- 2024
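The graded response model favored in the numeric rating scale abstract above turns a latent pain level into probabilities for each ordered response category. A minimal sketch follows; the discrimination and threshold values are hypothetical and chosen only to show unevenly spaced category boundaries, not taken from the registry analysis.

```python
import math

def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model.

    theta: latent trait value; a: item discrimination (> 0);
    thresholds: increasing category boundaries b_1 < ... < b_{K-1}.
    Returns K probabilities that sum to 1.
    """
    # Cumulative probabilities P(X >= k), padded with 1 (k = 0) and 0 (k = K).
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative values.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]

# Hypothetical item with five categories and unevenly spaced thresholds.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-2.0, -0.5, 0.0, 2.5])
print([round(p, 3) for p in probs])
```

The distance between adjacent thresholds is the width of a response category on the latent scale. Unequal distances mean that a one-point step on the rating scale does not correspond to a constant change in the trait, which is the abstract's argument against treating the NRS as interval-level.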
36. Harmonizing Assessments of Everyday Racial Discrimination Experiences: The Multigroup Everyday Racial Discrimination Scale (MERDS).
- Author
-
Lui, P. Priscilla and Kamata, Akihito
- Subjects
- *
RACISM , *EXPERIMENTAL design , *COLLEGE students , *MINORITIES , *RESEARCH methodology , *RESEARCH methodology evaluation , *ATTITUDE (Psychology) , *RACE , *GROUP identity , *EXPERIENCE , *PSYCHOMETRICS , *MULTITRAIT multimethod techniques , *BEHAVIOR disorders , *PSYCHOSOCIAL factors , *FACTOR analysis , *PATHOLOGICAL psychology , *ETHNIC groups , *HEALTH equity , *MICROAGGRESSIONS , *MEASUREMENT errors - Abstract
Reliable and valid assessment of direct racial discrimination experiences in everyday life is critical to understanding one key determinant of ethnoracial minority health and health disparities. To address psychometric limitations of existing instruments and to harmonize the assessment of everyday racial discrimination, the new Multigroup Everyday Racial Discrimination Scale (MERDS) was developed and validated. This investigation included 1,355 college and graduate students of color (M age = 21.54, 56.0% women). Factor analyses were performed to provide evidence for structural validity of everyday racial discrimination scores. Item response theory modeling was used to investigate item difficulty relative to the level of everyday racial discrimination, and measurement error conditioned on the construct. MERDS scores were reliable, supported construct unidimensionality, and distinguished individuals who reported low to very high frequency of everyday racial discrimination. Results on the associations with racial identity and psychopathology symptoms, and utility of the scale are discussed. [ABSTRACT FROM AUTHOR]
- Published
- 2024
37. Psychometric properties of the Cultural Intelligence Scale based on item response theory.
- Author
-
Darandari, Eqbal and Khayat, Shatha
- Subjects
- *
ITEM response theory , *CULTURAL intelligence , *PSYCHOMETRICS , *CLASSICAL test theory , *CULTURAL property - Abstract
The aim of this study was to investigate the psychometric properties of Cultural Intelligence Scale (CQS), based on item response theory (IRT) using the graded response model (GRM). The study calibration sample included 400, while the study sample included 1000, male and female Saudi participants, aged between 18 and 62 years. IRT‐GRM results supported the quality of the psychometric properties of CQS, and its appropriateness to measure cultural intelligence (CQ) for the majority of individuals. CQS well‐distinguished people at different ability levels along the CQ latent trait, particularly with middle and low abilities. However, CQS full scale and subscales had less accurate measurement precision at high levels of CQ, and some subscales had more precision at low level abilities. CQS items had medium ability to differentiate among subjects, and they provided more information in evaluating individuals with medium CQ. Therefore, CQS might be more suitable for identification and development purposes, where low to med‐levels of CQ are expected. Additional assessment procedures need to be added, for selection or promotion purposes to increase the measurement precision. Confirmatory factor analysis results confirmed the multidimensional construct of CQS with four specific‐related factors at the first level, and an aggregate factor at the second level. This model provided better model fit using IRT‐GRM approach, and it was supported by classical test theory analysis results. Therefore, it is important to rely on subscale scores, besides the total score to interpret CQ for individuals. The study stressed the importance of examining CQS item parameters and information based on the country it is adapted for, to investigate how they interact with country culture; and to take into account ability level, when selecting optimal measures. 
Practitioner points: The results of confirmatory factor analysis (CFA) and item response theory (IRT) graded response model (GRM) supported the multidimensional construct of Cultural Intelligence Scale (CQS) with four specific‐related factors at the first level, and an aggregate factor at the second level, that was proposed by the CQS theory, compared to other models. IRT‐GRM analysis results in this study indicated that CQS has good psychometric properties and indicated that it appears to be a valid and moderately reliable instrument in detecting cultural intelligence (CQ). These results were supported by CTT analysis results. IRT‐GRM analysis results showed that CQS well‐distinguished people at different ability levels along the CQ latent trait, particularly with middle and low abilities. However, CQS full scale and subscales had less accurate measurement precision at high levels of CQ, and some subscales had more precision at low level abilities. The study suggested that it is important to examine the CQS model fit, item parameters, and information functions for the full scale and subscales based on the country it is adapted for, before considering it. It is important also to rely on the subscale scores to interpret CQ for individuals, and to identify their strengths, rather than relying on the total score alone. The study results suggested that CQS better suits CQ identification and development purposes. For selection or promotion purposes, it is suggested to add additional assessment procedures to increase the measurement precision. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. Genotype-Environment Interaction in ADHD: Genetic Predisposition Determines the Extent to Which Environmental Influences Explain Variability in the Symptom Dimensions Hyperactivity and Inattention.
- Author
-
Schwabe, Inga, Jović, Miljan, Rimfeld, Kaili, Allegrini, Andrea G., and van den Berg, Stéphanie M.
- Subjects
- *
GENOTYPE-environment interaction , *ATTENTION-deficit hyperactivity disorder , *LATENT class analysis (Statistics) , *ITEM response theory , *HYPERACTIVITY , *SYMPTOMS , *GENETIC correlations , *HERITABILITY - Abstract
Although earlier research has shown that individual differences on the spectrum of attention deficit hyperactivity disorder (ADHD) are highly heritable, emerging evidence suggests that symptoms are associated with complex interactions between genes and environmental influences. This study investigated whether a genetic predisposition [Note that the term 'genetic predisposition' was used in this manuscript to refer to an estimate based on twin modeling (an individual's score on the latent trait that resembles additive genetic influences) in the particular population being examined.] for the symptom dimensions hyperactivity and inattention determines the extent to which unique-environmental influences explain variability in these symptoms. To this purpose, we analysed a sample drawn from the Twins Early Development Study (TEDS) that consisted of item-level scores of 2168 16-year-old twin pairs who completed both the Strengths and Difficulties Questionnaire (SDQ; Goodman, in J Child Psychol Psychiatry 38:581–586, 1997) and the Strength and Weaknesses of ADHD Symptoms and Normal Behavior (SWAN; Swanson, in Paper presented at the meeting of the American Psychological Association, Los Angeles, 1981) questionnaire. To maximize the psychometric information to measure ADHD symptoms, psychometric analyses were performed to investigate whether the items from the two questionnaires could be combined to form two longer subscales. In the estimation of genotype-environment interaction, we corrected for error variance heterogeneity in the measurement of ADHD symptoms through the application of item response theory (IRT) measurement models. A positive interaction was found for both hyperactivity (e.g., β 1 = 2.20 with 95% highest posterior density interval equal to [1.79;2.65] and effect size equal to 3.00) and inattention (e.g., β 1 = 2.16 with 95% highest posterior density interval equal to [1.56;2.79] and effect size equal to 3.07). 
These results indicate that unique-environmental influences were more important in creating individual differences in both hyperactivity and inattention for twins with a genetic predisposition for these symptoms than for twins without such a predisposition. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Harmonizing the CBCL and SDQ ADHD scores by using linear equating, kernel equating, item response theory and machine learning methods
- Author
-
Miljan Jović, Maryam Amir Haeri, Andrew Whitehouse, and Stéphanie M. van den Berg
- Subjects
data harmonization ,test equating ,machine learning ,IRT ,linear equating ,kernel equating ,Psychology ,BF1-990 - Abstract
Introduction: A problem that applied researchers and practitioners often face is the fact that different institutions within research consortia use different scales to evaluate the same construct, which makes comparison of the results and pooling challenging. In order to meaningfully pool and compare the scores, the scales should be harmonized. The aim of this paper is to use different test equating methods to harmonize the ADHD scores from the Child Behavior Checklist (CBCL) and the Strengths and Difficulties Questionnaire (SDQ) and to see which method leads to the best result. Methods: The sample consists of 1551 parent reports of children aged 10-11.5 years from the Raine study on both CBCL and SDQ (common persons design). We used linear equating, kernel equating, Item Response Theory (IRT), and the following machine learning methods: regression (linear and ordinal), random forest (regression and classification) and Support Vector Machine (regression and classification). Efficacy of the methods is operationalized in terms of the root-mean-square error (RMSE) of differences between predicted and observed scores in cross-validation. Results and discussion: Results showed that with a single group design, it is best to use the methods that use item level information and that treat the outcome as interval measurement level (regression approach).
- Published
- 2024
- Full Text
- View/download PDF
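The abstract of entry 39 names linear equating and an RMSE criterion without showing either. The sketch below is only an illustration of those two ideas, not the authors' code: the function names and the paired scores are hypothetical, and for brevity the error is computed on the same data used to fit the conversion, whereas the study evaluated it in cross-validation.

```python
import math
import statistics


def linear_equate(x_scores, y_scores):
    """Return a function mapping scale-X scores onto scale Y.

    Linear equating matches the mean and standard deviation of the two
    score distributions: y = mu_y + (sd_y / sd_x) * (x - mu_x).
    """
    mu_x, mu_y = statistics.mean(x_scores), statistics.mean(y_scores)
    sd_x, sd_y = statistics.pstdev(x_scores), statistics.pstdev(y_scores)
    slope = sd_y / sd_x
    return lambda x: mu_y + slope * (x - mu_x)


def rmse(predicted, observed):
    """Root-mean-square error between predicted and observed scores."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical paired scores from the same respondents (common persons design).
scale_x = [2, 4, 6, 8, 10]
scale_y = [1, 2, 3, 4, 5]

to_scale_y = linear_equate(scale_x, scale_y)
predicted = [to_scale_y(x) for x in scale_x]
print(rmse(predicted, scale_y))
```

In a real comparison the conversion would be fitted on training folds and the RMSE computed on held-out respondents, so that methods with more parameters are not favoured by overfitting.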
40. A novel application of deep learning approach over IRT images for the automated detection of rising damp on historical masonries
- Author
-
Emmanouil Alexakis, Ekaterini T. Delegou, Philip Mavrepis, Antonis Rifios, Dimosthenis Kyriazis, and Antonia Moropoulou
- Subjects
Artificial intelligence ,Computer vision ,Infra-red thermography ,IRT ,Rising damp ,Moisture ,Materials of engineering and construction. Mechanics of materials ,TA401-492 - Abstract
Nowadays, the fusion of Artificial Intelligence (AI) comprises a widespread approach for resolving various types of problems in many scientific domains including Protection of Monuments. Non-Destructive Testing (NDT) approaches, and Infra-Red Thermography (IRT) specifically, play a key role for the diagnosis and the assessment of the monuments’ preservation state. Additionally, IRT comprises a powerful tool for continuous monitoring especially when it concerns the physical and/or chemical processes that take place within or on the material and affect the irradiation of the historical surfaces. This study explores the application of Deep Learning (DL) to IRT images of passive approach, focusing on the automated detection of rising damp in historical masonries. The IRT data were acquired from two monuments, the Holy Aedicule of the Holy Sepulchre and the Historical Building ''Msma’a''. Exploiting the capabilities of AI for enhancing the non-intrusive nature of passive IRT, this research seeks to provide a cost-effective and non-destructive approach for the early identification of rising damp, contributing significantly to the long-term preservation, conservation, and protection of the cultural heritage. To achieve this, the study takes advantage of a combination of the PSPNet image segmentation model with the ResNet-50 backbone, the PSP_R50 model. The mmsegment framework, renowned for its versatility and effectiveness, serves as the ideal platform for training, evaluating, and fine-tuning the proposed segmentation model. Despite having a relatively small dataset, a highly effective segmentation model (0.93 accuracy, 0.89 IoU) has been successfully developed.
- Published
- 2024
- Full Text
- View/download PDF
41. Development of E-Assessment Instruments for Assessing Metacognition Skills of Students in The Research Methodology Course
- Author
-
Nurnaningsih Nurnaningsih and Amrin Amrin
- Subjects
metacognition ,instruments ,irt ,polytomus ,History (General) ,D1-2009 - Abstract
This research is focused on the creation of a valid and dependable tool designed for the assessment of students' metacognitive abilities within the context of the Research Methodology course. The research methodology employed for this research follows the Research and Development (R&D) framework outlined by Mardapi, which encompasses ten distinct phases. The participants involved in this research comprised four lecturers responsible for instructing research methodology courses and a total of 61 students who were enrolled in these courses. The data collected for this research consisted of quantitative data acquired through expert validation questionnaires and trial instruments. Data analysis was conducted employing quantitative techniques, specifically utilizing Microsoft Excel and employing the polytomous Item Response Theory (IRT) data analysis approach within the R programming environment. The research outcomes indicated that the instrument employed to evaluate students' metacognitive skills in the research methodology course achieved a valid status as per expert evaluations, meeting the criteria for goodness. It was found to be valid in terms of the response distribution across all 24 items, and collectively, the instrument items were deemed capable of offering insights into the state of the test participants (respondents) by more than 85%. Moreover, the instrument demonstrated a very high level of reliability of 0.94.
- Published
- 2024
- Full Text
- View/download PDF
42. Asymptotically Correct Person Fit z-Statistics For the Rasch Testlet Model
- Author
-
Lin, Zhongtian, Jiang, Tao, Rijmen, Frank, and Van Wamelen, Paul
- Published
- 2024
- Full Text
- View/download PDF
43. A scoping review of Rasch analysis and item response theory in otolaryngology: Implications and future possibilities.
- Author
-
Liu, David T., Mueller, Christian A., and Sedaghat, Ahmad R.
- Subjects
- *
ITEM response theory , *MEDLINE , *OTOLARYNGOLOGY , *SCIENTIFIC literature , *EAR , *PLASTIC surgery - Abstract
Objective: Item response theory (IRT) is a methodological approach to studying the psychometric performance of outcome measures. This study aims to determine and summarize the use of IRT in otolaryngological scientific literature. Methods: A systematic search of the Medline, Embase, and the Cochrane Library databases was performed for original English‐language published studies indexed up to January 28, 2023, per the following search strategy: ("item response theory" OR "irt" OR "rasch" OR "latent trait theory" OR "modern mental test theory") AND ("ent" OR "otorhinolaryngology" OR "ear" OR "nose" OR "throat" OR "otology" OR "audiology" OR "rhinology" OR "laryngology" OR "neurotology" OR "facial plastic surgery"). Results: Fifty‐five studies were included in this review. IRT was used across all subspecialties in otolaryngology, and most studies utilizing IRT methodology were published within the last decade. Most studies analyzed polytomous response data, and the most commonly used IRT models were the partial credit and the rating scale model. There was considerable heterogeneity in reporting the main assumptions and results of IRT. Conclusion: IRT is increasingly being used in the otolaryngological scientific literature. In the otolaryngology literature, IRT is most frequently used in the study of patient‐reported outcome measures and many different IRT‐based methods have been used. Future IRT‐based outcome studies, using standardized reporting guidelines, might improve otolaryngology‐outcome research sustainably by improving response rates and reducing patient response burden. Level of evidence: 2. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Item response theory to discriminate COVID-19 knowledge and attitudes among university students.
- Author
-
Wesonga, Ronald, Islam, M. Mazharul, Hasani, Iman Al, and Manei, Afra Al
- Subjects
STUDENT attitudes ,COLLEGE student attitudes ,ITEM response theory ,COVID-19 ,COVID-19 pandemic ,LATENT variables ,MAXIMUM likelihood statistics - Abstract
This article discusses a study conducted at Sultan Qaboos University in Oman that aimed to assess the knowledge and attitudes of university students towards COVID-19. The study used item response theory (IRT) models, specifically the Rasch and 2PL models, to measure the difficulty and discrimination of knowledge and attitude items related to the virus. The results indicated that the 2PL model was more effective in assessing COVID-19 knowledge and attitudes, and that attitude items received more reliable responses than knowledge items. The study highlights the importance of carefully designing knowledge items to ensure accurate and evolving knowledge during a pandemic. Further research is recommended to explore why it is easier to measure attitudes than knowledge in most studies. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
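Entry 44 compares the Rasch and 2PL models in terms of item difficulty and discrimination. As a hedged reminder of what those parameters mean, the sketch below evaluates the standard logistic item response function; the function name and the parameter values are illustrative assumptions and are not taken from the study.

```python
import math


def p_correct(theta, b, a=1.0):
    """Probability of endorsing an item under the 2PL model.

    theta: latent trait level of the respondent
    b:     item difficulty (trait level where the probability is 0.5)
    a:     item discrimination; fixing a = 1 for every item gives the
           Rasch model in this parameterization
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item of difficulty 0, answered by a respondent at theta = 1.
rasch_p = p_correct(theta=1.0, b=0.0)          # Rasch: a fixed at 1
twopl_p = p_correct(theta=1.0, b=0.0, a=2.0)   # 2PL: steeper curve
print(rasch_p, twopl_p)
```

A larger discrimination makes the curve steeper around the difficulty, so the item separates respondents just below and just above that trait level more sharply. Letting the discrimination vary by item is the only difference between the two models here.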
45. Optimization of performance of Dutch newborn screening for cystic fibrosis.
- Author
-
Bouva, MJ, Dankert-Roelse, JE, van der Ploeg, CPB, Verschoof-Puite, RK, Zomer-van Ommen, DD, Gille, JJP, Jakobs, BS, Heijnen, MLA, and de Winter-de Groot, KM
- Subjects
- *
NEWBORN screening , *CYSTIC fibrosis , *DNA analysis , *METABOLIC syndrome - Abstract
• Until 2016, Dutch NBS for CF, as implemented in 2011, showed a sensitivity of 90%.
• In 2016, NBS for CF algorithm was modified, resulting in a sensitivity of 95%.
• Sensitivity improved by changing PAP cut-off values.
• Costs per screened child and PPV did not change substantially.
• With every twelve CF patients Dutch NBS refers one CRMS/CFSPID and three carriers.
Dutch newborn screening (NBS) for Cystic Fibrosis (CF) introduced in 2011 showed a sensitivity of 90% and a positive predictive value (PPV) of 63%. We describe a study including an optimization phase and evaluation of the modified protocol. Dutch protocol consists of four steps: determination of immunoreactive trypsinogen (IRT) and pancreatitis-associated protein (PAP), DNA analysis by INNO-LiPA and extended gene analysis (EGA). For the optimization phase we used results of 556,952 newborns screened between April 2011 and June 2014 to calculate effects of 13 alternative protocols on sensitivity, specificity, PPV, ratios of CF to other diagnoses, and costs. One alternative protocol was selected based on calculated sensitivity, PPV and costs and was implemented on 1st July 2016. In this modified protocol DNA analysis is performed in samples with a combination of IRT ≥60 µg/l and PAP ≥3.0 µg/l, IRT ≥100 µg/l and PAP ≥1.2 µg/l or IRT ≥124 µg/l and PAP not relevant. Results of 599,137 newborns screened between 1st July 2016 and 31st December 2019 were similarly evaluated as in the optimization phase. The modified protocol showed a sensitivity of 95%, PPV of 76%, CF to CF transmembrane conductance regulator-related metabolic syndrome/CF screen positive, inconclusive diagnoses (CRMS/CFSPID) ratio 12/1, CF/CF carrier ratio 4/1. Costs per screened newborn were slightly higher. Eleven children, of whom five with classic CF, would not have been referred with the previous protocol. The modified protocol results in acceptable sensitivity (95%) and good PPV of 76% with minimal increase in costs.
[ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
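The cut-off combination quoted in the abstract of entry 45 can be read as a simple decision rule. The sketch below only restates those published thresholds in code for readability; the function name and argument names are hypothetical, the later steps of the protocol (INNO-LiPA and extended gene analysis) are not modelled, and it is not a clinical tool.

```python
def dna_analysis_indicated(irt_ug_l, pap_ug_l=None):
    """Return True if a sample meets the modified-protocol cut-offs.

    Thresholds as stated in the abstract (both in µg/l):
      IRT >= 60  and PAP >= 3.0, or
      IRT >= 100 and PAP >= 1.2, or
      IRT >= 124 (PAP not relevant).
    """
    if irt_ug_l >= 124:
        return True  # PAP is not needed at this IRT level
    if pap_ug_l is None:
        return False  # below 124 the rule requires a PAP value
    return (irt_ug_l >= 60 and pap_ug_l >= 3.0) or (
        irt_ug_l >= 100 and pap_ug_l >= 1.2
    )


# Hypothetical samples: (IRT, PAP)
for irt, pap in [(60, 3.0), (99, 1.5), (100, 1.2), (124, None)]:
    print(irt, pap, dna_analysis_indicated(irt, pap))
```

Treating a missing PAP value as "not indicated" below 124 µg/l is an assumption of this sketch; the abstract does not say how such samples are handled.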
46. Psychometric Properties and Factorial Invariance of the Educational-Clinical Questionnaire: Anxiety and Depression (CECAD) in Basque Population.
- Author
-
Gorostiaga, Arantxa, Balluerka, Nekane, Aliri, Jone, Echeveste, Usue, and Lameirinhas, Joanes
- Subjects
- *
ANXIETY , *MENTAL illness , *MENTAL depression , *BASQUES , *TEST validity , *PSYCHOMETRICS , *LEGAL evidence , *QUESTIONNAIRES - Abstract
Background: Anxiety and depression are the most common current mental health problems. Due to their comorbidity, there is a need for instruments that measure them simultaneously. Moreover, given that their prevalence varies by gender and age, it is important to examine the factorial invariance of such instruments. The present study aimed to analyze the dimensionality and factorial invariance of the Basque version of the Educational-Clinical Questionnaire: Anxiety and Depression (CECAD) as a function of gender and age, and to gather additional evidence of its validity. Method: The sample comprised 2131 participants (54.2% female) between 7 and 24 years old (M = 13.2; SD = 3.52). Results: The CECAD was found to have a two-dimensional structure invariant to gender and age, with higher latent means for girls in both dimensions, and for those aged 14 and over in depression, but with small effect sizes. Both reliability and convergent validity values were good. Conclusions: The Basque version of the CECAD has good evidence of validity and reliability for assessing anxiety and depression in Basque-speaking children and adolescents. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Beyond Continuous versus Categorical Dichotomy: Uncovering Latent Structure of Non-electoral Political Participation Using Zero-Inflated Models.
- Author
-
Koc, Piotr
- Subjects
- *
POLITICAL participation , *LATENT variables , *REGRESSION analysis , *RESEARCH personnel , *MODEL theory , *VOTER turnout - Abstract
While modeling political participation as a latent variable, researchers usually choose whether to conceptualize and model participation as a latent continuous or latent categorical variable. When participation is modeled as a continuous variable, factor analytic and item-response theory models are used. When modeled as a categorical variable, latent class analysis is employed. However, both conceptualizations and modeling approaches rest upon very strong assumptions. In the continuous case, all subjects are assumed to come from the same homogenous population; in the categorical case, we assume that no quantitative heterogeneity exists within the latent classes. In this work, I argue that these assumptions are implausible and propose to model participation using zero-inflated measurement and regression models that assume the existence of two latent classes-politically disengaged and politically active-with the latter class being quantitatively heterogenous (people in that class are thought to participate to a varying degree). The results show that the models accounting for the latent class of politically disengaged have much better out-of-sample predictive accuracy. Moreover, modeling the zero-inflation changes estimates of measurement and regression models, and offers new research opportunities because with zero-inflated models we can explicitly tackle the question of what impacts the probability of ending up in the latent class of politically disengaged. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. A 4pL item response theory examination of perceived stigma in the screening of eating disorders with the SCOFF among college students.
- Author
-
Barnard-Brak, Lucy and Yang, Zhanxia
- Abstract
We examined the psychometric properties of the SCOFF, a screening instrument for eating disorders, with consideration of the perceived stigma of items that can produce socially desirable responding among a sample of college students. The results of the current study suggest evidence of the sufficient psychometric properties of the SCOFF in terms of confirmatory factor and item response theory analyses. However, two items of the SCOFF revealed that individuals who otherwise endorsed other items of the SCOFF were less likely to endorse the items of Fat and Food. It is hypothesized that this is the result of perceived stigma regarding those two items that prompts individuals to respond in a socially desirable way. A weighted scoring procedure was developed to counteract the performance of these two items, but the psychometric performance was only slightly better and there would be a clear tradeoff of specificity over sensitivity if utilized. Future research should consider other ways to counteract such perceived stigma. Level of evidence Level III: Evidence obtained from cohort or case–control analytic studies. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
49. Psychometric properties and item response theory analysis of the Persian version of the social pain questionnaire
- Author
-
Mahya Sepehrinia, Hojjatollah Farahani, Peter Watson, and Nasim Amini
- Subjects
social pain ,ostracism ,social exclusion ,rejection ,psychometric ,IRT ,Psychology ,BF1-990 - Abstract
Introduction: Social pain is an emotional reaction which is triggered by social exclusion and has been extensively investigated in the literature. The Social Pain Questionnaire (SPQ) is a self-report instrument which is the only scale for measuring social pain as a dispositional factor. The current study aimed at examining the psychometric properties of the SPQ in an Iranian sample. Materials and methods: A sample of participants (N = 400) was recruited in a cross-sectional validation study. Exploratory Factor Analysis (EFA) as well as Confirmatory Factor Analysis (CFA) were conducted. The Item Response Theory (IRT) model parameters were evaluated and item response category curves were presented. Convergent and divergent validities as well as the reliability (by using Cronbach’s alpha coefficient) were also assessed. Results: The SPQ’s unidimensionality was affirmed (RMSEA = 0.078; CFI = 0.915; TLI = 0.99) and its internal consistency was robust (Cronbach’s α = 0.94). The correlation between the SPQ and the following measures endorsed its divergent and convergent validity: Self-esteem (r = −0.424), Perceived Social Support (r = −0.161), and Interpersonal Sensitivity (r = 0.636). Finally, Item Response Theory Analysis emphasized the effectiveness of the SPQ items in discerning various levels of social pain. The theta level ranged between −1 and +1.2 and the IRT-based marginal reliability was 0.92 for the total score. Discussion: The Persian SPQ stands as a reliable and valid measure for evaluating social pain. This scale has the potential to stimulate further research in the field for both clinical and non-clinical settings. Conclusion: By employing Item Response Theory (IRT) analysis, we have transcended the theoretical psychometric evaluation of the SPQ scale and demonstrated that SPQ is a unidimensional, valid and reliable measurement tool.
- Published
- 2024
- Full Text
- View/download PDF
50. A 4pL item response theory examination of perceived stigma in the screening of eating disorders with the SCOFF among college students
- Author
-
Lucy Barnard-Brak and Zhanxia Yang
- Subjects
Eating disorders ,Perceived stigma ,SCOFF ,Item response theory ,IRT ,Nutritional diseases. Deficiency diseases ,RC620-627 - Abstract
We examined the psychometric properties of the SCOFF, a screening instrument for eating disorders, with consideration of the perceived stigma of items that can produce socially desirable responding among a sample of college students. The results of the current study suggest evidence of the sufficient psychometric properties of the SCOFF in terms of confirmatory factor and item response theory analyses. However, two items of the SCOFF revealed that individuals who otherwise endorsed other items of the SCOFF were less likely to endorse the items of Fat and Food. It is hypothesized that this is the result of perceived stigma regarding those two items that prompts individuals to respond in a socially desirable way. A weighted scoring procedure was developed to counteract the performance of these two items, but the psychometric performance was only slightly better and there would be a clear tradeoff of specificity over sensitivity if utilized. Future research should consider other ways to counteract such perceived stigma. Level of evidence Level III: Evidence obtained from cohort or case–control analytic studies.
- Published
- 2023
- Full Text
- View/download PDF