13,372 results for "Monte Carlo Methods"
Search Results
2. The Sample Is Not the Population
- Author
J. S. Allison, L. Santana, and I. J. H. Visagie
- Abstract
Given sample data, how do you calculate the value of a parameter? While this question is impossible to answer, it is frequently encountered in statistics classes when students are introduced to the distinction between a sample and a population (or between a statistic and a parameter). It is not uncommon for teachers of statistics to also confuse these concepts. An excerpt of a national mathematics examination paper, where a sample is mistaken for the population, is used to illustrate this confusion as well as sample variation and its link to sample size. We discuss two techniques that can be used to explain the difference between a parameter and a statistic. The first is a visual technique in which the variability in calculated statistics is contrasted to the fixed value of the corresponding parameter. Thereafter, we discuss Monte Carlo simulation techniques and explain the contribution that these methods may have.
- Published
- 2025
- Full Text
- View/download PDF
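The visual technique this abstract describes, contrasting the variability of a statistic with the fixed value of a parameter, can be sketched with a short Monte Carlo in Python (a toy illustration with an invented population, not the authors' classroom example):

```python
import numpy as np

rng = np.random.default_rng(42)

# An invented finite "population"; its mean is the fixed parameter.
population = rng.normal(loc=50.0, scale=10.0, size=100_000)
mu = population.mean()

def sample_means(n, reps=2_000):
    """Monte Carlo distribution of the sample mean for sample size n."""
    return np.array([rng.choice(population, size=n).mean() for _ in range(reps)])

small = sample_means(n=10)
large = sample_means(n=1_000)

# Statistics vary around the fixed parameter; larger samples vary less.
print(f"parameter mu      : {mu:.2f}")
print(f"sd of mean, n=10  : {small.std():.2f}")
print(f"sd of mean, n=1000: {large.std():.2f}")
```

Each run of `sample_means` produces a different statistic, while `mu` never changes, which is exactly the sample-versus-population distinction the abstract highlights.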
3. Finding Representative Group Fairness Metrics Using Correlation Estimations
- Author
Hadis Anahideh, Nazanin Nezami, and Abolfazl Asudeh
- Abstract
It is of critical importance to be aware of the historical discrimination embedded in the data and to consider a fairness measure to reduce bias throughout the predictive modeling pipeline. Given various notions of fairness defined in the literature, investigating the correlation and interaction among metrics is vital for addressing unfairness. Practitioners and data scientists should be able to comprehend each metric and examine their impact on one another given the context, use case, and regulations. Exploring the combinatorial space of different metrics for such examination is burdensome. To alleviate the burden of selecting fairness notions for consideration, we propose a framework that estimates the correlation among fairness notions. Our framework consequently identifies a set of diverse and semantically distinct metrics as representative of a given context. We propose a Monte Carlo sampling technique for computing the correlations between fairness metrics by indirect and efficient perturbation in the model space. Using the estimated correlations, we then find a subset of representative metrics. The paper proposes a generic method that can be generalized to any arbitrary set of fairness metrics. We showcase the validity of the proposal using comprehensive experiments on real-world benchmark datasets.
- Published
- 2025
- Full Text
- View/download PDF
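The core idea, estimating how two fairness metrics co-vary across a perturbed model space, can be sketched minimally (the data, metrics, and threshold-perturbation scheme below are illustrative stand-ins, not the authors' framework):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: a binary sensitive attribute, outcomes, and model scores.
n = 5_000
group = rng.integers(0, 2, size=n)                    # sensitive attribute
y = (rng.random(n) < 0.4 + 0.1 * group).astype(int)   # base rates differ
score = 0.5 * y + 0.5 * rng.random(n)                 # noisy score

def fairness_metrics(pred):
    """Two group-fairness metrics for one classifier's predictions."""
    spd = pred[group == 1].mean() - pred[group == 0].mean()  # statistical parity diff.
    tpr0 = pred[(group == 0) & (y == 1)].mean()
    tpr1 = pred[(group == 1) & (y == 1)].mean()
    return spd, tpr1 - tpr0                                  # equal-opportunity diff.

# Monte Carlo over a "model space": randomly perturbed decision thresholds.
metrics = np.array([
    fairness_metrics((score > rng.uniform(0.3, 0.7)).astype(int))
    for _ in range(500)
])
corr = np.corrcoef(metrics.T)[0, 1]
print(f"estimated correlation between SPD and EOD: {corr:.2f}")
```

A highly correlated pair of metrics would be redundant in a representative subset, which is the selection problem the paper addresses.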
4. The Impact of 'Negligible' Cross-Loadings in Investigations of Measurement Invariance with MGCFA and MGESEM
- Author
Timothy R. Konold, Elizabeth A. Sanders, and Kelvin Afolabi
- Abstract
Measurement invariance (MI) is an essential part of validity evidence concerned with ensuring that tests function similarly across groups, contexts, and time. Most evaluations of MI involve multigroup confirmatory factor analyses (MGCFA) that assume simple structure. However, recent research has shown that constraining non-target indicators to zero when cross-loadings are present results in biased estimates of latent variable associations. Using Monte Carlo simulation, we investigate the behavior of fit statistics for identifying non-invariance when the target measurement model is the same for both groups, and the source of non-invariance is the presence and magnitude of non-zero cross-loadings in one group, but not in another. We consider differences between MGCFA and multigroup ESEM (MGESEM), and combined and separate group tests of configural invariance. Implications for applied researchers are provided.
- Published
- 2025
- Full Text
- View/download PDF
5. Enhancing Model Fit Evaluation in SEM: Practical Tips for Optimizing Chi-Square Tests
- Author
Bang Quan Zheng and Peter M. Bentler
- Abstract
This paper aims to advocate for a balanced approach to model fit evaluation in structural equation modeling (SEM). The ongoing debate surrounding chi-square test statistics and fit indices has been characterized by ambiguity and controversy. Despite the acknowledged limitations of relying solely on the chi-square test, its careful application can enhance its effectiveness in evaluating model fit and specification. To illustrate this point, we present three common scenarios relevant to social and behavioral science research using Monte Carlo simulations, where fit indices may inadequately address concerns regarding goodness-of-fit, while the chi-square statistic can offer valuable insights. Our recommendation is to report both the chi-square test and fit indices, prioritizing precise model specification to ensure the reliability of model fit indicators.
- Published
- 2025
- Full Text
- View/download PDF
6. Perceived Discrimination and Poor Children's Executive Function: The Different Roles of Self-Esteem and Perceived Social Support
- Author
Jiatian Zhang, Yi Ren, Yiyi Deng, and Silin Huang
- Abstract
The negative effect of poverty on children's cognitive development has been proven, but few studies have examined the potential role of perceived poverty discrimination on poor children's cognitive development. This study investigated the effect of perceived discrimination on executive function, the mediating effect of self-esteem and the moderating effect of perceived social support among 711 children aged 8-13 (M = 9.97 years, SD = 1.19 years, girls: 48.80%) from a Chinese impoverished county. The results indicated that (1) perceived discrimination was negatively associated with children's executive function; (2) self-esteem partially mediated this association; and (3) perceived social support moderated the relation between perceived discrimination and children's self-esteem: high levels of perceived social support increased self-esteem for poor children with more perceived discrimination. The results suggested that self-esteem is a mechanism underlying the negative association between perceived discrimination and children's executive function and perceived social support plays a protective moderating role.
- Published
- 2025
- Full Text
- View/download PDF
7. Comparing Accuracy of Parallel Analysis and Fit Statistics for Estimating the Number of Factors with Ordered Categorical Data in Exploratory Factor Analysis
- Author
Hyunjung Lee and Heining Cham
- Abstract
Determining the number of factors in exploratory factor analysis (EFA) is crucial because it affects the rest of the analysis and the conclusions of the study. Researchers have developed various methods for deciding how many factors to retain, but this remains one of the most difficult decisions in EFA. The purpose of this study is to compare the performance of parallel analysis with that of fit indices, which researchers have begun using as an alternative strategy for determining the optimal number of factors in EFA. The Monte Carlo simulation used ordered categorical items because previous simulation studies report mixed results and because ordered categorical items are common in behavioral science. The results indicate that parallel analysis and the root mean square error of approximation (RMSEA) performed well in most conditions, followed by the Tucker-Lewis index (TLI) and then the comparative fit index (CFI). The robust corrections of CFI, TLI, and RMSEA detected misfit in underfactored models better than the original fit indices, but they did not produce satisfactory results with dichotomous data and small sample sizes. Implications, limitations of this study, and future research directions are discussed.
- Published
- 2024
- Full Text
- View/download PDF
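Parallel analysis itself, one of the methods compared in this study, can be sketched in a few lines of numpy: retain factors whose observed eigenvalues exceed the 95th percentile of eigenvalues from random data of the same shape (a toy two-factor setup with ordinal items, not the study's full design):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate ordinal responses driven by 2 latent factors (4 items each).
n, p = 500, 8
loadings = np.zeros((p, 2))
loadings[:4, 0] = 0.7
loadings[4:, 1] = 0.7
factors = rng.normal(size=(n, 2))
latent = factors @ loadings.T + rng.normal(scale=0.6, size=(n, p))
data = np.digitize(latent, bins=[-1.0, 0.0, 1.0])   # 4 ordered categories

def eigenvalues(x):
    return np.sort(np.linalg.eigvalsh(np.corrcoef(x.T)))[::-1]

obs = eigenvalues(data)

# Parallel analysis: compare to eigenvalues from random normal data.
reps = np.array([eigenvalues(rng.normal(size=(n, p))) for _ in range(200)])
threshold = np.percentile(reps, 95, axis=0)
n_factors = int(np.sum(obs > threshold))
print(f"factors retained by parallel analysis: {n_factors}")
```

Note this sketch uses Pearson correlations for simplicity; with few categories, polychoric correlations (as typically studied for ordinal data) behave differently.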
8. Rotation Local Solutions in Multidimensional Item Response Theory Models
- Author
Hoang V. Nguyen and Niels G. Waller
- Abstract
We conducted an extensive Monte Carlo study of factor-rotation local solutions (LS) in multidimensional, two-parameter logistic (M2PL) item response models. In this study, we simulated more than 19,200 data sets that were drawn from 96 model conditions and performed more than 7.6 million rotations to examine the influence of (a) slope parameter sizes, (b) number of indicators per factor (trait), (c) probabilities of cross-loadings, (d) factor correlation sizes, (e) model approximation error, and (f) sample sizes on the local solution rates of the oblimin and (oblique) geomin rotation algorithms. To accommodate these design variables, we extended the standard M2PL model to include correlated major factors and uncorrelated minor factors (to represent model error). Our results showed that both rotation methods converged to LS under some conditions with geomin producing the highest local solution rates across many models. Our results also showed that, for identical item response patterns, rotation LS can produce different latent trait estimates with different levels of measurement precision (as indexed by the conditional standard error of measurement). Follow-up analyses revealed that when rotation algorithms converged to multiple solutions, quantitative indices of structural fit, such as numerical measures of simple structure, will often misidentify the rotation that is closest in mean-squared error to the factor pattern (or item-slope pattern) of the data-generating model.
- Published
- 2024
- Full Text
- View/download PDF
9. Two-Method Measurement Planned Missing Data with Purposefully Selected Samples
- Author
Menglin Xu and Jessica A. R. Logan
- Abstract
Research designs that include planned missing data are gaining popularity in applied education research. These methods have traditionally relied on introducing missingness into data collections using the missing completely at random (MCAR) mechanism. This study assesses whether planned missingness can also be implemented when data are instead designed to be purposefully missing based on student performance. A research design with purposefully selected missingness would allow researchers to focus all assessment efforts on a target sample, while still maintaining the statistical power of the full sample. This study introduces the method and demonstrates the performance of the purposeful missingness method within the two-method measurement planned missingness design using a Monte Carlo simulation study. Results demonstrate that the purposeful missingness method can recover parameter estimates in models with as much accuracy as the MCAR method, across multiple conditions.
- Published
- 2024
- Full Text
- View/download PDF
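The contrast between MCAR and performance-based (purposeful) missingness can be illustrated with a hedged numpy sketch. The variables and selection rule are invented, and a simple regression correction stands in for the paper's two-method measurement design; the point shown is only that selection on an observed screener is a MAR mechanism, so model-based estimation recovers the full-sample quantity while naive complete-case analysis does not:

```python
import numpy as np

rng = np.random.default_rng(7)

# Screener X observed for everyone; expensive measure Y planned-missing.
n = 20_000
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(scale=0.8, size=n)   # true mean of Y is 0

# MCAR: drop 50% at random.  Purposeful: measure Y mainly for low performers.
mcar_mask = rng.random(n) < 0.5
purposeful_mask = x < np.median(x)

mean_mcar = y[mcar_mask].mean()
mean_purposeful_naive = y[purposeful_mask].mean()

# Selection depends only on the observed X, so a model using X recovers
# the full-sample mean of Y.
b = np.polyfit(x[purposeful_mask], y[purposeful_mask], 1)
mean_purposeful_model = np.polyval(b, x).mean()

print(f"MCAR complete-case mean      : {mean_mcar:+.3f}")
print(f"purposeful complete-case mean: {mean_purposeful_naive:+.3f}")
print(f"purposeful model-based mean  : {mean_purposeful_model:+.3f}")
```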
10. Comparison of Item Response Theory Ability and Item Parameters According to Classical and Bayesian Estimation Methods
- Author
Eray Selçuk and Ergül Demir
- Abstract
This research aims to compare the ability and item parameter estimates of item response theory obtained under maximum likelihood and Bayesian approaches across different Monte Carlo simulation conditions. For this purpose, the ability and item parameters estimated by each method, and the differences in their RMSE values, were examined as a function of prior distribution type, sample size, test length, and logistic model. The simulation conditions were prior distribution type (normal, left-skewed, right-skewed, leptokurtic, and platykurtic), test length (10, 20, 40), sample size (100, 500, 1000), and logistic model (2PL, 3PL), with 100 replications per condition. Mixed-model ANOVA was performed to test for RMSE differences. For the ability parameter, prior distribution type, test length, and estimation method produced significant RMSE differences in the 2PL model, while prior distribution type and test length were significant in the 3PL model. Prior distribution type, sample size, and estimation method created significant differences in the RMSE of the item discrimination parameter estimated in the 2PL model, whereas none of the conditions significantly affected the RMSE of the item difficulty parameter. In the 3PL model, prior distribution type, sample size, and estimation method were significant for the RMSE of item discrimination, and prior distribution type and estimation method created significant differences in the RMSE of the lower asymptote parameter. However, none of the conditions significantly changed the RMSE of the item difficulty parameter.
- Published
- 2024
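The ML-versus-Bayesian ability estimation being compared can be sketched for a single toy condition using grid-based 2PL estimation (the item bank, sample size, and priors below are illustrative, not the study's design):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy 2PL item bank and simulated responses.
n_items, n_persons = 20, 500
a = rng.uniform(0.8, 2.0, n_items)         # discrimination parameters
b = rng.normal(0.0, 1.0, n_items)          # difficulty parameters
theta = rng.normal(0.0, 1.0, n_persons)    # true abilities

p = 1.0 / (1.0 + np.exp(-a * (theta[:, None] - b)))
resp = (rng.random((n_persons, n_items)) < p).astype(int)

# Grid-based estimation: ML takes the likelihood maximum; EAP (Bayesian)
# averages over the posterior under a standard-normal prior.
grid = np.linspace(-4, 4, 161)
pg = 1.0 / (1.0 + np.exp(-a * (grid[:, None] - b)))           # grid x items
loglik = resp @ np.log(pg).T + (1 - resp) @ np.log(1 - pg).T  # persons x grid
ml = grid[np.argmax(loglik, axis=1)]
post = np.exp(loglik - loglik.max(axis=1, keepdims=True)) * np.exp(-0.5 * grid**2)
eap = (post * grid).sum(axis=1) / post.sum(axis=1)

rmse_ml = np.sqrt(np.mean((ml - theta) ** 2))
rmse_eap = np.sqrt(np.mean((eap - theta) ** 2))
print(f"RMSE ML = {rmse_ml:.3f}, RMSE EAP = {rmse_eap:.3f}")
```

When the prior matches the true ability distribution, as here, the Bayesian (EAP) estimates shrink toward the prior mean and typically show lower RMSE than ML; a mismatched (e.g., skewed) prior is exactly the kind of condition the study varies.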
11. The Feasibility of Computerized Adaptive Testing of the National Benchmark Test: A Simulation Study
- Author
Musa Adekunle Ayanwale and Mdutshekelwa Ndlovu
- Abstract
The COVID-19 pandemic has had a significant impact on high-stakes testing, including the National Benchmark Tests (NBTs) in South Africa. Current linear testing formats have been criticized for their limitations, prompting a shift toward computerized adaptive testing (CAT). CAT assessments are more precise and take less time, but evaluating a CAT program requires simulation studies. To assess the feasibility of implementing CAT in the NBTs, the simulation tool SimulCAT was used. The simulation generated 10,000 examinees from a normal distribution with a mean of 0 and a standard deviation of 1. A pool of 500 test items was employed, and specific parameters were established for the item selection algorithm, CAT administration rules, item exposure control, and termination criteria. The termination criterion required a standard error below 0.35 to ensure accurate ability estimation. The findings demonstrated that fixed-length tests provided higher testing precision without systematic error, as indicated by measurement statistics such as CBIAS, CMAE, and CRMSE. However, fixed-length tests exhibited a higher item exposure rate, which could be mitigated by selecting items with fewer dependencies on specific item parameters (a-parameters). Variable-length tests, on the other hand, demonstrated increased redundancy. Based on these results, CAT is recommended as an alternative approach for conducting the NBTs because it can accurately measure individual abilities while reducing testing time. For high-stakes assessments like the NBTs, fixed-length tests are preferred, as they offer superior testing precision while minimizing item exposure rates.
- Published
- 2024
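The core CAT loop evaluated in such simulations, maximum-information item selection with a standard-error stopping rule like the SE < 0.35 criterion above, can be sketched as follows (a toy item pool and posterior-grid estimator, not the SimulCAT configuration):

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy 2PL item pool of 500 items.
pool_a = rng.uniform(0.8, 2.0, 500)
pool_b = rng.normal(0.0, 1.0, 500)
grid = np.linspace(-4, 4, 161)
prior = np.exp(-0.5 * grid**2)

def simulate_cat(true_theta, se_target=0.35, max_items=50):
    """Administer items by maximum Fisher information until SE < target."""
    posterior = prior / prior.sum()
    available = np.ones(pool_a.size, dtype=bool)
    n_used = 0
    while n_used < max_items:
        theta_hat = (posterior * grid).sum()
        p = 1 / (1 + np.exp(-pool_a * (theta_hat - pool_b)))
        info = pool_a**2 * p * (1 - p)          # 2PL Fisher information
        info[~available] = -np.inf
        j = int(np.argmax(info))
        available[j] = False
        # Simulate the response from the examinee's true ability.
        p_true = 1 / (1 + np.exp(-pool_a[j] * (true_theta - pool_b[j])))
        correct = rng.random() < p_true
        pj = 1 / (1 + np.exp(-pool_a[j] * (grid - pool_b[j])))
        posterior = posterior * (pj if correct else 1 - pj)
        posterior = posterior / posterior.sum()
        n_used += 1
        theta_hat = (posterior * grid).sum()
        se = np.sqrt((posterior * (grid - theta_hat) ** 2).sum())
        if se < se_target:
            break
    return theta_hat, n_used

true_thetas = rng.normal(size=100)
results = [simulate_cat(t) for t in true_thetas]
lengths = np.array([n for _, n in results])
estimates = np.array([th for th, _ in results])
rmse = np.sqrt(np.mean((estimates - true_thetas) ** 2))
print(f"mean test length: {lengths.mean():.1f}, RMSE: {rmse:.3f}")
```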
12. Multiple Imputation of Partially Observed Covariates in Discrete-Time Survival Analysis
- Author
Anna-Carolina Haensch, Jonathan Bartlett, and Bernd Weiß
- Abstract
Discrete-time survival analysis (DTSA) models are a popular way of modeling events in the social sciences. However, the analysis of discrete-time survival data is challenged by missing data in one or more covariates. Negative consequences of missing covariate data include efficiency losses and possible bias. A popular approach to circumventing these consequences is multiple imputation (MI). In MI, it is crucial to include outcome information in the imputation models. As there is little guidance on how to incorporate the observed outcome information into the imputation model of missing covariates in DTSA, we explore different existing approaches using fully conditional specification (FCS) MI and substantive-model compatible (SMC)-FCS MI. We extend SMC-FCS for DTSA and provide an implementation in the smcfcs R package. We compare the approaches using Monte Carlo simulations and demonstrate a good performance of the new approach compared to existing approaches.
- Published
- 2024
- Full Text
- View/download PDF
13. A Nonparametric Composite Group DIF Index for Focal Groups Stemming from Multicategorical Variables
- Author
Corinne Huggins-Manley, Anthony W. Raborn, Peggy K. Jones, and Ted Myers
- Abstract
The purpose of this study is to develop a nonparametric DIF method that (a) compares focal groups directly to the composite group that will be used to develop the reported test score scale, and (b) allows practitioners to explore for DIF related to focal groups stemming from multicategorical variables that constitute a small proportion of the overall testing population. We propose the nonparametric root expected proportion squared difference (REPSD) index that evaluates the statistical significance of composite group DIF for relatively small focal groups stemming from multicategorical focal variables, with decisions of statistical significance based on quasi-exact p values obtained from Monte Carlo permutations of the DIF statistic under the null distribution. We conduct a simulation to evaluate conditions under which the index produces acceptable Type I error and power rates, as well as an application to a school district assessment. Practitioners can calculate the "REPSD" index in a freely available package we created in the R environment.
- Published
- 2024
- Full Text
- View/download PDF
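The quasi-exact p value idea, Monte Carlo permutation of a DIF statistic under the null, can be sketched as below. This uses a deliberately simplified statistic (a raw proportion-correct difference that ignores matching on ability, unlike the REPSD index itself), so it only illustrates the permutation machinery:

```python
import numpy as np

rng = np.random.default_rng(11)

# Toy item responses: a small focal group inside a large composite group.
n = 2_000
focal = rng.random(n) < 0.05                  # ~5% focal membership
ability = rng.normal(size=n)
# The item is harder for the focal group (true DIF).
p_correct = 1 / (1 + np.exp(-(ability - 0.8 * focal)))
item = (rng.random(n) < p_correct).astype(int)

def dif_stat(member):
    """Difference between focal-group and composite proportion correct."""
    return item[member].mean() - item.mean()

observed = dif_stat(focal)

# Quasi-exact p value: permute focal-group membership under the null.
perms = np.array([dif_stat(rng.permutation(focal)) for _ in range(2_000)])
p_value = (np.sum(np.abs(perms) >= np.abs(observed)) + 1) / (2_000 + 1)
print(f"observed DIF statistic: {observed:+.3f}, p = {p_value:.4f}")
```

The +1 corrections in the p value keep it strictly positive, a standard convention for Monte Carlo permutation tests.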
14. Addressing Uncodable Behaviors: A Bayesian Ordinal Mixture Model Applied to a Mathematics Learning Trajectory Teaching Experiment
- Author
Pavel Chernyavskiy, Traci S. Kutaka, Carson Keeter, Julie Sarama, and Douglas Clements
- Abstract
When researchers code behavior that is undetectable or falls outside of the validated ordinal scale, the resultant outcomes often suffer from informative missingness. Incorrect analysis of such data can lead to biased arguments around efficacy and effectiveness in the context of experimental and intervention research. Here, we detail a new Bayesian mixture approach that analyzes ordinal responses with undetectable/uncodable behaviors in two stages: (1) estimate a likelihood of response detection and (2) estimate an Explanatory Item Response Model for the ordinal variable conditional on detection. We present an independent random effects and correlated random effects variant of the new model and demonstrate evidence of model functionality using two simulation studies. To illustrate the utility of our proposed approach, we describe an extended application to data collected during a length measurement teaching experiment (N = 186, 56% girls, 5-6 years at preassessment). Results indicate that students assigned to a learning trajectories instructional condition were more likely to use detectable, mathematically relevant problem-solving strategies than their peers in two comparison conditions and that their problem-solving strategies were also more sophisticated. [This is the online first version of an article published in "Journal of Research on Educational Effectiveness."]
- Published
- 2024
- Full Text
- View/download PDF
15. An Improved Inferential Procedure to Evaluate Item Discriminations in a Conditional Maximum Likelihood Framework
- Author
Clemens Draxler, Andreas Kurz, Can Gürer, and Jan Philipp Nolte
- Abstract
A modified and improved inductive inferential approach to evaluate item discriminations in a conditional maximum likelihood and Rasch modeling framework is suggested. The new approach involves the derivation of four hypothesis tests. It implies a linear restriction of the assumed set of probability distributions in the classical approach that represents scenarios of different item discriminations in a straightforward and efficient manner. Its improvement is discussed, compared to classical procedures (tests and information criteria), and illustrated in Monte Carlo experiments as well as real data examples from educational research. The results show an improvement of power of the modified tests of up to 0.3.
- Published
- 2024
- Full Text
- View/download PDF
16. Alternatives to Weighted Item Fit Statistics for Establishing Measurement Invariance in Many Groups
- Author
Sean Joo, Montserrat Valdivia, Dubravka Svetina Valdivia, and Leslie Rutkowski
- Abstract
Evaluating scale comparability in international large-scale assessments depends on measurement invariance (MI). The root mean square deviation (RMSD) is a standard method for establishing MI in several programs, such as the Programme for International Student Assessment and the Programme for the International Assessment of Adult Competencies. Previous research showed that the RMSD was unable to detect departures from MI when the latent trait distribution was far from item difficulty. In this study, we developed three alternative approaches to the original RMSD: equal, item information, and b-norm weighted RMSDs. Specifically, we considered the item-centered normalized weight distributions to compute the item characteristic curve difference in the RMSD procedure more efficiently. We further compared all methods' performance via a simulation study and the item information and b-norm weighted RMSDs showed the most promising results. An empirical example is demonstrated, and implications for researchers are discussed.
- Published
- 2024
- Full Text
- View/download PDF
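The weighting idea behind the alternative RMSDs can be illustrated with one hard item whose misfit sits far from the trait distribution. The weight functions below are illustrative, not the operational PISA/PIAAC formulas: a population-density weight down-weights the region where the two item characteristic curves actually differ, while an item-centered weight does not:

```python
import numpy as np

theta = np.linspace(-4, 4, 161)

def icc(a, b):
    """2PL item characteristic curve."""
    return 1 / (1 + np.exp(-a * (theta - b)))

item_intl = icc(1.2, 2.0)   # a hard item (b = 2) in the international model
item_grp = icc(1.2, 2.5)    # the same item, harder in one group

def rmsd(weights):
    w = weights / weights.sum()
    return np.sqrt(np.sum(w * (item_grp - item_intl) ** 2))

population_w = np.exp(-0.5 * theta**2)              # N(0,1) trait density
equal_w = np.ones_like(theta)                       # equal weights
item_centered_w = np.exp(-0.5 * (theta - 2.0)**2)   # centered at the item

print(f"RMSD, population weights   : {rmsd(population_w):.3f}")
print(f"RMSD, equal weights        : {rmsd(equal_w):.3f}")
print(f"RMSD, item-centered weights: {rmsd(item_centered_w):.3f}")
```

The item-centered weighting flags the non-invariance that the standard population weighting nearly hides, which is the failure mode the study's alternatives target.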
17. Extending an Identified Four-Parameter IRT Model: The Confirmatory Set-4PNO Model
- Author
Justin L. Kern
- Abstract
Given the frequent presence of slipping and guessing in item responses, models for the inclusion of their effects are highly important. Unfortunately, the most common model for their inclusion, the four-parameter item response theory model, potentially has severe deficiencies related to its possible unidentifiability. With this issue in mind, the dyad four-parameter normal ogive (Dyad-4PNO) model was developed. This model allows for slipping and guessing effects by including binary augmented variables--each indicated by two items whose probabilities are determined by slipping and guessing parameters--which are subsequently related to a continuous latent trait through a two-parameter model. Furthermore, the Dyad-4PNO assumes uncertainty as to which items are paired on each augmented variable. In this way, the model is inherently exploratory. In the current article, the new model, called the Set-4PNO model, is an extension of the Dyad-4PNO in two ways. First, the new model allows for more than two items per augmented variable. Second, these item sets are assumed to be fixed, that is, the model is confirmatory. This article discusses this extension and introduces a Gibbs sampling algorithm to estimate the model. A Monte Carlo simulation study shows the efficacy of the algorithm at estimating the model parameters. A real data example shows that this extension may be viable in practice, with the data fitting a more general Set-4PNO model (i.e., more than two items per augmented variable) better than the Dyad-4PNO, 2PNO, 3PNO, and 4PNO models.
- Published
- 2024
- Full Text
- View/download PDF
18. An Evaluation of Fit Indices Used in Model Selection of Dichotomous Mixture IRT Models
- Author
Sedat Sen and Allan S. Cohen
- Abstract
A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's information criterion (DIC), sample size adjusted BIC (SABIC), relative entropy, the integrated classification likelihood criterion (ICL-BIC), the adjusted Lo-Mendell-Rubin (LMR), and Vuong-Lo-Mendell-Rubin (VLMR). The accuracy of the fit indices was assessed for correct detection of the number of latent classes for different simulation conditions including sample size (2,500 and 5,000), test length (15, 30, and 45), mixture proportions (equal and unequal), number of latent classes (2, 3, and 4), and latent class separation (no-separation and small separation). Simulation study results indicated that as the number of examinees or number of items increased, correct identification rates also increased for most of the indices. Correct identification rates by the different fit indices, however, decreased as the number of estimated latent classes or parameters (i.e., model complexity) increased. Results were good for BIC, CAIC, DIC, SABIC, ICL-BIC, LMR, and VLMR, and the relative entropy index tended to select correct models most of the time. Consistent with previous studies, AIC and AICc showed poor performance. Most of these indices had limited utility for three-class and four-class mixture 3PL model conditions.
- Published
- 2024
- Full Text
- View/download PDF
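Information-criterion selection of the number of latent classes can be sketched with a tiny univariate Gaussian-mixture EM, standing in for the dichotomous mixture IRT models of the study (the data, starting values, and BIC-only comparison are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(8)

# Two-class toy data: the "true" number of latent classes is 2.
x = np.concatenate([rng.normal(-2.0, 1.0, 1500), rng.normal(2.0, 1.0, 1000)])
n = x.size

def fit_mixture(k, n_iter=200):
    """Tiny EM for a univariate Gaussian mixture; returns the log-likelihood."""
    means = np.quantile(x, (np.arange(k) + 0.5) / k)   # spread-out starts
    sds = np.full(k, x.std())
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        dens = weights * np.exp(-0.5 * ((x[:, None] - means) / sds) ** 2) \
               / (sds * np.sqrt(2 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)   # E step
        nk = resp.sum(axis=0)                           # M step
        weights = nk / n
        means = (resp * x[:, None]).sum(axis=0) / nk
        sds = np.sqrt((resp * (x[:, None] - means) ** 2).sum(axis=0) / nk)
    return np.log(dens.sum(axis=1)).sum()

def bic(k):
    n_params = 3 * k - 1          # k means, k sds, k - 1 free weights
    return n_params * np.log(n) - 2 * fit_mixture(k)

scores = {k: bic(k) for k in (1, 2, 3)}
best = min(scores, key=scores.get)
print(f"classes selected by BIC: {best}")
```

With well-separated classes BIC recovers the true class count; the study's harder no-separation conditions are exactly where such indices start to disagree.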
19. An Item Response Theory Model for Incorporating Response Times in Forced-Choice Measures
- Author
Zhichen Guo, Daxun Wang, Yan Cai, and Dongbo Tu
- Abstract
Forced-choice (FC) measures have been widely used in many personality or attitude tests as an alternative to rating scales, which employ comparative rather than absolute judgments. Several response biases, such as social desirability, response styles, and acquiescence bias, can be reduced effectively. Another type of data linked with comparative judgments is response time (RT), which contains potential information concerning respondents' decision-making process. It would be challenging but exciting to combine RT into FC measures better to reveal respondents' behaviors or preferences in personality measurement. Given this situation, this study aims to propose a new item response theory (IRT) model that incorporates RT into FC measures to improve personality assessment. Simulation studies show that the proposed model can effectively improve the estimation accuracy of personality traits with the ancillary information contained in RT. Also, an application on a real data set reveals that the proposed model estimates similar but different parameter values compared with the conventional Thurstonian IRT model. The RT information can explain these differences.
- Published
- 2024
- Full Text
- View/download PDF
20. The Power and Type I Error of Wilcoxon-Mann-Whitney, Welch's 't,' and Student's 't' Tests for Likert-Type Data
- Author
Ahmet Salih Simsek
- Abstract
The Likert-type item is the most popular response format for collecting data in social, educational, and psychological studies through scales or questionnaires. However, there is no consensus on whether parametric or non-parametric tests should be preferred when analyzing Likert-type data. This study examined the statistical power of parametric and non-parametric tests when each Likert-type item is analyzed independently in survey studies. The main purpose of the study is to examine the statistical power of the Wilcoxon-Mann-Whitney, Welch's t, and Student's t tests (all pairwise comparison tests) for Likert-type data. For this purpose, a Monte Carlo simulation study was conducted, and the statistical significance of the selected tests was examined under varying sample sizes, group size ratios, and effect sizes. The results showed that the Wilcoxon-Mann-Whitney test was superior to its counterparts, especially for small samples and unequal group sizes. However, with equal group sizes and a sample size of 200 or more, the Student's t test had statistical power similar to the Wilcoxon-Mann-Whitney test. Consistent with these empirical results, practical recommendations are provided for researchers on what to consider when collecting and analyzing Likert-type data.
- Published
- 2023
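The power and Type I error comparison among the three tests can be reproduced in miniature with scipy (the Likert probabilities, sample sizes, and shift are illustrative condition values, not the study's full design):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)

def simulate_rates(n1=30, n2=30, shift=1, reps=1_000, alpha=0.05):
    """Rejection rates on 5-point Likert items with a location shift."""
    base_probs = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
    rejections = np.zeros(3)
    for _ in range(reps):
        g1 = rng.choice(5, size=n1, p=base_probs) + 1
        # Shifted distribution for group 2, clipped to the 1-5 scale.
        g2 = np.clip(rng.choice(5, size=n2, p=base_probs) + 1 + shift, 1, 5)
        p_mwu = stats.mannwhitneyu(g1, g2, alternative="two-sided").pvalue
        p_welch = stats.ttest_ind(g1, g2, equal_var=False).pvalue
        p_student = stats.ttest_ind(g1, g2, equal_var=True).pvalue
        rejections += np.array([p_mwu, p_welch, p_student]) < alpha
    return dict(zip(["WMW", "Welch", "Student"], rejections / reps))

power = simulate_rates(shift=1)    # rejection rate under a true difference
type1 = simulate_rates(shift=0)    # rejection rate under the null
print("power :", power)
print("type I:", type1)
```

Varying `n1`/`n2` to create unequal group sizes reproduces the conditions under which the abstract reports the Wilcoxon-Mann-Whitney advantage.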
21. A Comparison of the Efficacies of Differential Item Functioning Detection Methods
- Author
Munevver Basman
- Abstract
One way to ensure the validity of a test is to check that all items function similarly across different groups of individuals. Differential item functioning (DIF) occurs when individuals with equal ability levels from different groups perform differently on the same test item. Methods based on item response theory and classical test theory, each with different advantages and limitations, are available to identify items that show DIF. This study aims to compare the performance of five DIF detection methods: Mantel-Haenszel (MH), logistic regression (LR), the crossing simultaneous item bias test (CSIBTEST), Lord's chi-square (LORD), and Raju's area measure (RAJU), under varying conditions of sample size, DIF ratio, and test length. To compare the methods, power and Type I error rates were evaluated in a simulation study with 100 replications per condition. Results show that LR and MH have the lowest Type I error rates and the highest power in detecting uniform DIF, and that CSIBTEST has a power rate similar to MH and LR. Under DIF conditions, sample size, DIF ratio, test length, and their interactions affect Type I error and power rates.
- Published
- 2023
22. Comparison of Cronbach's Alpha and McDonald's Omega for Ordinal Data: Are They Different?
- Author
Fatih Orcan
- Abstract
Cronbach's alpha and McDonald's omega are among the most commonly used reliability estimates. Alpha is based on inter-item correlations, while omega is based on a factor analysis result. This study uses simulated ordinal data sets to test whether alpha and omega produce different estimates, comparing their performance across sample size, number of items, and deviation from tau-equivalence. The results show that alpha and omega yielded similar estimates except with small sample sizes, few items, and low factor loadings. When a scale has five or more items and the factor model from which omega is calculated fits the data, omega can be preferred over alpha; moreover, as the number of items exceeds five, the differences between alpha and omega disappear. Because alpha is easier to calculate (omega requires fitting a factor model first), alpha can also be suggested over omega in such cases. However, when the number of items and the inter-item correlations were small, omega performed worse than alpha; in those conditions, alpha should be used for reliability estimation.
- Published
- 2023
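Cronbach's alpha, one of the two estimates compared here, can be computed directly from item and total-score variances; the tau-equivalent ordinal data below are an invented example (omega would additionally require fitting a factor model, which this sketch omits):

```python
import numpy as np

rng = np.random.default_rng(12)

def cronbach_alpha(items):
    """Cronbach's alpha from an (n_persons x n_items) score matrix."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

# Simulated 5-point ordinal items from a tau-equivalent one-factor model.
n, k = 400, 6
factor = rng.normal(size=(n, 1))
latent = 0.7 * factor + rng.normal(scale=0.7, size=(n, k))
items = np.digitize(latent, bins=[-1.5, -0.5, 0.5, 1.5]) + 1   # scores 1-5

alpha_val = cronbach_alpha(items)
print(f"alpha = {alpha_val:.3f}")
```

Re-running this with unequal loadings (violating tau-equivalence), fewer items, or smaller samples reproduces the conditions under which the abstract reports alpha and omega diverging.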
23. Changing the Success Probability in Computerized Adaptive Testing: A Monte Carlo Simulation on the Open Matrices Item Bank
- Author
Hanif Akhtar
- Abstract
For efficiency, the computerized adaptive testing (CAT) algorithm selects items with maximum information, typically items with a 50% probability of being answered correctly. However, examinees may not be satisfied when they answer only 50% of the items correctly. Researchers have found that changing the item selection algorithm to choose easier items (i.e., a success probability above 50%), albeit suboptimal from a measurement efficiency standpoint, provides a better test-taking experience. The current study investigates the impact of changing the success probability on measurement efficiency. A Monte Carlo simulation was performed on the Open Matrices Item Bank and a simulated item bank, with 1,500 generated examinees. The item selection algorithm was modified to target expected success probabilities of 60%, 70%, and 80%, and each examinee was assigned to five item selection methods: maximum information, random, p = 0.6, p = 0.7, and p = 0.8. The results indicated that traditional CAT was 60-70% shorter than random item selection. Altering the success probability did not affect the estimation of examinees' abilities, but increasing the probability of success increased the number of items required to achieve specified levels of precision. Practical considerations on how to balance the trade-off between examinees' experience and measurement efficiency are discussed. [For the full proceedings, see ED654100.]
- Published
- 2023
24. Effect of Item Parameter Drift in Mixed Format Common Items on Test Equating
- Author
Ibrahim Uysal, Merve Sahin-Kürsad, and Abdullah Faruk Kiliç
- Abstract
The aim of the study was to examine whether common items in mixed-format tests (e.g., multiple-choice and essay items) contain parameter drift in test equating performed with the common-item nonequivalent groups design. In this Monte Carlo simulation study with a fully crossed design, the factors were test length (30 and 50), sample size (1,000 and 3,000), common-item ratio (30% and 40%), ratio of common items with item parameter drift (IPD; 20% and 30%), location of the common items in the test (at the beginning, randomly distributed, and at the end), and IPD size in the multiple-choice items (low [0.2] and high [1.0]). Four test forms were created: two contained no parameter drift, and parameter drift was introduced into each of the other two forms. Test equating results were compared using the root mean squared error (RMSE). As a result, the ratio of common items with IPD, the IPD size in multiple-choice items, the common-item ratio, sample size, and test length all had significant effects on equating error.
- Published
- 2022
25. Incorporating Complex Sampling Weights in Multilevel Analyses of Education Data
- Author
-
Shen, Ting and Konstantopoulos, Spyros
- Abstract
Large-scale assessment survey (LSAS) data are collected via complex sampling designs with special features (e.g., clustering and unequal probability of selection). Multilevel models have been utilized to account for clustering effects whereas the probability weighting approach (PWA) has been used to deal with design informativeness derived from the unequal probability selection. However, the difficulty of applying PWA in multilevel models (MLM) has been generally underestimated and practical guidance is scarce. This study utilizes an empirical as well as a Monte Carlo simulation investigation to examine the performance of the multilevel pseudo maximum likelihood (MPML) estimation based on information derived from the Early Childhood Longitudinal Study Kindergarten cohort of 2010-2011 (ECLS-K:2011). Variance components and fixed effects estimators across four estimation methods including three MPML estimators (i.e., weighted without scaling, weighted size-scaled and weighted effective-scaled) and the unweighted estimator are provided. Practical guidance about the use of sampling weights in MLM analyses of LSAS data is also offered.
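The three weighted MPML estimators mentioned above differ in how level-1 sampling weights are rescaled within each cluster. A minimal numpy sketch of the two standard scalings, assuming the usual formulas from the MPML literature (size scaling to the cluster sample size, effective scaling to the effective cluster size); this is not code from the study:

```python
import numpy as np

def scale_weights(w, method="size"):
    """Rescale level-1 sampling weights within one cluster.

    Two scalings commonly discussed for multilevel pseudo maximum
    likelihood (sketch; see Pfeffermann et al., 1998):
      - "size":      scaled weights sum to the cluster sample size n_j
      - "effective": scaled weights sum to the effective cluster size
                     (sum w)^2 / sum(w^2)
    """
    w = np.asarray(w, dtype=float)
    if method == "size":
        factor = len(w) / w.sum()
    elif method == "effective":
        factor = w.sum() / np.sum(w ** 2)
    else:
        raise ValueError("method must be 'size' or 'effective'")
    return w * factor

w = [1.0, 2.0, 2.0, 5.0]  # illustrative within-cluster weights
print(scale_weights(w, "size").sum())       # sums to n_j = 4
print(scale_weights(w, "effective").sum())  # sums to (sum w)^2 / sum(w^2)
```

The unweighted estimator simply sets every weight to 1, which is the fourth method compared in the abstract.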
- Published
- 2022
26. Dynamic Structural Equation Models with Missing Data: Data Requirements on 'N' and 'T'
- Author
-
Yuan Fang and Lijuan Wang
- Abstract
Dynamic structural equation modeling (DSEM) is a useful technique for analyzing intensive longitudinal data. A challenge of applying DSEM is the missing data problem. The impact of missing data on DSEM, especially on widely applied DSEM such as the two-level vector autoregressive (VAR) cross-lagged models, however, is understudied. To fill the research gap, we evaluated how well the fixed effects and variance parameters in two-level bivariate VAR models are recovered under different missingness percentages, sample sizes, the number of time points, and heterogeneity in missingness distributions through two simulation studies. To facilitate the use of DSEM under customized data and model scenarios (different from those in our simulations), we provided illustrative examples of how to conduct Monte Carlo simulations in Mplus to determine whether a data configuration is sufficient to obtain accurate and precise results from a specific DSEM. [This is the online version of an article published in "Structural Equation Modeling: A Multidisciplinary Journal."]
- Published
- 2024
- Full Text
- View/download PDF
27. Statistical Power Analysis and Sample Size Planning for Moderated Mediation Models
- Author
-
Ziqian Xu, Fei Gao, Anqi Fa, Wen Qu, and Zhiyong Zhang
- Abstract
Conditional process models, including moderated mediation models and mediated moderation models, are widely used in behavioral science research. However, few studies have examined approaches to conduct statistical power analysis for such models and there is also a lack of software packages that provide such power analysis functionalities. In this paper, we introduce new simulation-based methods for power analysis of conditional process models with a focus on moderated mediation models. These simulation-based methods provide intuitive ways for sample-size planning based on regression coefficients in a moderated mediation model as well as selected variance and covariance components. We demonstrate how the methods can be applied to five commonly used moderated mediation models using a simulation study, and we also assess the performance of the methods through the five models. We implement our approaches in the WebPower R package and also in Web apps to ease their application. [This is the online version of an article published in "Behavior Research Methods."]
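A simulation-based power analysis of the kind described can be sketched with plain regression: simulate data from an assumed moderated model, refit it many times, and count significant interaction tests. All coefficients, error variances, and the single-moderator layout below are illustrative assumptions, not the WebPower implementation:

```python
import numpy as np

def sim_power(n, a1=0.3, reps=500, alpha=0.05, seed=1):
    """Monte Carlo power for the moderated a-path (X*W -> M) in a simple
    moderated mediation model. Illustrative sketch only: the true
    coefficients and the model layout are assumptions, not the paper's.
    """
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x = rng.normal(size=n)
        w = rng.normal(size=n)
        m = 0.4 * x + 0.2 * w + a1 * x * w + rng.normal(size=n)
        # OLS of m on [1, x, w, x*w]; test the interaction coefficient
        X = np.column_stack([np.ones(n), x, w, x * w])
        beta, res, *_ = np.linalg.lstsq(X, m, rcond=None)
        df = n - X.shape[1]
        sigma2 = res[0] / df
        se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[3, 3])
        t = beta[3] / se
        if abs(t) > 1.96:  # normal approximation to the t reference
            hits += 1
    return hits / reps

print(sim_power(n=200))  # estimated power at n = 200
```

Sample-size planning then amounts to running this over a grid of n values and picking the smallest n whose estimated power clears the target (e.g., 0.80).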
- Published
- 2024
- Full Text
- View/download PDF
28. Applications of Planned Missing Data Designs in Latent Change Score Models
- Author
-
Ayse Busra Ceviren
- Abstract
Latent change score (LCS) models are a powerful class of structural equation modeling that allows researchers to work with latent difference scores that minimize measurement error. LCS models define change as a function of prior status, which makes them well-suited for modeling developmental theories or processes. In LCS models, as in other latent variable models, latent constructs require multiple observed variables, which may require a considerable investment in both time and financial resources. Planned missing data designs offer a valuable opportunity to make research more efficient in terms of time and cost with little or no loss of statistical power. Drawing on the benefits of planned missing data designs and the strengths of LCS models, I examined planned missing data designs in the context of LCS models. Utilizing real-life data, different missing data designs and sample size conditions were tested on univariate LCS models with three indicators measured at two time points. An external Monte Carlo simulation study was conducted in each sample size condition (200, 500, and 1,000), in which 500 replications were generated for each planned missing data design. I assessed the convergence rate, the relative bias of parameter estimates, and the relative efficiency of parameter estimates in LCS models to determine if planned missing data designs have advantages over complete data designs. The convergence rates were 100% in all nine patterns and three sample sizes with no improper solutions. The results showed that the relative bias and relative efficiency in parameter estimates varied across planned missing patterns and sample size conditions. In patterns where planned missingness was set on only one indicator, 20% of missing data did not result in biased parameter estimates. However, setting higher levels of missing data (50% or 66%) on a single indicator resulted in bias in parameter estimates across sample sizes.
Results on imposing planned missingness on one vs. two indicators indicated that when 93% of the complete data is available, the number of indicators with planned missing data did not make a difference in bias estimates. However, with only 83% or 78% of the complete data, only distributing the planned missingness evenly across two indicators (10%, 25%, or 33% planned missingness on each) at both time points resulted in accurate parameter estimates. Relative efficiency estimates suggested imposing the same pattern of planned missingness across time points. When the missingness pattern did not stay constant across time points, none of the models resulted in efficient parameter estimates. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone (1-800-521-0600). Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
- Published
- 2024
29. IRT Characteristic Curve Linking Methods Weighted by Information for Mixed-Format Tests
- Author
-
Shaojie Wang, Won-Chan Lee, Minqiang Zhang, and Lixin Yuan
- Abstract
To reduce the impact of parameter estimation errors on IRT linking results, recent work introduced two information-weighted characteristic curve methods for dichotomous items. These two methods showed outstanding performance in both simulation and pseudo-form pseudo-group analysis. The current study expands upon the concept of information weighting in the context of linking mixed-format tests. Three new linking methods were proposed, including category-information-weighted characteristic curve (CWCC), item-information-weighted characteristic curve (IWCC), and test-information-weighted characteristic curve (TWCC) methods. Both a simulation study and a pseudo-form pseudo-group analysis were conducted to evaluate their relative performances under the non-equivalent groups with anchor test design. In general, IWCC and TWCC outperformed their respective counterparts, whereas the advantage of CWCC was not readily apparent. Among the three new methods, IWCC and TWCC showed better performance. Practical recommendations and future directions are discussed.
- Published
- 2024
- Full Text
- View/download PDF
30. Quantifying and Estimating Regression to the Mean Effect for Bivariate Beta-Binomial Distribution
- Author
-
Aimel Zafar, Manzoor Khan, and Muhammad Yousaf
- Abstract
Subjects with initially extreme observations upon remeasurement are found closer to the population mean. This tendency of observations toward the mean is called regression to the mean (RTM) and can make natural variation in repeated data look like real change. Studies, where subjects are selected on a baseline criterion, should be guarded against the RTM effect to avoid erroneous conclusions. In an intervention study, the difference between pre-post variables is the combined effect of intervention/treatment and RTM. Thus, accounting for RTM is essential to accurately estimate the intervention effect. Many real-life examples are better modeled by a bivariate binomial model with varying probability of success. In this article, a bivariate beta-binomial distribution is used that allows the probability of success to vary from subject to subject. Expressions for the total, RTM, and treatment effect are derived, and their behavior is demonstrated graphically. Maximum likelihood estimators of RTM are derived, and their statistical properties are studied via Monte Carlo simulation. The proposed techniques are employed to estimate the RTM effect by utilizing data related to the Countway WM-class circulation.
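The RTM setup described can be mimicked in a few lines: give each subject a Beta-distributed success probability, draw two binomial scores from it, select on an extreme baseline, and look at the mean pre-post change under no treatment. All parameter values below are illustrative, not the paper's:

```python
import numpy as np

def rtm_effect(a=2.0, b=2.0, size=10, cutoff=7, subjects=100_000, seed=7):
    """Monte Carlo estimate of the regression-to-the-mean effect under a
    bivariate beta-binomial model with no treatment: each subject has a
    success probability p ~ Beta(a, b), and pre/post scores are
    independent Binomial(size, p) draws given p. Subjects are selected
    on an extreme baseline (pre >= cutoff)."""
    rng = np.random.default_rng(seed)
    p = rng.beta(a, b, subjects)
    pre = rng.binomial(size, p)
    post = rng.binomial(size, p)
    sel = pre >= cutoff
    # With no intervention, the mean pre-post drop among the selected
    # group is pure RTM.
    return (pre[sel] - post[sel]).mean()

print(rtm_effect())  # positive: selected extremes drift back toward the mean
```

In an intervention study the observed pre-post difference would be this RTM quantity plus the treatment effect, which is why the paper derives the RTM term explicitly so it can be subtracted out.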
- Published
- 2024
- Full Text
- View/download PDF
31. Comparison of Statistical Models for Individual's Ability Index and Ranking
- Author
-
Javed Iqbal and Tanweer Ul Islam
- Abstract
Economic efficiency demands accurate assessment of individual ability for selection purposes. This study investigates Classical Test Theory (CTT) and Item Response Theory (IRT) for estimating true ability and ranking individuals. Two Monte Carlo simulations and real data analyses were conducted. Results suggest a slight advantage for IRT, but ability estimates from both methods were highly correlated (r=0.95), indicating similar outcomes. The two-parameter logistic IRT model emerged as the most reliable and rigorous approach.
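The reported closeness of CTT and IRT ability indices can be illustrated by simulating a two-parameter logistic (2PL) model and correlating the CTT index (the sum score) with true ability. The parameter distributions below are assumptions chosen for illustration, not the study's design:

```python
import numpy as np

def simulate_2pl(n_persons=2000, n_items=30, seed=11):
    """Simulate responses from a two-parameter logistic (2PL) IRT model
    and return the correlation between the CTT sum score and true theta."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(0, 1, n_persons)        # true abilities
    a = rng.uniform(0.8, 2.0, n_items)         # discriminations
    b = rng.normal(0, 1, n_items)              # difficulties
    # 2PL response probabilities: P = 1 / (1 + exp(-a * (theta - b)))
    p = 1 / (1 + np.exp(-a * (theta[:, None] - b[None, :])))
    responses = rng.binomial(1, p)
    sum_score = responses.sum(axis=1)
    return np.corrcoef(sum_score, theta)[0, 1]

print(simulate_2pl())  # high correlation between sum score and true ability
```

Because the sum score is a monotone (if nonlinear) function of ability under such models, rankings from CTT and IRT tend to agree closely, consistent with the r=0.95 reported above.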
- Published
- 2024
- Full Text
- View/download PDF
32. Uncertainty in Artificial Neural Network Models: Monte-Carlo Simulations beyond the GUM Boundaries
- Author
-
A. M. Sadek and Fahad Al-Muhlaki
- Abstract
In this study, the accuracy of the artificial neural network (ANN) was assessed considering the uncertainties associated with the randomness of the data and the lack of learning. The Monte-Carlo algorithm was applied to simulate the randomness of the input variables and evaluate the output distribution. It has been shown that under certain conditions, the GUM framework for uncertainty evaluation may completely fail. The ANN modeling technique can be used as an alternative method for estimating the expectation value and evaluating the associated uncertainty. Furthermore, unlike the GUM and Monte-Carlo frameworks, the ANN models do not require mathematical expressions between the input and output variables. On the other hand, owing to the uncertainty associated with the lack of learning, the ANN model may produce unrealistic results, even if a global minimum is approached. This behavior is explained by Bayesian theory, which assumes that the output values generated by various runs are normally distributed at each target. This may lead to an unrealistic output when the overall distribution of the target values differs from that presumed by Bayesian theory. To minimize this drawback, the ANN model output should be calculated from sufficiently large repeated runs with new starting values of the weights and biases.
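The contrast between GUM linearization and Monte Carlo propagation is easy to demonstrate: at a point where the first derivative vanishes, the first-order GUM estimate is zero while simulation reports the true output spread. A minimal sketch of the two propagation schemes (not the paper's ANN setting):

```python
import numpy as np

def gum_uncertainty(f, dfdx, x0, u_x):
    """First-order GUM propagation: u_y = |df/dx| * u_x evaluated at x0."""
    return abs(dfdx(x0)) * u_x

def mc_uncertainty(f, x0, u_x, draws=200_000, seed=3):
    """Monte Carlo propagation: sample the input distribution and take
    the standard deviation of the resulting output distribution."""
    rng = np.random.default_rng(seed)
    y = f(rng.normal(x0, u_x, draws))
    return y.std()

# Strongly nonlinear case where linearization fails: f(x) = x**2 at x0 = 0.
# The GUM first-order estimate is zero because f'(0) = 0, while the Monte
# Carlo estimate correctly reports nonzero output uncertainty (~sqrt(2)).
f = lambda x: x ** 2
print(gum_uncertainty(f, lambda x: 2 * x, x0=0.0, u_x=1.0))
print(mc_uncertainty(f, x0=0.0, u_x=1.0))
```

This is the sense in which "the GUM framework may completely fail" while a sampling-based evaluation of the output distribution remains valid.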
- Published
- 2024
- Full Text
- View/download PDF
33. A New Sampling Scheme for an Improved Monitoring of the Process Mean
- Author
-
Abdul Haq
- Abstract
This article introduces an innovative sampling scheme, the median sampling (MS), utilizing individual observations over time to efficiently estimate the mean of a process characterized by a symmetric (non-uniform) probability distribution. The mean estimator based on MS is not only unbiased but also boasts enhanced precision compared to its simple random sampling-based counterpart. Moreover, a new EWMA chart based on the mean estimator within the MS scheme is proposed. The performance of the EWMA charts, derived from both simple random sampling (SRS) and MS schemes, is evaluated using the metrics of steady-state average run-length and average number of items-to-signal. The findings underscore the superiority of the EWMA-MS chart over the EWMA-SRS chart. Additionally, as the magnitude of ranking errors escalates, the behavior of the EWMA-MS chart converges toward that of the EWMA-SRS chart. The practical implementation of the newly introduced EWMA chart is demonstrated through an illustrative example.
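For readers unfamiliar with the comparator chart, a plain EWMA chart for the process mean (the SRS baseline above) can be sketched as follows. The median-sampling estimator itself is not reproduced here, and all constants (lambda, L, the shift size) are illustrative:

```python
import numpy as np

def ewma_chart(x, mu0, sigma, lam=0.2, L=3.0):
    """Plain EWMA chart for the process mean. Returns the EWMA statistics
    and the index of the first out-of-control signal (or None).

    z_t = lam * x_t + (1 - lam) * z_{t-1}, with z_0 = mu0, and
    asymptotic control limits mu0 +/- L * sigma * sqrt(lam / (2 - lam)).
    """
    z = np.empty(len(x))
    prev = mu0
    for t, xt in enumerate(x):
        prev = lam * xt + (1 - lam) * prev
        z[t] = prev
    half_width = L * sigma * np.sqrt(lam / (2 - lam))
    out = np.flatnonzero(np.abs(z - mu0) > half_width)
    return z, (int(out[0]) if out.size else None)

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 50),
                    rng.normal(1.5, 1, 50)])  # mean shift at t = 50
z, signal = ewma_chart(x, mu0=0.0, sigma=1.0)
print(signal)  # index of the first out-of-control signal
```

The article's EWMA-MS chart replaces each raw observation with the median-sampling mean estimator, tightening the chart's run-length behavior relative to this SRS version.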
- Published
- 2024
- Full Text
- View/download PDF
34. Experimental Design and Power for Moderation in Multisite Cluster Randomized Trials
- Author
-
Nianbo Dong, Benjamin Kelcey, and Jessaca Spybrook
- Abstract
Multisite cluster randomized trials (MCRTs), in which the intermediate-level clusters (e.g., classrooms) are randomly assigned to the treatment or control condition within each site (e.g., school), are among the most commonly used experimental designs across a broad range of disciplines. MCRTs often align with the theory that programs are delivered at a cluster-level (e.g., teacher professional development) and provide opportunities to explore treatment effect heterogeneity across sites. In designing experimental studies, a critical step is the statistical power analysis and sample size determination. However, the statistical tools for power analysis of moderator effects in three-level MCRTs are not available. In this study, we derived formulas for calculating the statistical power and the minimum detectable effect size difference (MDESD) with confidence intervals for investigating the effects of various moderators in three-level MCRTs. We considered the levels of the moderators (level-1, -2, and -3), the scales of the moderators (binary and continuous), and random and nonrandomly varying slopes of the (moderated) treatment effects. We validated our formulas through Monte Carlo simulations. Finally, we conclude with directions for future work.
- Published
- 2024
- Full Text
- View/download PDF
35. Individual Participant Data Meta-Analysis Including Moderators: Empirical Validation
- Author
-
Mariola Moeyaert, Panpan Yang, and Yukang Xue
- Abstract
We have entered an era in which scientific evidence increasingly informs research practice and policy. As there is an exponential increase in the use of single-case experimental designs (SCEDs) to evaluate intervention effectiveness, there is accumulating evidence available for quantitative synthesis. Consequently, there is a growing interest in techniques suitable to meta-analyze SCED research. One technique that can be applied is individual participant data (IPD) meta-analysis. IPD is a flexible approach, allowing for a variety of modeling options such as modeling moderators to explain intervention heterogeneity. To date, no methodological research has been conducted to evaluate the statistical properties of effect estimates obtained by using IPD meta-analysis with the inclusion of moderators. This study is designed to address this by conducting a large-scale Monte Carlo study. Based on the results, specific recommendations are provided to indicate under which conditions the IPD meta-analysis including moderators is suitable.
- Published
- 2024
- Full Text
- View/download PDF
36. The Impact of Omitting Confounders in Parallel Process Latent Growth Curve Mediation Models: Three Sensitivity Analysis Approaches
- Author
-
Xiao Liu, Zhiyong Zhang, Kristin Valentino, and Lijuan Wang
- Abstract
Parallel process latent growth curve mediation models (PP-LGCMMs) are frequently used to longitudinally investigate the mediation effects of treatment on the level and change of outcome through the level and change of mediator. An important but often violated assumption in empirical PP-LGCMM analysis is the absence of omitted confounders of the relationships among treatment, mediator, and outcome. In this study, we analytically examined how omitting pretreatment confounders impacts the inference of mediation from the PP-LGCMM. Using the analytical results, we developed three sensitivity analysis approaches for the PP-LGCMM, including the frequentist, Bayesian, and Monte Carlo approaches. The three approaches help investigate different questions regarding the robustness of mediation results from the PP-LGCMM, and handle the uncertainty in the sensitivity parameters differently. Applications of the three sensitivity analyses are illustrated using a real-data example. A user-friendly Shiny web application is developed to conduct the sensitivity analyses.
- Published
- 2024
- Full Text
- View/download PDF
37. Cognitive Diagnosis Testlet Model for Multiple-Choice Items
- Author
-
Lei Guo, Wenjie Zhou, and Xiao Li
- Abstract
The testlet design is very popular in educational and psychological assessments. This article proposes a new cognitive diagnosis model, the multiple-choice cognitive diagnostic testlet (MC-CDT) model for tests using testlets consisting of MC items. The MC-CDT model uses the original examinees' responses to MC items instead of dichotomously scored data (i.e., correct or incorrect) to retain information of different distractors and thus enhance the MC items' diagnostic power. The Markov chain Monte Carlo algorithm was adopted to calibrate the model using the WinBUGS software. Then, a thorough simulation study was conducted to evaluate the estimation accuracy for both item and examinee parameters in the MC-CDT model under various conditions. The results showed that the proposed MC-CDT model outperformed the traditional MC cognitive diagnostic model. Specifically, the MC-CDT model fits the testlet data better than the traditional model, while also fitting the data without testlets well. The findings of this empirical study show that the MC-CDT model fits real data better than the traditional model and that it can also provide testlet information.
- Published
- 2024
- Full Text
- View/download PDF
38. Power to Detect Moderated Effects in Studies with Three-Level Partially Nested Data
- Author
-
Kyle Cox, Ben Kelcey, and Hannah Luce
- Abstract
Comprehensive evaluation of treatment effects is aided by considerations for moderated effects. In educational research, the combination of natural hierarchical structures and prevalence of group-administered or shared facilitator treatments often produces three-level partially nested data structures. Literature details planning strategies for a variety of experimental designs when moderation effects are of interest but has yet to establish power formulas for detecting moderation effects in three-level partially nested designs. To address this gap, we derive and assess the accuracy of power formulas for detecting the different types of moderation effects possible in these designs. Using Monte Carlo simulation studies, we probe power rates and adequate sample sizes for detecting the different moderation effects while varying common influential factors including variance in the outcome explained by covariates, magnitude of the moderation effect, and sample sizes. The power formulas developed improve the planning of experimental studies with partial nesting and encourage the inclusion of moderator variables to capture for whom and under what conditions a treatment is effective. Educational researchers also have some initial guidance regarding adequate sample sizes and the factors that influence detecting moderation effects in three-level partially nested designs.
- Published
- 2024
- Full Text
- View/download PDF
39. Meta-Analysis and Partial Correlation Coefficients: A Matter of Weights
- Author
-
Sanghyun Hong and W. Robert Re
- Abstract
This study builds on the simulation framework of a recent paper by Stanley and Doucouliagos ("Research Synthesis Methods" 2023;14:515-519). S&D use simulations to make the argument that meta-analyses using partial correlation coefficients (PCCs) should employ a "suboptimal" estimator of the PCC standard error when constructing weights for fixed effect and random effects estimation. We address concerns that their simulations and subsequent recommendation may give meta-analysts a misleading impression. While the estimator they promote dominates the "correct" formula in their Monte Carlo framework, there are other estimators that perform even better. We conclude that more research is needed before best practice recommendations can be made for meta-analyses with PCCs.
- Published
- 2024
- Full Text
- View/download PDF
40. Using Response Times for Joint Modeling of Careless Responding and Attentive Response Styles
- Author
-
Esther Ulitzsch, Steffi Pohl, Lale Khorramdel, Ulf Kroehne, and Matthias von Davier
- Abstract
Questionnaires are by far the most common tool for measuring noncognitive constructs in psychology and educational sciences. Response bias may pose an additional source of variation between respondents that threatens the validity of conclusions drawn from questionnaire data. We present a mixture modeling approach that leverages response time data from computer-administered questionnaires for the joint identification and modeling of two commonly encountered response biases that, so far, have only been modeled separately--careless and insufficient effort responding and response styles (RS) in attentive answering. Using empirical data from the Programme for International Student Assessment 2015 background questionnaire and the case of extreme RS as an example, we illustrate how the proposed approach supports gaining a more nuanced understanding of response behavior as well as how neglecting either type of response bias may impact conclusions on respondents' content trait levels as well as on their displayed response behavior. We further contrast the proposed approach against a more heuristic two-step procedure that first eliminates presumed careless respondents from the data and subsequently applies model-based approaches accommodating RS. To investigate the trustworthiness of results obtained in the empirical application, we conduct a parameter recovery study.
- Published
- 2024
- Full Text
- View/download PDF
41. Leveraging Item Parameter Drift to Assess Transfer Effects in Vocabulary Learning
- Author
-
Joshua B. Gilbert, James S. Kim, and Luke W. Miratrix
- Abstract
Longitudinal models typically emphasize between-person predictors of change but ignore how growth varies "within" persons because each person contributes only one data point at each time. In contrast, modeling growth with multi-item assessments allows evaluation of how relative item performance may shift over time. While traditionally viewed as a nuisance under the label of "item parameter drift" (IPD), IPD may be of substantive interest if it reflects how learning manifests on different items or subscales at different rates. In this study, we apply the Explanatory Item Response Model to assess IPD in a causal inference context. Simulation results show that when IPD is not accounted for, both parameter estimates and standard errors can be affected. We illustrate with an empirical application to the persistence of transfer effects from a content literacy intervention, revealing how researchers can leverage IPD to achieve a more fine-grained understanding of how vocabulary learning develops over time.
- Published
- 2024
- Full Text
- View/download PDF
42. Minimal-Effect Testing, Equivalence Testing, and the Conventional Null Hypothesis Testing for the Analysis of Bi-Factor Models
- Author
-
Shunji Wang, Katerina M. Marcoulides, Jiashan Tang, and Ke-Hai Yuan
- Abstract
A necessary step in applying bi-factor models is to evaluate the need for domain factors with a general factor in place. The conventional null hypothesis testing (NHT) was commonly used for such a purpose. However, the conventional NHT meets challenges when the domain loadings are weak or the sample size is insufficient. This article proposes using minimal-effect testing (MET) and equivalence testing (ET) to analyze bi-factor models. A key element in conducting MET and ET is the minimal size of factor loadings that can be regarded as noteworthy in practice, termed as "minimal noteworthy size." This article presents two approaches to formulating the minimal noteworthy size and compares the pros and cons of MET, ET, and the conventional NHT. Analysis shows that MET, ET, and the conventional NHT are complementary. Combining them to test the noteworthiness of domain loadings can help researchers make a comprehensive judgment. Real and simulated datasets illustrate the applications of the three methods. Monte Carlo results show that MET and ET can control type I errors reasonably well while maintaining good statistical power.
- Published
- 2024
- Full Text
- View/download PDF
43. To Be Long or to Be Wide: How Data Format Influences Convergence and Estimation Accuracy in Multilevel Structural Equation Modeling
- Author
-
Julia-Kim Walther, Martin Hecht, Benjamin Nagengast, and Steffen Zitzmann
- Abstract
A two-level data set can be structured in either long format (LF) or wide format (WF), and both have corresponding SEM approaches for estimating multilevel models. Intuitively, one might expect these approaches to perform similarly. However, the two data formats yield data matrices with different numbers of columns and rows, and their "cols : rows" ratio is related to the magnitude of eigenvalue bias in sample covariance matrices. Previous studies have shown similar performance for both approaches, but they were limited to settings where "cols << rows" in both data formats. We conducted a Monte Carlo study to investigate whether varying "cols : rows" ratios result in differing performances. Specifically, we examined the p:N ("cols : rows") effect on convergence and estimation accuracy in multilevel settings. Our findings suggest that (1) the LF approach is more likely to achieve convergence; (2) for the models that converged in both, the LF and WF approaches yield similar estimation accuracy, which is related to (3) differential "cols : rows" effects in the two approaches; and (4) smaller ICC values lead to less accurate between-group parameter estimates.
- Published
- 2024
- Full Text
- View/download PDF
44. Bayesian Structural Equation Models of Correlation Matrices
- Author
-
James Ohisei Uanhoro
- Abstract
We present a method for Bayesian structural equation modeling of sample correlation matrices as correlation structures. The method transforms the sample correlation matrix to an unbounded vector using the matrix logarithm function. Bayesian inference about the unbounded vector is performed assuming a multivariate-normal likelihood, with a mean based on the transformed model-implied correlation matrix, and a covariance assumed to be of known form. Using Monte Carlo simulation, we examine the performance of the method with normal and ordinal indicators, as well as the capacity of the method to estimate models that account for misspecification. The performance of the approach is often adequate suggesting that the proposed method can be used for Bayesian analysis of correlation structures. We conclude with a discussion of potential applications of the approach, as well as future directions needed to further develop the method.
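The core transformation can be sketched with numpy alone: for a symmetric positive-definite correlation matrix, the matrix logarithm (computed via eigendecomposition) yields an unbounded symmetric matrix, and the matrix exponential inverts it. A minimal illustration of the transform only, with made-up correlation values and none of the paper's Bayesian machinery:

```python
import numpy as np

def sym_logm(R):
    """Matrix logarithm of a symmetric positive-definite matrix via its
    eigendecomposition: logm(R) = V diag(log eigvals) V'."""
    vals, vecs = np.linalg.eigh(R)
    return vecs @ np.diag(np.log(vals)) @ vecs.T

def sym_expm(A):
    """Inverse transform: matrix exponential of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(A)
    return vecs @ np.diag(np.exp(vals)) @ vecs.T

# A sample correlation matrix (illustrative values); its matrix log has
# unbounded entries, which is the vector the Bayesian model places a
# multivariate-normal likelihood on.
R = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
A = sym_logm(R)
print(np.allclose(A, A.T))          # the log of a symmetric PD matrix is symmetric
print(np.allclose(sym_expm(A), R))  # the transform is exactly invertible
```

Working on the unbounded log scale is what lets the method assume a multivariate-normal likelihood without worrying about the positive-definiteness and unit-diagonal constraints of the correlation scale.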
- Published
- 2024
- Full Text
- View/download PDF
45. Latent Profile Transition Analysis with Random Intercepts (RI-LPTA)
- Author
-
Ming-Chi Tseng
- Abstract
The primary objective of this investigation is the formulation of random intercept latent profile transition analysis (RI-LPTA). Our simulation investigation suggests that the selection between LPTA and RI-LPTA for examination has negligible impact on the estimation of transition probability parameters when the population parameters are generated based on the LPTA model. However, an estimation bias in the transition probabilities is observed in all simulated conditions when LPTA is used for analysis and the generated population parameters follow the RI-LPTA model. Researchers constructing empirical longitudinal models are advised to prioritize the use of RI-LPTA in model construction, regardless of whether LPTA exhibits a random intercept effect. This measure helps to reduce the possible bias in parameter estimates that may occur if the random intercept effect of LPTA is not taken into account during model specification.
- Published
- 2024
- Full Text
- View/download PDF
46. Recovering Developmental Bivariate Trajectories in Accelerated Longitudinal Designs with Dynamic Continuous Time Modeling
- Author
-
Nuria Real-Brioso, Eduardo Estrada, and Pablo F. Cáncer
- Abstract
Accelerated longitudinal designs (ALDs) provide an opportunity to capture long developmental periods in a shorter time framework using a relatively small number of assessments. Prior literature has investigated whether univariate developmental processes can be characterized with data obtained from ALDs. However, many important questions in psychology and related sciences imply working with several variables that are intercorrelated as they unfold over time, such as cognitive and cortical development. Therefore, bivariate developmental models are required. This study aimed to assess the effectiveness of continuous-time bivariate Latent Change Score (CT-BLCS) models for recovering the trajectories of two interdependent developmental processes using data from diverse ALDs. Through a Monte Carlo simulation study, the efficacy of different sampling designs and sample sizes was examined. The study fills a gap in the literature by examining the performance of ALDs in bivariate systems, providing specific recommendations for future application of ALDs for studying interrelated developmental variables.
- Published
- 2024
- Full Text
- View/download PDF
47. Performance of Model Fit and Selection Indices for Bayesian Piecewise Growth Modeling with Missing Data
- Author
-
Ihnwhi Heo, Fan Jia, and Sarah Depaoli
- Abstract
The Bayesian piecewise growth model (PGM) is a useful class of models for analyzing nonlinear change processes that consist of distinct growth phases. In applications of Bayesian PGMs, it is important to accurately capture growth trajectories and carefully consider knot placements. The presence of missing data is another challenge researchers commonly encounter. To address these issues, one could use model fit and selection indices to detect misspecified Bayesian PGMs, and should give care to the potential impact of missing data on model evaluation. Here we conducted a simulation study to examine the impact of model misspecification and missing data on the performance of Bayesian model fit and selection indices (PPP-value, BCFI, BTLI, BRMSEA, BIC, and DIC), with an additional focus on prior sensitivity. Results indicated that (a) increasing the degree of model misspecification and amount of missing data aggravated the performance of indices in detecting misfit, and (b) different prior specifications had negligible impact on model assessment. We provide practical guidelines for researchers to facilitate effective implementation of Bayesian PGMs.
- Published
- 2024
- Full Text
- View/download PDF
48. Comparing Mimic and Mimic-Interaction to Alignment Methods for Investigating Measurement Invariance Concerning a Continuous Violator
- Author
-
Yuanfang Liu, Mark H. C. Lai, and Ben Kelcey
- Abstract
Measurement invariance holds when a latent construct is measured in the same way across different levels of background variables (continuous or categorical) while controlling for the true value of that construct. Using Monte Carlo simulation, this paper compares the multiple indicators, multiple causes (MIMIC) model and MIMIC-interaction to a novel use of alignment optimization (AO) for detecting measurement noninvariance when the violator is a continuous variable. Results showed that MIMIC and MIMIC-interaction in sequential likelihood ratio tests and Wald tests with a Bonferroni correction provided a good balance between identifying invariant and noninvariant (linear violations) items when n = 500 in terms of classification accuracy (CA). AO (CA = 0.86) was as competitive as MIMIC and MIMIC-interaction for linear invariance violations but was far better under nonlinear quadratic violations when n = 1,000 (i.e., 100 per group for 10 groups).
- Published
- 2024
- Full Text
- View/download PDF
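The data-generating idea behind the MIMIC setup above can be sketched as follows. This is a hypothetical illustration, not the paper's simulation design: a continuous covariate x affects the latent factor eta, and a direct effect of x on one indicator (beyond its effect through eta) represents a linear invariance violation. The loadings and effect sizes here are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)                 # continuous background variable (violator)
eta = 0.5 * x + rng.normal(size=n)     # latent construct regressed on x

loadings = np.array([0.8, 0.7, 0.6])   # factor loadings of three indicators
direct = np.array([0.0, 0.0, 0.3])     # nonzero entry = noninvariant item
eps = rng.normal(scale=0.5, size=(n, 3))
Y = eta[:, None] * loadings + x[:, None] * direct + eps

# Naive diagnostic: after removing the part of Y explained by eta,
# only the noninvariant item should still correlate with x
resid = Y - eta[:, None] * loadings
partial_r = [np.corrcoef(x, resid[:, j])[0, 1] for j in range(3)]
print(np.round(partial_r, 2))
```

In practice eta is unobserved, so MIMIC models estimate these direct effects within a structural equation model rather than from residuals as done here.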
49. Deep Learning Generalized Structured Component Analysis: An Interpretable Artificial Neural Network Model with Composite Indexes
- Author
-
Gyeongcheol Cho and Heungsun Hwang
- Abstract
Generalized structured component analysis (GSCA) is a multivariate method for specifying and examining interrelationships between observed variables and components. Despite its data-analytic flexibility honed over the past decade, GSCA always defines every component as a linear function of observed variables, which can be suboptimal when the observed variables for a component are nonlinearly related, often reducing the component's predictive power. To address this issue, we combine deep learning and GSCA into a single framework to allow a component to be a nonlinear function of observed variables without specifying the exact functional form in advance. This new method, termed deep learning generalized structured component analysis (DL-GSCA), aims to maximize the predictive power of components while their directed or undirected network remains interpretable. Our real and simulated data analyses show that DL-GSCA produces components with greater predictive power than those from GSCA in the presence of nonlinear associations between observed variables per component.
- Published
- 2024
- Full Text
- View/download PDF
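The predictive-power gap described above can be illustrated without any deep learning machinery. In this sketch (not the DL-GSCA algorithm itself), an outcome depends nonlinearly on one of two indicators; a purely linear composite basis misses that structure, while adding a simple nonlinear feature (a squared term, standing in for the flexible mapping a neural network would learn) recovers it. All data and coefficients are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
X = rng.normal(size=(n, 2))            # two observed indicators
y = X[:, 0] ** 2 + 0.5 * X[:, 1] + rng.normal(scale=0.2, size=n)

def r2(F, y):
    """R-squared of y regressed on feature matrix F (with intercept)."""
    F = np.column_stack([np.ones(len(F)), F])
    yhat = F @ np.linalg.lstsq(F, y, rcond=None)[0]
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

linear_basis = X                                   # linear composite, GSCA-style
nonlinear_basis = np.column_stack([X, X[:, 0] ** 2])
print(round(r2(linear_basis, y), 2), round(r2(nonlinear_basis, y), 2))
```

Because X[:, 0] squared is uncorrelated with X[:, 0] for symmetric data, the linear basis captures almost none of that variance, which is the situation DL-GSCA is designed to handle without hand-picking the nonlinear terms.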
50. Revisiting Savalei's (2011) Research on Remediating Zero-Frequency Cells in Estimating Polychoric Correlations: A Data Distribution Perspective
- Author
-
Tong-Rong Yang and Li-Jen Weng
- Abstract
In Savalei's (2011) simulation that evaluated the performance of polychoric correlation estimates in small samples, two methods for treating zero-frequency cells, adding 0.5 (ADD) and doing nothing (NONE), were compared. Savalei tentatively suggested using ADD for binary data and NONE for data with three or more categories. Yet, Savalei's suggestion could be explained by the skewness of the data distribution being severe for binary data and slight for three-category data. To rule out this alternative explanation, we extended Savalei's design by incorporating the degree of skewness into our simulation. With slightly skewed data, NONE is recommended due to its high-quality estimates. With severely skewed data, ADD is recommended only for binary data in which the skewness of the two variables has the same sign and the underlying correlation is expected to be strong. Methods for improving polychoric correlation estimates with severely skewed data merit further study.
- Published
- 2024
- Full Text
- View/download PDF
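The two zero-cell treatments compared above can be sketched on a contingency table. This is an assumption-laden illustration, not Savalei's code: the variant shown adds 0.5 to every cell whenever any cell is zero (one common form of the correction); the exact rule in the cited simulation may differ.

```python
import numpy as np

def treat_zero_cells(table, method="NONE"):
    """Return the cell counts that would feed the polychoric estimator."""
    table = np.asarray(table, dtype=float)
    if method == "ADD" and (table == 0).any():
        return table + 0.5               # continuity correction on all cells
    return table                         # NONE: use the table as observed

observed = np.array([[12, 0],
                     [ 7, 6]])           # small-sample 2x2 table with a zero cell
print(treat_zero_cells(observed, "NONE"))
print(treat_zero_cells(observed, "ADD"))
```

The downstream polychoric estimate is then obtained by maximum likelihood over the corrected (or uncorrected) counts, which is where the ADD-versus-NONE choice affects bias in small samples.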