49,219 results for "Sample size"
Search Results
2. Learning Analytics: A Comparison of Western, Educated, Industrialized, Rich, and Democratic (WEIRD) and Non-WEIRD Research
- Author
- Clare Baek and Tenzin Doleck
- Abstract
We examined how Learning Analytics literature represents participants from diverse societies by comparing the studies published with samples from WEIRD (Western, Educated, Industrialized, Rich, and Democratic) nations versus non-WEIRD nations. By analyzing the Learning Analytics studies published during 2015-2019 (N = 360), we found that most of the studies were on WEIRD samples, with at least 58 percent of the total studies on WEIRD samples. Through keyword analysis, we found that the studies on WEIRD samples' research topics focused on self-regulated learning and feedback received in learning environments. The studies on non-WEIRD samples focused on the collaborative and social nature of learning. Our investigation of the analysis tools used for the studies suggested the limitations of some software in analyzing languages in diverse countries. Our analysis of theoretical frameworks revealed that most studies on both WEIRD and non-WEIRD samples did not identify a theoretical framework. The studies on WEIRD and non-WEIRD samples convey the similarities of Learning Analytics and Educational Data Mining. We conclude by discussing the importance of integrating multifaceted elements of the participant samples, including cultural values, societal values, and geographic areas, and present recommendations on ways to promote inclusion and diversity in Learning Analytics research.
- Published
- 2024
3. Conducting Power Analyses to Determine Sample Sizes in Quantitative Research: A Primer for Technology Education Researchers Using Common Statistical Tests
- Author
- Jeffery Buckley
- Abstract
Ensuring a credible literature base is essential for all research fields. One element of this relates to the replicability of published work, which is the probability that the results of an original study would replicate in an independent investigation. A critical feature of replicable research is that the sample size of a study is sufficient to minimize statistical error and detect effects that exist in reality. A recent study (Buckley, Hyland, et al., 2023) estimated the replicability of all quantitative technology education research at approximately 55%, with this estimate showing an increasing trend in recent years. Given this estimate, efforts to improve replicability, and thus the credibility of the literature base, would be worthwhile. Power analyses can be conducted when planning a quantitative study to support the determination of sample size requirements for detecting population effects; however, they remain rare in technology education research. As conducting power analyses is a growing practice in social scientific research more broadly, one likely reason for their limited use by quantitative technology education researchers is a lack of resources within the field. As such, this article offers technology education researchers a primer for conducting power analyses for common research designs within the field.
- Published
- 2024
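The kind of a-priori power analysis this primer introduces can be sketched with the normal approximation for a two-sample t-test (a minimal stdlib Python illustration, not the article's own procedure; the function name and defaults are assumptions):

```python
# A-priori sample size for an independent-samples t-test via the normal
# approximation: n per group ~ 2 * ((z_{1-alpha/2} + z_{power}) / d)^2.
# Illustrative sketch only; dedicated tools refine this with the t distribution.
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n to detect Cohen's d, two-sided test."""
    z = NormalDist().inv_cdf
    return ceil(2 * ((z(1 - alpha / 2) + z(power)) / d) ** 2)

# Medium effect (d = 0.5), alpha = .05, 80% power:
print(n_per_group(0.5))
```

The exact t-based answer for d = 0.5 is 64 per group; the normal approximation lands one lower, which is why software-based power analysis is still preferable for final planning.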
4. Do Errors on Classic Decision Biases Happen Fast or Slow? Numeracy and Decision Time Predict Probability Matching, Sample Size Neglect, and Ratio Bias
- Author
- Ryan Corser, Raymond P. Voss, and John D. Jasper
- Abstract
Higher numeracy is associated with better comprehension and use of numeric information as well as reduced susceptibility to some decision biases. We extended this line of work by showing that increased numeracy predicted probability maximizing (versus matching) as well as a better appreciation of large sample sizes. At the same time, we replicated the findings that the more numerate were less susceptible to the ratio bias and base rate neglect phenomena. Decision time predicted accuracy for the ratio bias, probability matching, and sample size scenarios, but not the base rate scenarios. Interestingly, this relationship between decision time and accuracy was positive for the ratio bias problems, but negative for the probability matching and sample size scenarios. Implications for research on cognitive ability and decision biases are discussed.
- Published
- 2024
5. The 3H and Spiral Dynamics Models: A Reconciliation
- Author
- Jehanzeb Rashid Cheema
- Abstract
This study explores the relationship between the Spiral Dynamics and the 3H (head, heart, hands) models of human growth and development, using constructs such as empathy, moral reasoning, forgiveness, and community mindedness that have been shown to have implications for education. The specific research question is, "Can a combination of multivariate statistical techniques be utilized to find an alignment between the dimensions of these models?" We focus on practical and data-driven approaches with the primary methods including factor analysis, cluster analysis, and logistic regression. Our main finding is that the proposed methodology is robust and applicable in a variety of operational scenarios. We conclude it is feasible to empirically align and reconcile dimensions of seemingly disparate theories of educational development and human evolution with a data analysis framework based on mainstream quantitative techniques that can be easily implemented using readily available statistical software packages.
- Published
- 2024
6. Quantitative Techniques with Small Sample Sizes: An Educational Summer Camp Example
- Author
- Trina Johnson Kilty, Kevin T. Kilty, Andrea C. Burrows Borowczak, and Mike Borowczak
- Abstract
A computer science camp for pre-collegiate students was operated during the summers of 2022 and 2023. The effect the camp had on attitudes was quantitatively assessed using a survey instrument. However, enrollment at the summer camp was small, so the well-known Pearson's chi-squared test of significance could not be applied. Instead, a quantitative analysis method using a multinomial probability distribution as a model of a multilevel Likert-scale survey was used. Exact calculations of a multinomial probability model with a likelihood ratio were performed to analyze the questionnaires administered to participants in two cohort groups (combined N = 17). Probabilities per Likert category were determined from the data itself using Bayes' theorem with a Dirichlet prior. Each cohort functioned as part of a homogeneous sample, allowing the cohorts to be pooled. Post-tests revealed significant changes in participants' attitudes after camp completion. This technique has implications for studies with small sample sizes. Using an exact calculation of the multinomial probability model with the likelihood ratio as a statistical test of evidence has advantages: a) it is exact for any sample size, making it a quantitative analysis option for small-sample studies; b) it depends only on what was observed during a study; c) it does not require advanced calculation; d) modern spreadsheet and statistical packages can perform the analysis; and e) the likelihood ratio, employed in Bayes' theorem, can update prior beliefs according to evidence. Small-sample quantitative analysis of this kind can strengthen insights into data trends and showcases the importance of the technique.
- Published
- 2024
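The small-sample approach the abstract outlines can be sketched as follows (hypothetical Likert counts, a Dirichlet(1, ..., 1) prior, and a simple two-hypothesis comparison are illustrative assumptions; the study's exact prior and hypotheses may differ):

```python
# Sketch: exact multinomial probabilities with Dirichlet-smoothed category
# probabilities, compared via a likelihood ratio. Illustrative only.
from math import lgamma, log, exp

def log_multinomial_pmf(counts, probs):
    """Exact log P(counts | probs) under a multinomial model."""
    n = sum(counts)
    out = lgamma(n + 1)
    for c, p in zip(counts, probs):
        out += c * log(p) - lgamma(c + 1)
    return out

def dirichlet_posterior_mean(counts, alpha=1.0):
    """Posterior-mean category probabilities, Dirichlet(alpha) prior."""
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]

pre = [6, 7, 3, 1, 0]    # hypothetical 5-point Likert counts (N = 17)
post = [1, 2, 4, 6, 4]

# Likelihood ratio: how much better the post-camp data fit their own
# posterior-mean probabilities than the pre-camp probabilities.
lr = exp(log_multinomial_pmf(post, dirichlet_posterior_mean(post))
         - log_multinomial_pmf(post, dirichlet_posterior_mean(pre)))
print(lr > 1)
```

Because everything is an exact probability calculation, the same arithmetic can be reproduced in a spreadsheet, which is one of the advantages the abstract lists.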
7. Reasoning Skills in Mathematics Teaching: A Meta-Synthesis on Studies Conducted in Turkey
- Author
- Ali Tum
- Abstract
This research analyzes the results of studies conducted in Turkey on reasoning skills in mathematics teaching and reveals the trends in this field. Within the scope of this study, databases were searched with the keywords "reasoning" (muhakeme, akil yürütme) and "reasoning skill" (muhakeme becerisi, akil yürütme becerisi), and the results were screened against inclusion criteria regarding mathematics teaching. One hundred sixty-three studies were included. Each study in the meta-synthesis was analyzed descriptively according to type, year, method, sample type and size, data collection tools, statistical analysis, learning domain, keywords, reasoning type, and purpose. In addition, the studies' results were content analyzed and tabulated by coding the differences and similarities between them with a holistic approach. Across the learning domains of the mathematics curriculum, studies were mostly carried out on numbers and operations in the secondary school curriculum. In terms of reasoning types, almost half of the studies focused on mathematical reasoning, followed most often by proportional reasoning. The most common aims of the included studies were "examining the factors affecting reasoning skills," "measurement of reasoning skills," and "the effect of teaching practices on reasoning skills." The examined studies investigated the effects of 33 different teaching practices on reasoning skills.
- Published
- 2024
8. Interventions for Gender Equality in STEM Education: A Meta-Analysis
- Author
- Wenhao Yu, Jiaqi He, Julan Luo, and Xiaoyan Shu
- Abstract
Background: Although gender inequality in education is gaining increasing attention, female underrepresentation remains pervasive in STEM fields. Many studies have applied various interventions to narrow the gender gap in STEM education. Objectives: In this study, we conducted a systematic meta-analysis to examine the effectiveness of various interventions for gender equality in STEM education and tested the influence of moderator variables. Methods: Ten databases--ERIC, IEEE, JSTOR, ProQuest-Education, SAGE Journals, Scopus, Springer Link, Taylor & Francis, Web of Science, and Wiley Online Library--and seven high-quality journals in related fields were selected as literature resources. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) process was used to identify eligible articles for the random-effects meta-analysis. Results and Conclusions: The overall effect size (Hedges' g) was 0.434, indicating that interventions promoting gender equality in STEM education had a near-medium positive effect. Effectiveness differed significantly by type of intervention, but not by type of intervention outcome. Moderator analysis showed significant differences for country, educational level, form of experiment, intervention level, and measurement method, whereas economic development level and sample size were not significant moderators. Implications: This study examined the effectiveness of gender equality interventions and how it varies with intervention type, intervention outcome, and moderator variables. These results have implications for the design of gender equality interventions in STEM education.
- Published
- 2024
9. Comparing Accuracy of Parallel Analysis and Fit Statistics for Estimating the Number of Factors with Ordered Categorical Data in Exploratory Factor Analysis
- Author
- Hyunjung Lee and Heining Cham
- Abstract
Determining the number of factors in exploratory factor analysis (EFA) is crucial because it affects the rest of the analysis and the conclusions of the study. Researchers have developed various methods for deciding the number of factors to retain, but this remains one of the most difficult decisions in EFA. The purpose of this study is to compare parallel analysis with the performance of fit indices, which researchers have started using as another strategy for determining the optimal number of factors in EFA. A Monte Carlo simulation was conducted with ordered categorical items because previous simulation studies show mixed results and because ordered categorical items are common in behavioral science. The results indicate that parallel analysis and the root mean square error of approximation (RMSEA) performed well in most conditions, followed by the Tucker-Lewis index (TLI) and then the comparative fit index (CFI). The robust corrections of CFI, TLI, and RMSEA performed better than the original fit indices in detecting misfit in underfactored models. However, they did not produce satisfactory results for dichotomous data with a small sample size. Implications, limitations of this study, and future research directions are discussed.
- Published
- 2024
10. When Who Matters: Interviewer Effects and Survey Modality
- Author
- Rebecca Walcott, Isabelle Cohen, and Denise Ferris
- Abstract
When and how to survey potential respondents is often determined by budgetary and external constraints, but choice of survey modality may have enormous implications for data quality. Different survey modalities may be differentially susceptible to measurement error attributable to interviewer assignment, known as interviewer effects. In this paper, we leverage highly similar surveys, one conducted face-to-face (FTF) and the other via phone, to examine variation in interviewer effects across survey modality and question type. We find that while there are no cross-modality differences for simple questions, interviewer effects are markedly higher for sensitive questions asked over the phone. These findings are likely explained by the enhanced ability of in-person interviewers to foster rapport and engagement with respondents. We conclude with a thought experiment that illustrates the potential implications for power calculations, namely, that using FTF data to inform phone surveys may substantially underestimate the necessary sample size for sensitive questions.
- Published
- 2024
11. A Tutorial on Aggregating Evidence from Conceptual Replication Studies Using the Product Bayes Factor
- Author
- Caspar J. Van Lissa, Eli-Boaz Clapper, and Rebecca Kuiper
- Abstract
The product Bayes factor (PBF) synthesizes evidence for an informative hypothesis across heterogeneous replication studies. It can be used when fixed- or random-effects meta-analysis falls short, for example, when effect sizes are incomparable and cannot be pooled, or when studies diverge substantially in the populations, study designs, and measures used. The PBF shines as a solution for small-sample meta-analyses, where the number of between-study differences is often large relative to the number of studies, precluding the use of meta-regression to account for these differences. Users should be mindful that the PBF answers a qualitatively different research question than other evidence synthesis methods: whereas fixed-effect meta-analysis estimates the size of a population effect, the PBF quantifies the extent to which an informative hypothesis is supported in all included studies. This tutorial paper showcases the user-friendly PBF functionality within the bain R package. This new implementation of an existing method was validated in a simulation study, available in an online supplement. Results showed that the PBF had high overall accuracy, due to greater sensitivity and lower specificity, compared to random-effects meta-analysis, individual participant data meta-analysis, and vote counting. Tutorials demonstrate applications of the method on meta-analytic and individual participant data. The example datasets, based on published research, are included in bain so readers can reproduce the examples and apply the code to their own data. The PBF is a promising method for synthesizing evidence for informative hypotheses across conceptual replications that are not suitable for conventional meta-analysis.
- Published
- 2024
12. Conducting Power Analysis for Meta-Analysis with Dependent Effect Sizes: Common Guidelines and an Introduction to the POMADE R Package
- Author
- Mikkel Helding Vembye, James Eric Pustejovsky, and Therese Deocampo Pigott
- Abstract
Sample size and statistical power are important factors to consider when planning a research synthesis. Power analysis methods have been developed for fixed effect or random effects models, but until recently these methods were limited to simple data structures with a single, independent effect per study. Recent work has provided power approximation formulas for meta-analyses involving studies with multiple, dependent effect size estimates, which are common in syntheses of social science research. Prior work focused on developing and validating the approximations but did not address the practical challenges encountered in applying them when planning a synthesis involving dependent effect sizes. We aim to facilitate the application of these recent developments by providing practical guidance on how to conduct power analysis for planning a meta-analysis of dependent effect sizes and by introducing a new R package, "POMADE," designed for this purpose. We present a comprehensive overview of resources for finding information about the study design features and model parameters needed to conduct power analysis, along with detailed worked examples using the POMADE package. For presenting power analysis findings, we emphasize graphical tools that can depict power under a range of plausible assumptions and introduce a novel plot, the traffic light power plot, for conveying the degree of certainty in one's assumptions.
- Published
- 2024
13. Uncertain about Uncertainty in Matching-Adjusted Indirect Comparisons? A Simulation Study to Compare Methods for Variance Estimation
- Author
- Conor O. Chandler and Irina Proskorovsky
- Abstract
In health technology assessment, matching-adjusted indirect comparison (MAIC) is the most common method for pairwise comparisons that control for imbalances in baseline characteristics across trials. One of the primary challenges in MAIC is the need to properly account for the additional uncertainty introduced by the matching process. Limited evidence and guidance are available on variance estimation in MAICs. Therefore, we conducted a comprehensive Monte Carlo simulation study to evaluate the performance of different statistical methods across 108 scenarios. Four general approaches for variance estimation were compared in both anchored and unanchored MAICs of binary and time-to-event outcomes: (1) conventional estimators (CE) using raw weights; (2) CE using weights rescaled to the effective sample size (ESS); (3) robust sandwich estimators; and (4) bootstrapping. Several variants of sandwich estimators and bootstrap methods were tested. Performance was quantified on the basis of empirical coverage probabilities for 95% confidence intervals and variability ratios. Variability was underestimated by CE + raw weights when population overlap was poor or moderate. Despite several theoretical limitations, CE + ESS weights accurately estimated uncertainty across most scenarios. Original implementations of sandwich estimators had a downward bias in MAICs with a small ESS, and finite sample adjustments led to marked improvements. Bootstrapping was unstable if population overlap was poor and the sample size was limited. All methods produced valid coverage probabilities and standard errors in cases of strong population overlap. Our findings indicate that the sample size, population overlap, and outcome type are important considerations for variance estimation in MAICs.
- Published
- 2024
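The "weights rescaled to the effective sample size" approach the abstract compares rests on the Kish effective sample size formula, sketched here (hypothetical weights, not trial data; function names are assumptions):

```python
# Kish effective sample size: ESS = (sum w)^2 / sum w^2.
# Highly variable matching weights shrink the ESS well below the raw n.
def effective_sample_size(weights):
    """Effective sample size of a weighted sample."""
    s = sum(weights)
    return s * s / sum(w * w for w in weights)

def rescale_to_ess(weights):
    """Rescale weights so they sum to the ESS rather than the raw n."""
    ess = effective_sample_size(weights)
    total = sum(weights)
    return [w * ess / total for w in weights]

w = [0.2, 0.5, 1.0, 1.0, 2.3]   # hypothetical MAIC weights for n = 5
print(round(effective_sample_size(w), 2))
```

With equal weights the ESS equals the raw sample size; the more unequal the weights (i.e., the poorer the population overlap), the smaller the ESS, which is why the abstract flags small-ESS settings as the difficult ones.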
14. Photovoice: Methodological Insights from a Multi-Site Online Design
- Author
- Michelle C. Pasco, Anais Roque, Brittany Romanello, and Emir Estrada
- Abstract
Photovoice involves respondents taking photographs of their environment to promote critical discussions and reflect on their experiences. Photovoice empowers marginalized communities and serves to reach policymakers. The Arizona Youth Identity Project (AZYIP) used photovoice with an innovative approach in a multisite research design with a large sample size and completely online research implementation using video conferencing, mobile phones, and video messages. We outline our process for other researchers interested in utilizing this dynamic method. We also reflect on the challenges and opportunities of engaging in this research design for future projects.
- Published
- 2024
15. Estimating the Size of the Target Population in Data Limited Settings
- Author
- Jamelia Harris
- Abstract
Not knowing the population size is a common problem in data-limited contexts. Drawing on work in Sierra Leone, this short take outlines a four-step solution to this problem: (1) estimate the population size using expert interviews; (2) verify estimates using interviews with participants sampled; (3) triangulate using secondary data; and (4) reconfirm using focus group discussions.
- Published
- 2024
16. Comparing the Efficacy of Fixed-Effects and MAIHDA Models in Predicting Outcomes for Intersectional Social Strata
- Author
- Ben Van Dusen, Heidi Cian, Jayson Nissen, Lucy Arellano, and Adrienne D. Woods
- Abstract
This investigation examines the efficacy of multilevel analysis of individual heterogeneity and discriminatory accuracy (MAIHDA) over fixed-effects models when performing intersectional studies. The research questions are as follows: (1) What are typical strata representation rates and outcomes on physics research-based assessments? (2) To what extent do MAIHDA models create more accurate predicted strata outcomes than fixed-effects models? and (3) To what extent do MAIHDA models allow the modeling of smaller strata sample sizes? We simulated 3,000 data sets based on real-world data from 5,955 students on the LASSO platform. We found that MAIHDA created more accurate and precise predictions than fixed-effects models. We also found that using MAIHDA could allow researchers to disaggregate their data further, creating smaller group sample sizes while maintaining more accurate findings than fixed-effects models. We recommend using MAIHDA over fixed-effects models for intersectional investigations.
- Published
- 2024
17. Sample Size Calculation and Optimal Design for Multivariate Regression-Based Norming
- Author
- Francesco Innocenti, Math J. J. M. Candel, Frans E. S. Tan, and Gerard J. P. van Breukelen
- Abstract
Normative studies are needed to obtain norms for comparing individuals with the reference population on relevant clinical or educational measures. Norms can be obtained in an efficient way by regressing the test score on relevant predictors, such as age and sex. When several measures are normed with the same sample, a multivariate regression-based approach must be adopted for at least two reasons: (1) to take into account the correlations between the measures of the same subject, in order to test certain scientific hypotheses and to reduce misclassification of subjects in clinical practice, and (2) to reduce the number of significance tests involved in selecting predictors for the purpose of norming, thus preventing the inflation of the type I error rate. A new multivariate regression-based approach is proposed that combines all measures for an individual through the Mahalanobis distance, thus providing an indicator of the individual's overall performance. Furthermore, optimal designs for the normative study are derived under five multivariate polynomial regression models, assuming multivariate normality and homoscedasticity of the residuals, and efficient robust designs are presented in case of uncertainty about the correct model for the analysis of the normative sample. Sample size calculation formulas are provided for the new Mahalanobis distance-based approach. The results are illustrated with data from the Maastricht Aging Study (MAAS).
- Published
- 2024
18. Detecting Careless Responding in Multidimensional Forced-Choice Questionnaires
- Author
- Rebekka Kupffer, Susanne Frick, and Eunike Wetzel
- Abstract
The multidimensional forced-choice (MFC) format is an alternative to rating scales in which participants rank items according to how well the items describe them. Currently, little is known about how to detect careless responding in MFC data. The aim of this study was to adapt a number of indices used for rating scales to the MFC format and additionally develop several new indices that are unique to the MFC format. We applied these indices to a data set from an online survey (N = 1,169) that included a series of personality questionnaires in the MFC format. The correlations among the careless responding indices were somewhat lower than those published for rating scales. Results from a latent profile analysis suggested that the majority of the sample (about 76-84%) did not respond carelessly, although the ones who did were characterized by different levels of careless responding. In a simulation study, we simulated different careless responding patterns and varied the overall proportion of carelessness in the samples. With one exception, the indices worked as intended conceptually. Taken together, the results suggest that careless responding also plays an important role in the MFC format. Recommendations on how it can be addressed are discussed.
- Published
- 2024
19. Effects of the Quantity and Magnitude of Cross-Loading and Model Specification on MIRT Item Parameter Recovery
- Author
- Mostafa Hosseinzadeh and Ki Lynn Matlock Cole
- Abstract
In real-world situations, multidimensional data may appear on large-scale tests or psychological surveys. The purpose of this study was to investigate the effects of the quantity and magnitude of cross-loadings and model specification on item parameter recovery in multidimensional Item Response Theory (MIRT) models, especially when the model was misspecified as a simple structure, ignoring the quantity and magnitude of cross-loading. A simulation study that replicated this scenario was designed to manipulate the variables that could potentially influence the precision of item parameter estimation in MIRT models. Item parameters were estimated using marginal maximum likelihood, utilizing the expectation-maximization algorithm. A compensatory two-parameter logistic MIRT model with two dimensions and dichotomous item responses was used to simulate and calibrate the data for each combination of conditions across 500 replications. The results of this study indicated that ignoring the quantity and magnitude of cross-loading and model specification resulted in inaccurate and biased item discrimination parameter estimates. As the quantity and magnitude of cross-loading increased, the root mean square error and bias estimates of item discrimination worsened.
- Published
- 2024
20. Extended-Release Mixed Amphetamine Salts for Comorbid Adult Attention-Deficit/Hyperactivity Disorder and Cannabis Use Disorder: A Pilot, Randomized Double-Blind, Placebo-Controlled Trial
- Author
- Frances R. Levin, John J. Mariani, Martina Pavlicova, C. Jean Choi, Cale Basaraba, Amy L. Mahony, Daniel J. Brooks, Christina A. Brezing, and Nasir Naqvi
- Abstract
Objective: To determine if treatment of co-occurring adult ADHD and cannabis use disorder (CUD) with extended-release mixed amphetamine salts (MAS-ER) would be effective at improving ADHD symptoms and promoting abstinence. Method: A 12-week randomized, double-blind, two-arm pilot feasibility trial of adults with comorbid ADHD and CUD (n = 28) comparing MAS-ER (80 mg) to placebo. Main outcomes: ADHD: ≥30% symptom reduction, measured by the Adult ADHD Investigator Symptom Rating Scale (AISRS). CUD: abstinence during the last 2 observed weeks of the maintenance phase. Results: Overall, the medication was well-tolerated. There was no significant difference in ADHD symptom reduction (MAS-ER: 83.3%; placebo: 71.4%; p = 0.65) or cannabis abstinence (MAS-ER: 15.4%; placebo: 0%; p = 0.27). The MAS-ER group showed a significant decrease in weekly cannabis use days over time compared to placebo (p < 0.0001). Conclusions: MAS-ER was generally well-tolerated. The small sample size precluded a determination of MAS-ER's superiority in reducing ADHD symptoms or promoting abstinence. Notably, MAS-ER significantly reduced weekly days of use over time.
- Published
- 2024
21. Design and Analysis of Cluster Randomized Trials
- Author
- Wei Li, Yanli Xie, Dung Pham, Nianbo Dong, Jessaca Spybrook, and Benjamin Kelcey
- Abstract
Cluster randomized trials (CRTs) are commonly used to evaluate the causal effects of educational interventions, where entire clusters (e.g., schools) are randomly assigned to treatment or control conditions. This study introduces statistical methods for designing and analyzing two-level (e.g., students nested within schools) and three-level (e.g., students nested within classrooms nested within schools) CRTs. Specifically, we utilize hierarchical linear models (HLMs) to account for the dependency of the intervention participants within the same clusters, estimating the average treatment effects (ATEs) of educational interventions and other effects of interest (e.g., moderator and mediator effects). We demonstrate methods and tools for sample size planning and statistical power analysis. Additionally, we discuss common challenges and potential solutions in the design and analysis phases, including the effects of omitting one level of clustering, non-compliance, heterogeneous variance, blocking, threats to external validity, and cost-effectiveness of the intervention. We conclude with some practical suggestions for CRT design and analysis, along with recommendations for further readings.
- Published
- 2024
22. A Sample Size Formula for Network Scale-Up Studies
- Author
- Nathaniel Josephs, Dennis M. Feehan, and Forrest W. Crawford
- Abstract
The network scale-up method (NSUM) is a survey-based method for estimating the number of individuals in a hidden or hard-to-reach subgroup of a general population. In NSUM surveys, sampled individuals report how many others they know in the subpopulation of interest (e.g. "How many sex workers do you know?") and how many others they know in subpopulations of the general population (e.g. "How many bus drivers do you know?"). NSUM is widely used to estimate the size of important sociological and epidemiological risk groups, including men who have sex with men, sex workers, HIV+ individuals, and drug users. Unlike several other methods for population size estimation, NSUM requires only a single random sample and the estimator has a conveniently simple form. Despite its popularity, there are no published guidelines for the minimum sample size calculation to achieve a desired statistical precision. Here, we provide a sample size formula that can be employed in any NSUM survey. We show analytically and by simulation that the sample size controls error at the nominal rate and is robust to some forms of network model mis-specification. We apply this methodology to study the minimum sample size and relative error properties of several published NSUM surveys.
- Published
- 2024
23. A Systematic Literature Review: The Self-Concept of Students with Learning Disabilities
- Author
- Ayse Dilsad Yakut and Savas Akgul
- Abstract
Since the learning disability (LD) population comprises the largest group receiving special education services, there is a need for research examining the self-concept of this population at a global level. This systematic literature review synthesized 20 years of quantitative research (k = 16) about the self-concept of students with LD. An overarching theme was that the diagnosis of LD relied on divergent criteria among the studies reviewed. While academic self-concept was the center of the research, results indicated that, across domains, students with LD had a lower level of self-concept. To gain a deeper understanding of the phenomenon, an instrument specifically designed for assessing the self-concept of students with LD is needed. Limitations of the study and implications for research and practice are discussed.
- Published
- 2024
24. Item Parameter Recovery: Sensitivity to Prior Distribution
- Author
-
Christine E. DeMars and Paulius Satkus
- Abstract
Marginal maximum likelihood, a common estimation method for item response theory models, is not inherently a Bayesian procedure. However, due to estimation difficulties, Bayesian priors are often applied to the likelihood when estimating 3PL models, especially with small samples. Little focus has been placed on choosing the priors for marginal maximum estimation. In this study, using sample sizes of 1,000 or smaller, not using priors often led to extreme, implausible parameter estimates. Applying prior distributions to the c-parameters alleviated the estimation problems with samples of 500 or more; for the samples of 100, priors on both the a-parameters and c-parameters were needed. Estimates were biased when the mode of the prior did not match the true parameter value, but the degree of the bias did not depend on the strength of the prior unless it was extremely informative. The root mean squared error (RMSE) of the a-parameters and b-parameters did not depend greatly on either the mode or the strength of the prior unless it was extremely informative. The RMSE of the c-parameters, like the bias, depended on the mode of the prior for c.
- Published
- 2024
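The bias and RMSE criteria used in this abstract are standard across item parameter recovery studies; for reference, a minimal sketch of the two metrics (names are my own):

```python
import numpy as np

def bias(estimates, true_value):
    # Mean signed deviation of the estimates from the true parameter
    return np.mean(np.asarray(estimates, float) - true_value)

def rmse(estimates, true_value):
    # Root mean squared error: combines bias and sampling variance
    return np.sqrt(np.mean((np.asarray(estimates, float) - true_value) ** 2))
```

An estimator can have near-zero bias yet a large RMSE, which is why recovery studies such as this one report both.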
25. Examining Faculty's Mastery of Subject Matter: A Student-Centered Analysis
- Author
-
Bueno, David Cababaro
- Abstract
The Doctor of Education (EdD) degree is vital in preparing individuals for leadership positions in educational settings. A key element of the EdD program is the faculty's proficiency in their specific fields of expertise. Assessing the faculty's mastery of subject matter is crucial for guaranteeing the quality of education and offering students a comprehensive and fulfilling learning experience. This case analysis examined the evaluation of EdD faculty's mastery of subject matter from the student's standpoint. In conclusion, students' ratings of their EdD professors' depth of knowledge and expertise varied, with some praising their professors for exceptional expertise and others expressing mixed opinions or dissatisfaction. However, it is important to consider that these ratings are subjective and may not provide a complete assessment of a professor's expertise. Factors such as sample size and specific context can influence students' perceptions. A more comprehensive evaluation that includes multiple students' opinions, objective measures of expertise, and overall learning outcomes would offer a more accurate assessment.
- Published
- 2023
26. Leadership Growth Over Multiple Semesters in Project-Based Student Teams Embedded in Faculty Research (Vertically Integrated Projects)
- Author
-
Julia Sonnenberg-Klein and Edward J. Coyle
- Abstract
Contribution: This longitudinal study modeled student leadership growth in a course sequence supporting long-term, large-scale, multidisciplinary projects embedded in faculty research. Students (half from computer science, computational media, electrical engineering, and computer engineering) participated for 1-4 semesters. Background: Project-based learning (PBL) is used widely in higher education. It is used in industry for leadership development, but leadership development in PBL has not been explored in higher education. A preliminary analysis implied leadership growth through the third semester of participation, but the design did not control for attrition. Research Questions: At the student level, how do leadership role ratings change over multiple semesters of participation? Do first (and second) semester ratings differ by the number of semesters students eventually participate? Methodology: The study involved two peer evaluation questions on 1) the degree to which students coordinated the team's work and 2) the degree to which they served as technical/content area leaders. Analysis employed analysis of variance to examine attrition by initial ratings (N = 1045) and multilevel growth modeling to study change over time (N = 585). A strength of using peer evaluations is the large sample size, but a weakness is that the tool was developed for student assessment and not educational research. The study did not control for participation in leadership programs outside the course. Findings: On average, individual leadership role ratings increased each semester through the third semester of participation. Ratings of students who left the program after 1 or 2 semesters did not differ from ratings for those who participated longer.
- Published
- 2024
27. DINA-BAG: A Bagging Algorithm for DINA Model Parameter Estimation in Small Samples
- Author
-
David Arthur and Hua-Hua Chang
- Abstract
Cognitive diagnosis models (CDMs) are assessment tools that provide valuable formative feedback about skill mastery at both the individual and population level. Recent work has explored the performance of CDMs with small sample sizes but has focused solely on the estimates of individual profiles. The current research focuses on obtaining accurate estimates of skill mastery at the population level. We introduce a novel algorithm (bagging algorithm for deterministic inputs noisy "and" gate) that is inspired by ensemble learning methods in the machine learning literature and produces more stable and accurate estimates of the population skill mastery profile distribution for small sample sizes. Using both simulated data and real data from the Examination for the Certificate of Proficiency in English, we demonstrate that the proposed method outperforms other methods on several metrics in a wide variety of scenarios.
- Published
- 2024
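The bagging idea is more general than its DINA application: resample respondents with replacement, estimate on each resample, and average the estimates. A sketch with a placeholder estimator standing in for the DINA population-profile step (this is not the paper's DINA-BAG algorithm, only the ensemble scaffolding around it):

```python
import numpy as np

def bagged_profile_distribution(estimate_fn, data, n_bags=50, seed=0):
    """Bagging sketch: average estimates of a profile distribution over
    bootstrap resamples of the respondents (rows of `data`).

    `estimate_fn` is any estimator returning a probability vector; here it
    stands in for a DINA population skill-mastery profile estimator.
    """
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    # Draw n_bags bootstrap samples of the rows and estimate on each
    ests = [estimate_fn(data[rng.integers(0, n, n)]) for _ in range(n_bags)]
    # Averaging over resamples stabilizes small-sample estimates
    return np.mean(ests, axis=0)
```

With a stable estimator and large n, bagging changes little; its benefit appears when the base estimator is highly variable in small samples.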
28. Revisiting the Usage of Alpha in Scale Evaluation: Effects of Scale Length and Sample Size
- Author
-
Leifeng Xiao, Kit-Tai Hau, and Melissa Dan Wang
- Abstract
Short scales are time-efficient for participants and cost-effective in research. However, researchers often mistakenly expect short scales to have the same reliability as long ones without considering the effect of scale length. We argue that applying a universal benchmark for alpha is problematic as the impact of low-quality items is greater on shorter scales. In this study, we propose simple guidelines for item reduction using the "alpha-if-item-deleted" procedure in scale construction. An item can be removed if alpha increases or decreases by less than 0.02, especially for short scales. Conversely, an item should be retained if alpha decreases by more than 0.04 upon its removal. For reliability benchmarks, 0.80 is relatively safe in most conditions, but higher benchmarks are recommended for longer scales and smaller sample sizes. Supplementary analyses, including item content, face validity, and content coverage, are critical to ensure scale quality.
- Published
- 2024
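The "alpha-if-item-deleted" procedure follows directly from the definition of coefficient alpha. A minimal sketch (applying the 0.02/0.04 thresholds from the guidelines is left to the user):

```python
import numpy as np

def cronbach_alpha(items):
    # items: (n_persons, n_items) score matrix
    items = np.asarray(items, float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total scores
    return k / (k - 1) * (1 - item_vars / total_var)

def alpha_if_item_deleted(items):
    # Alpha recomputed with each item removed in turn; compare each value
    # to the full-scale alpha when deciding whether to drop an item
    k = items.shape[1]
    return np.array([cronbach_alpha(np.delete(items, j, axis=1))
                     for j in range(k)])
```

As the abstract cautions, these statistics should be weighed alongside item content and content coverage, not applied mechanically.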
29. An Evaluation of Fit Indices Used in Model Selection of Dichotomous Mixture IRT Models
- Author
-
Sedat Sen and Allan S. Cohen
- Abstract
A Monte Carlo simulation study was conducted to compare fit indices used for detecting the correct latent class in three dichotomous mixture item response theory (IRT) models. Ten indices were considered: Akaike's information criterion (AIC), the corrected AIC (AICc), Bayesian information criterion (BIC), consistent AIC (CAIC), Draper's information criterion (DIC), sample size adjusted BIC (SABIC), relative entropy, the integrated classification likelihood criterion (ICL-BIC), the adjusted Lo-Mendell-Rubin (LMR), and Vuong-Lo-Mendell-Rubin (VLMR). The accuracy of the fit indices was assessed for correct detection of the number of latent classes for different simulation conditions including sample size (2,500 and 5,000), test length (15, 30, and 45), mixture proportions (equal and unequal), number of latent classes (2, 3, and 4), and latent class separation (no-separation and small separation). Simulation study results indicated that as the number of examinees or number of items increased, correct identification rates also increased for most of the indices. Correct identification rates by the different fit indices, however, decreased as the number of estimated latent classes or parameters (i.e., model complexity) increased. Results were good for BIC, CAIC, DIC, SABIC, ICL-BIC, LMR, and VLMR, and the relative entropy index tended to select correct models most of the time. Consistent with previous studies, AIC and AICc showed poor performance. Most of these indices had limited utility for three-class and four-class mixture 3PL model conditions.
- Published
- 2024
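Several of the compared indices are simple penalized functions of the maximized log-likelihood. A sketch of four of them (relative entropy, ICL-BIC, LMR, and VLMR need more than the log-likelihood and are omitted; the function name is my own):

```python
import numpy as np

def fit_indices(loglik, n_params, n):
    """Common information criteria for model selection.

    loglik:   maximized log-likelihood of the fitted model
    n_params: number of estimated parameters
    n:        sample size
    Lower values indicate better (penalized) fit.
    """
    aic = -2 * loglik + 2 * n_params
    bic = -2 * loglik + n_params * np.log(n)
    caic = -2 * loglik + n_params * (np.log(n) + 1)        # consistent AIC
    sabic = -2 * loglik + n_params * np.log((n + 2) / 24)  # sample-size-adjusted BIC
    return {"AIC": aic, "BIC": bic, "CAIC": caic, "SABIC": sabic}
```

The differing penalties explain the simulation results: AIC's penalty does not grow with n, so it tends to overselect classes, while BIC and CAIC penalize complexity more heavily.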
30. Using Multiple Imputation to Account for the Uncertainty Due to Missing Data in the Context of Factor Retention
- Author
-
Yan Xia and Selim Havan
- Abstract
Although parallel analysis has been found to be an accurate method for determining the number of factors in many conditions with complete data, its application under missing data is limited. The existing literature recommends that, after using an appropriate multiple imputation method, researchers either apply parallel analysis to every imputed data set and use the number of factors suggested by most of the data copies or average the correlation matrices across all data copies, followed by applying the parallel analysis to the average correlation matrix. Both approaches for pooling the results provide a single suggested number without reflecting the uncertainty introduced by missing values. The present study proposes the use of an alternative approach, which calculates the proportion of imputed data sets that result in k (k = 1, 2, 3, ...) factors. This approach will inform applied researchers of the degree of uncertainty due to the missingness. Results from a simulation experiment show that the proposed method can more likely suggest the correct number of factors when missingness contributes to a large amount of uncertainty.
- Published
- 2024
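The proposed pooling approach is easy to sketch: run a factor-retention procedure on each imputed data set and tabulate the proportion suggesting each number of factors. Here Horn's parallel analysis stands in for the retention step; implementation details (mean-eigenvalue criterion, function names) are my own:

```python
import numpy as np

def pa_n_factors(data, n_sims=100, seed=0):
    # Horn's parallel analysis: retain leading factors whose observed
    # eigenvalues exceed the mean eigenvalues of correlation matrices
    # computed from random normal data of the same shape
    rng = np.random.default_rng(seed)
    n, p = data.shape
    obs = np.sort(np.linalg.eigvalsh(np.corrcoef(data.T)))[::-1]
    sim = np.mean([np.sort(np.linalg.eigvalsh(
        np.corrcoef(rng.standard_normal((n, p)).T)))[::-1]
        for _ in range(n_sims)], axis=0)
    greater = obs > sim
    # Count consecutive leading eigenvalues above the random baseline
    return int(np.argmin(greater)) if not greater.all() else p

def factor_count_proportions(imputed_datasets):
    # The abstract's proposal: report the proportion of imputed data sets
    # suggesting each k, rather than a single pooled answer
    counts = [pa_n_factors(d) for d in imputed_datasets]
    ks, freq = np.unique(counts, return_counts=True)
    return dict(zip(ks.tolist(), (freq / len(counts)).tolist()))
```

A result such as {1: 0.7, 2: 0.3} conveys the uncertainty due to missingness that a single pooled number hides.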
31. Utilization of Differentiated Instruction in K-12 Classrooms: A Systematic Literature Review (2000-2022)
- Author
-
Linlin Hu
- Abstract
Differentiated instruction (DI) is a beneficial approach to addressing students' diverse learning needs, abilities, and interests to ensure that each student has the opportunity to make academic progress. To answer the question of how teachers utilize DI in K-12 classrooms, this systematic review was based on 61 empirical studies on DI published between 2000 and 2022. It examined the current status and trends of implementing DI in K-12 education and integrated various factors involved in the process of DI, including educational levels, subjects, student difference analysis, instructional methods, content, tools, assessment methods, and instructional effectiveness. The findings indicated that (1) DI was most commonly used in primary school mathematics and language classrooms, with the majority of studies having sample sizes exceeding 100 and lasting for more than 6 months; (2) The most frequently employed form of DI was ability grouping, often grouped based on academic achievement; (3) Information technology tools and resources can empower differentiated instruction; (4) Most studies utilized standardized tests, questionnaires, and scales as evaluation tools, with a focus on the impact of DI on students' academic achievement and skills; and (5) The effectiveness of DI was controversial and appeared to be influenced by multiple factors, including the instructional methods employed. In response to these findings, the study introduces a comprehensive DI model. This model, rooted in the perspective of instructional design, elucidates the interconnected factors of DI. It serves as a valuable reference for the future design and implementation of DI, offering a practical guide for educators aiming to create inclusive and effective learning environments.
- Published
- 2024
32. An Interference Model for Visual and Verbal Working Memory
- Author
-
Klaus Oberauer and Hsuan-Yu Lin
- Abstract
Research on working memory (WM) has followed two largely independent traditions: One concerned with memory for sequentially presented lists of discrete items, and the other with short-term maintenance of simultaneously presented arrays of objects with simple, continuously varying features. Here we present a formal model of WM, the interference model (IM), that explains benchmark findings from both traditions: The shape of the error distribution from continuous reproduction of visual features, and how it is affected by memory set size; the effects of serial position for sequentially presented items, the effect of output position, and the intrusion of nontargets as a function of their distance from the target in space and in time. We apply the model to two experiments combining features of popular paradigms from both traditions: Lists of colors (Experiment 1) or of nonwords (Experiment 2) are presented sequentially and tested through selection of the target from a set of candidates, ordered by their similarity. The core assumptions of the IM are: Contents are encoded into WM through temporary bindings to contexts that serve as retrieval cues to access the contents. Bindings have limited precision on the context and the content dimension. A subset of the memory set--usually one item and its context--is maintained in a focus of attention with high precision. Successive events in an episode are encoded with decreasing strength, generating a primacy gradient. With each encoded event, automatic updating of WM reduces the strength of preceding memories, creating a recency gradient and output interference.
- Published
- 2024
33. The Analysis of Whose Verbal Behavior?
- Author
-
Paige Ellington and Tom Cariveau
- Abstract
Recent reviews of behavior analytic journals suggest that participant demographics are inadequately described. These reviews have been limited to brief periods across several journals, emphasized specific variables (e.g., socioeconomic status), or only included specific populations. The current scoping review included all published articles in "The Analysis of Verbal Behavior" from 1982-2020. Six demographic variables were coded for 1888 participants across 226 articles. Despite small sample sizes (i.e., fewer than six participants in 62.3% of studies), only age (85.4%) and gender identity (71.6%) were reported for the majority of participants. Socioeconomic status, race/ethnicity, and primary language were reported for fewer than 20% of participants. Over time, the number of demographic variables reported showed a slight increasing trend, although considerable variability was observed across years. These findings suggest that editors and reviewers must consider what constitutes acceptable participant characterization. Researchers might also be emboldened to extend their work to populations currently underrepresented in the journal.
- Published
- 2024
34. Sample Size and Item Parameter Estimation Precision When Utilizing the Masters' Partial Credit Model
- Author
-
Custer, Michael and Kim, Jongpil
- Abstract
This study utilizes an analysis of diminishing returns to examine the relationship between sample size and item parameter estimation precision when utilizing the Masters' Partial Credit Model for polytomous items. Item data from the standardization of the Battelle Developmental Inventory, 3rd Edition were used. Each item was scored with a "0" for no credit, a "1" for partial credit and a "2" for full credit. Two conditions were studied; the first used 40 items and the second 20 items. Item parameter estimates were examined relative to their "true" values by evaluating the decline in root mean squared error (RMSE) and the number of outliers as the level of sample size increased. The generalizability of these results may be limited due to this study's range of item score points, the number of items, and the range of the underlying scale. However, under both conditions, the majority of RMSE and outlier-reduction improvement was achieved once a sample size of 900 examinees had been reached. Diminishing returns were most notable with incremental increases to sample size beyond 1,000 examinees. Practitioners often encounter great variability in the number of items, item types, the number of item score points, the range of an assessment's underlying scale and the stakes associated with calibration objectives. Given this variability, practitioners might be encouraged to first simulate item data and vary the level of sample size to evaluate estimation precision across the scale through a similar review of RMSE and "outliers" as the level of sample size increases. Besides the consideration of sample acquisition costs, an evaluation of "diminishing returns" could aid practitioners in their selection of an appropriate sample size.
- Published
- 2023
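The diminishing-returns pattern the authors describe reflects a general property of estimation error, which shrinks roughly as 1/sqrt(n): quadrupling the sample only halves the RMSE. A Monte Carlo sketch of the simulation approach they recommend, with a plain proportion estimate standing in for a Partial Credit Model item parameter (all settings are illustrative):

```python
import numpy as np

def rmse_by_sample_size(p_true=0.6, sizes=(100, 300, 900, 2700),
                        reps=2000, seed=0):
    # Monte Carlo sketch of "diminishing returns": estimate a simple
    # proportion at increasing n and track how RMSE declines
    rng = np.random.default_rng(seed)
    out = {}
    for n in sizes:
        est = rng.binomial(n, p_true, size=reps) / n  # reps estimates at size n
        out[n] = float(np.sqrt(np.mean((est - p_true) ** 2)))
    return out
```

Plotting the returned RMSEs against n makes the flattening of the curve past several hundred examinees visible, mirroring the study's finding that gains beyond roughly 900-1,000 examinees are small.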
35. Evaluating the Equality of Regression Coefficients for Multiple Group Comparisons: A Case of English Learner Subgroups by Home Languages
- Author
-
Yoo, Hanwook, Wolf, Mikyung Kim, and Ballard, Laura D.
- Abstract
As the theme of the 2022 annual meeting of the American Educational Research Association, cultivating equitable education systems has gained renewed attention amid an increasingly diverse society. However, systemic inequalities persist for traditionally underserved student populations. As a way to better address diverse students' needs, it is of critical importance to understand different subgroups' performances. In the educational measurement field, evaluating the differences among multiple groups is an important consideration in addressing fairness issues for diverse groups of students. This article offers one technique to do so. It demonstrates how commonly-used multiple regression analysis can be applied to evaluate the equivalence of predictive structure across multiple groups in place of the factor analytic approach that requires a relatively large sample size per subgroup and strong assumptions. The technique is utilized in examining the relationship between English language proficiency and academic performance of English learners in one state when the subgroups are categorized by home language. The results showed statistically significant group differences between the reference group (Spanish-speaking ELs) and other focal groups (different home-language ELs) in various levels of comparisons (model fit, model structure, and individual predictor weights). The strengths and limitations of a proposed multiple group regression (MGR) approach are discussed in the educational research context.
- Published
- 2023
36. Effect of Inclusive Practices on Attitudes: A Meta-Analysis Study
- Author
-
Günay, Nazli Sila Yerliyurt, Elaldi, Senel, and Çifçi, Mehtap
- Abstract
With the study reported on here we aimed to synthesise recent quantitative research to specify the effect of inclusive practices on attitudes via meta-analysis. Since attitudes have an integral role in the performance of inclusion programmes, within the scope of the study, the cumulative findings of experimental studies conducted on attitudes towards inclusive practices were reinterpreted. To this end, studies with a pretest-posttest control group carried out between 2000 and 2021 were scanned from databases according to the inclusion criteria. After the search process, 23 studies that met the inclusion criteria were selected from 54 studies. The overall sample size of the studies included 2,016 participants. The mean effect size calculations, heterogeneity test, moderator analyses and publication bias analyses were conducted through a comprehensive meta-analysis programme (CMA 3.0). The findings that were discussed in accordance with the random effects model (REM) suggest that inclusive practices have a positive effect on attitudes and this effect is at a large level (g = 1.328) with respect to Cohen's classification. This result indicates that inclusive practices have been strongly influenced by positive attitudes to yield favourable results. According to the moderator analyses, the highest effect sizes were found in the teachers' group (g = 1.880) according to group level and in primary education (g = 1.374) according to school grades. The attitudes towards inclusion have been strongly influenced by teachers' beliefs about the power of their teaching. More empirical studies on inclusive practices are recommended.
- Published
- 2023
37. Using Auxiliary Data to Boost Precision in the Analysis of A/B Tests on an Online Educational Platform: New Data and New Results
- Author
-
Sales, Adam C., Prihar, Ethan B., Gagnon-Bartsch, Johann A., and Heffernan, Neil T.
- Abstract
Randomized A/B tests within online learning platforms represent an exciting direction in learning sciences. With minimal assumptions, they allow causal effect estimation without confounding bias and exact statistical inference even in small samples. However, often experimental samples and/or treatment effects are small, A/B tests are underpowered, and effect estimates are overly imprecise. Recent methodological advances have shown that power and statistical precision can be substantially boosted by coupling design-based causal estimation to machine-learning models of rich log data from historical users who were not in the experiment. Estimates using these techniques remain unbiased and inference remains exact without any additional assumptions. This paper reviews those methods and applies them to a new dataset including over 250 randomized A/B comparisons conducted within ASSISTments, an online learning platform. We compare results across experiments using four novel deep-learning models of auxiliary data and show that incorporating auxiliary data into causal estimates is roughly equivalent to increasing the sample size by 20% on average, or as much as 50-80% in some cases, relative to t-tests, and by about 10% on average, or as much as 30-50%, compared to cutting-edge machine learning unbiased estimates that use only data from the experiments. We show that the gains can be even larger for estimating subgroup effects, hold even when the remnant is unrepresentative of the A/B test sample, and extend to post-stratification population effects estimators.
- Published
- 2023
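The core trick these estimators share is residualization: predict each experimental outcome with a model trained on auxiliary (remnant) data, subtract the prediction, and run the usual difference-in-means on the residuals. Because the predictions do not use treatment assignment, the estimate stays unbiased while its variance shrinks in proportion to the predictions' accuracy. A minimal sketch of this idea (sometimes called "rebar"; the function name and interface are my own, not the paper's code):

```python
import numpy as np

def rebar_ate(y, treat, y_hat):
    """Residualized difference-in-means for a randomized A/B test.

    y:     observed outcomes
    treat: 1 for treatment, 0 for control (randomly assigned)
    y_hat: outcome predictions from a model fit on auxiliary data
           collected OUTSIDE the experiment (the 'remnant')
    """
    y, treat, y_hat = (np.asarray(a, float) for a in (y, treat, y_hat))
    resid = y - y_hat                  # remove predictable variation
    return resid[treat == 1].mean() - resid[treat == 0].mean()
```

If y_hat were computed from the experimental data itself, unbiasedness could be lost; using only historical users is what keeps the inference design-based.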
38. The Power and Type I Error of Wilcoxon-Mann-Whitney, Welch's 't,' and Student's 't' Tests for Likert-Type Data
- Author
-
Simsek, Ahmet Salih
- Abstract
The Likert-type item is the most popular response format for collecting data in social, educational, and psychological studies through scales or questionnaires. However, there is no consensus on whether parametric or non-parametric tests should be preferred when analyzing Likert-type data. This study examined the statistical power of parametric and non-parametric tests when each Likert-type item was analyzed independently in survey studies. The main purpose of the study is to examine the statistical power of the Wilcoxon-Mann-Whitney, Welch's t, and Student's t tests, all pairwise comparison tests, for Likert-type data. For this purpose, a Monte Carlo simulation study was conducted. The statistical significance of the selected tests was examined under the conditions of sample size, group size ratio, and effect size. The results showed that the Wilcoxon-Mann-Whitney test was superior to its counterparts, especially for small samples and unequal group sizes. However, the Student's t-test for Likert-type data had similar statistical power to the Wilcoxon-Mann-Whitney test under conditions of equal group sizes when the sample size was 200 or more. Consistent with the empirical results, practical recommendations were provided for researchers on what to consider when collecting and analyzing Likert-type data.
- Published
- 2023
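A Monte Carlo power comparison of this kind is straightforward to set up: draw Likert-type samples under a known effect, run each test, and count rejections. A compact sketch assuming 5-point items and a crude location shift as the effect; the category probabilities, sample sizes, and function name are invented for illustration and do not reproduce the study's conditions:

```python
import numpy as np
from scipy import stats

def power_sim(n1=30, n2=30, shift=1, reps=1000, alpha=0.05, seed=0):
    # Monte Carlo power (or Type I error, when shift=0) for 5-point
    # Likert-type data compared with three pairwise tests
    rng = np.random.default_rng(seed)
    base = np.array([0.1, 0.2, 0.4, 0.2, 0.1])        # control distribution
    shifted = np.roll(base, shift)                     # mass moved up `shift` categories
    hits = {"wmw": 0, "welch": 0, "student": 0}
    for _ in range(reps):
        a = rng.choice(5, n1, p=base) + 1              # scores 1..5
        b = rng.choice(5, n2, p=shifted) + 1
        hits["wmw"] += stats.mannwhitneyu(a, b, alternative="two-sided").pvalue < alpha
        hits["welch"] += stats.ttest_ind(a, b, equal_var=False).pvalue < alpha
        hits["student"] += stats.ttest_ind(a, b).pvalue < alpha
    return {k: v / reps for k, v in hits.items()}
```

Calling power_sim(shift=0) estimates each test's Type I error rate, which should sit near the nominal alpha; unequal n1 and n2 reproduce the unequal-group-size conditions the study examines.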
39. A Comparison of the Efficacies of Differential Item Functioning Detection Methods
- Author
-
Basman, Munevver
- Abstract
To ensure the validity of the tests is to check that all items have similar results across different groups of individuals. However, differential item functioning (DIF) occurs when the results of individuals with equal ability levels from different groups differ from each other on the same test item. Based on Item Response Theory and Classic Test Theory, there are some methods, with different advantages and limitations to identify items that show DIF. This study aims to compare the performances of five methods for detecting DIF. The efficacies of Mantel-Haenszel (MH), Logistic Regression (LR), Crossing simultaneous item bias test (CSIBTEST), Lord's chi-square (LORD), and Raju's area measure (RAJU) methods are examined considering conditions of the sample size, DIF ratio, and test length. In this study, to compare the detection methods, power and Type I error rates are evaluated using a simulation study with 100 replications conducted for each condition. Results show that LR and MH have the lowest Type I error and the highest power rate in detecting uniform DIF. In addition, CSIBTEST has a similar power rate to MH and LR. Under DIF conditions, sample size, DIF ratio, test length and their interactions affect Type I error and power rates.
- Published
- 2023
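Of the methods compared, Mantel-Haenszel is the simplest to illustrate: examinees are stratified by total score, and a common odds ratio of answering the item correctly is pooled across strata. A minimal sketch (the function name is my own; in DIF reporting the ratio is often transformed to the delta scale as -2.35 ln(OR)):

```python
import numpy as np

def mh_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio across matched score strata.

    tables: iterable of 2x2 arrays [[a, b], [c, d]] with rows =
    reference/focal group and columns = correct/incorrect counts.
    A value near 1 indicates no uniform DIF on the item.
    """
    num = den = 0.0
    for t in tables:
        (a, b), (c, d) = np.asarray(t, float)
        n = a + b + c + d
        num += a * d / n   # weighted evidence the reference group is favored
        den += b * c / n   # weighted evidence the focal group is favored
    return num / den
```

The associated MH chi-square statistic adds a continuity-corrected variance term, but the pooled odds ratio above is the effect size practitioners usually inspect first.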
40. Effects of After-School Programs on Student Cognitive and Non-Cognitive Abilities: A Meta-Analysis Based on 37 Experimental and Quasi-Experimental Studies
- Author
-
Yao, Jing, Yao, Jijun, Li, Peixuan, Xu, Yifan, and Wei, Lai
- Abstract
The after-school program is a crucial initiative for implementing the Double Reduction policy; however, prior research has not provided conclusive evidence on whether extended school hours contribute to students' cognitive and non-cognitive development or on which types of after-school services are more beneficial for student development. This study analyzed 37 after-school programs from 18 publications using meta-analytic techniques, and the results indicated that participation in after-school programs had positive effects on student cognitive and non-cognitive development despite the small effect size (d = 0.327, p < 0.001). The decomposition of the effects of after-school programs revealed that they had modestly positive effects on academic achievement (d = 0.369) and social-emotional competence (d = 0.220). In addition, the analysis of moderating variables revealed that socioeconomic status, educational phase, number of after-school service days per week, sample size, and testing instrument all influenced the after-school program effects. This study concludes, based on the results of the meta-analysis, that there should be a balanced consideration of the development of student cognitive and non-cognitive abilities in planning after-school service, a substantial variety of activities in after-school programs, a flexible adoption of diverse after-school programs, and a reasonable participation frequency in after-school service.
- Published
- 2023
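Random-effects pooling of standardized mean differences, as used in meta-analyses like this one, is commonly implemented with the DerSimonian-Laird estimator of between-study variance. A generic sketch (not CMA's implementation; the function name is my own):

```python
import numpy as np

def dersimonian_laird(d, var):
    """Random-effects pooled effect size (DerSimonian-Laird method).

    d:   per-study effect sizes (e.g., Cohen's d or Hedges' g)
    var: per-study sampling variances of those effects
    """
    d, var = np.asarray(d, float), np.asarray(var, float)
    w = 1 / var                          # fixed-effect (inverse-variance) weights
    d_fe = np.sum(w * d) / np.sum(w)     # fixed-effect pooled estimate
    q = np.sum(w * (d - d_fe) ** 2)      # Cochran's Q heterogeneity statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(d) - 1)) / c)  # between-study variance estimate
    w_re = 1 / (var + tau2)              # random-effects weights
    return np.sum(w_re * d) / np.sum(w_re)
```

When the studies are homogeneous (Q below its degrees of freedom), tau-squared is truncated to zero and the estimate coincides with the fixed-effect result; heterogeneity widens the weights and pulls the pooled estimate toward an unweighted mean.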
41. Use of Generative Adversarial Networks (GANs) in Educational Technology Research
- Author
-
Bethencourt-Aguilar, Anabel, Castellanos-Nieves, Dagoberto, Sosa-Alonso, Juan-José, and Area-Moreira, Manuel
- Abstract
In the context of Artificial Intelligence, Generative Adversarial Nets (GANs) allow the creation and reproduction of artificial data from real datasets. The aims of this work are to verify the equivalence of synthetic data with real data and to explore the possibilities of GANs in educational research. The research methodology begins with the creation of a survey that collects data related to the self-perceptions of university teachers regarding their digital competence and technological-pedagogical knowledge of the content (TPACK model). Once the original dataset is generated, twenty-nine different synthetic samples are created (with an increasing N) using the COPULA-GAN procedure. Finally, a two-stage cluster analysis is applied to verify the interchangeability of the synthetic samples with the original, in addition to extracting descriptive data of the distribution characteristics, thereby checking the similarity of the qualitative results. In the results, qualitatively very similar cluster structures have been obtained in the 150 tests carried out, with a clear tendency to identify three types of teaching profiles, based on their level of technical-pedagogical knowledge of the content. It is concluded that the use of synthetic samples is an interesting way of improving data quality, both for security and anonymization and for increasing sample sizes.
- Published
- 2023
42. Thematic Content Analysis of Postgraduate Theses on Epistemological Beliefs in Science Education: The Türkiye Context
- Author
-
Sevinç Kaçar
- Abstract
The aim of the study was to examine postgraduate theses on epistemological beliefs in science education in Türkiye. Data collected included the publication years, researcher genders, universities, disciplines, aims, methods, sample/study groups, time allocated to the research, and data collection tools. The thematic content analysis method was used in the study. The data were obtained from doctoral and master's theses published up to and including 2022 and held at the CoHE National Thesis Centre. Access was gained to 149 theses dealing with the subject of epistemological beliefs in science education. The theses in the study were classified with reference to the matrix prepared by Ormanci, Çepni, Deveci and Aydin. The data obtained were analysed using content and descriptive analysis methods. The majority of the theses aimed to investigate the effect of a certain learning-teaching method on epistemological beliefs and the relationship between epistemological beliefs and some variables. It was determined that scales and questionnaires were mostly used as data collection tools in the evaluation of epistemological beliefs. There is a need for studies on the effect of current science learning-teaching methods on the development of epistemological beliefs or the relationship between epistemological beliefs and 21st-century skills.
- Published
- 2023
43. Effect of Missing Data on Test Equating Methods Under NEAT Design
- Author
-
Semih Asiret and Seçil Ömür Sünbül
- Abstract
This study aimed to examine the effect of missing data in different patterns and sizes on test equating methods under the NEAT design for different factors. For this purpose, as part of this study, factors such as sample size, average difficulty level difference between the test forms, difference between the ability distribution, missing data rate, and missing data mechanisms were manipulated. The effects of these factors on the equating error of test equating methods (chained-equipercentile equating, Tucker, frequency estimation equating, and Braun-Holland) were investigated. In the study, two separate sets of 10,000 dichotomous data were generated consistent with a 2-parameter logistic model. While generating data, the MCAR and MAR missing data mechanisms were used. All analyses were conducted in R 4.2.2. As a result of the study, it was seen that the RMSE of the equating methods increased significantly as the missing data rate increased. The results indicate that the RMSE of the equating methods with imputed missing data are reduced compared to equating without imputed missing data. Furthermore, the percentage of missing data, along with the difference between ability levels and the average difficulty difference between forms, was found to significantly affect equating errors in the presence of missing data. Although increasing sample size did not have a significant effect on equating error in the presence of missing data, it did lead to more accurate equating when there was no missing data present.
- Published
- 2023
44. Comparison of Cronbach's Alpha and McDonald's Omega for Ordinal Data: Are They Different?
- Author
-
Fatih Orcan
- Abstract
Among reliability coefficients, Cronbach's alpha and McDonald's omega are the most commonly used. Alpha is based on inter-item covariances, while omega is derived from a factor analysis result. This study used simulated ordinal data sets to test whether alpha and omega produce different estimates, comparing their performance across sample size, number of items, and degree of deviation from tau-equivalence. Alpha and omega yielded similar results except under small sample sizes, small numbers of items, and low factor loadings. When the scale contained five or more items and the factor model from which omega was calculated fit the data, omega could be preferred over alpha; moreover, once the number of items exceeded five, the differences between alpha and omega disappeared. Since alpha is easier to calculate than omega (omega requires fitting a factor model first), using alpha can also be justified in that case. However, when the number of items and the inter-item correlations were small, omega performed worse than alpha, so alpha should be used for reliability estimation under those conditions.
- Published
- 2023
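For readers unfamiliar with the two coefficients compared above: omega requires a fitted factor model, whereas alpha has a closed form, alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). A minimal library-free sketch (the toy score matrix is hypothetical):

```python
def cronbach_alpha(items):
    """Cronbach's alpha from a respondents-by-items score matrix, using
    population (n-denominator) variances throughout."""
    k = len(items[0])

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[j] for row in items]) for j in range(k)]
    total_var = var([sum(row) for row in items])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Perfectly parallel items -> alpha is approximately 1.
scores = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print(cronbach_alpha(scores))
```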
45. Type I Error and Power Rates: A Comparative Analysis of Techniques in Differential Item Functioning
- Author
-
Ayse Bilicioglu Gunes and Bayram Bicak
- Abstract
The main purpose of this study was to examine the Type I error and statistical power rates of Differential Item Functioning (DIF) techniques based on different theories under different conditions. A simulation study was conducted using the Mantel-Haenszel (MH), Logistic Regression (LR), Lord's chi-squared, and Raju's Areas Measures techniques. In the simulation design, the two-parameter item response model, the groups' ability distributions, and the DIF type were fixed conditions, while sample size (1800, 3000), ratio of sample sizes (0.50, 1), test length (20, 80), and rate of DIF-containing items (0, 0.05, 0.10) were manipulated, giving 24 conditions in total (2x2x2x3); statistical analysis was performed in R. Type I error rates in all conditions were higher than the nominal error level: MH had the highest error rate, Raju's Areas Measures the lowest, and MH also produced the highest statistical power rates. Techniques based on both theories performed better at the 1800 sample size, and increasing the sample size affected techniques based on CTT more than those based on IRT. Finally, the techniques' Type I error rates were lower, and their statistical power rates higher, under conditions where the test length was 80 and the sample sizes were unequal.
- Published
- 2023
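Of the four DIF techniques compared above, Mantel-Haenszel is the simplest to illustrate. The sketch below computes the MH common odds ratio from score-matched 2x2 tables; the counts are invented for illustration and are not the study's simulated data.

```python
def mantel_haenszel_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio across score strata.
    Each stratum is (A, B, C, D): reference-group correct/incorrect,
    focal-group correct/incorrect.  OR_MH near 1 suggests no uniform
    DIF; values away from 1 favor one group at matched ability."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den

# Hypothetical strata matched on total score: the focal group answers
# the item correctly less often at every score level, so OR_MH > 1.
strata = [(30, 10, 20, 20), (50, 10, 35, 25), (70, 5, 60, 15)]
print(round(mantel_haenszel_odds_ratio(strata), 3))
```

In practice the MH chi-squared statistic, not the raw odds ratio, is what gets compared to a critical value.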
46. Alternative Methods for Item Parameter Estimation: From CTT to IRT. Research Report. ETS RR-22-12
- Author
-
Guo, Hongwen, Lu, Ru, Johnson, Matthew S., and McCaffrey, Dan F.
- Abstract
It is desirable for an educational assessment to be constructed of items that can differentiate different performance levels of test takers, and thus it is important to estimate accurately the item discrimination parameters in either classical test theory or item response theory. It is particularly challenging to do so when the sample sizes are small. The current study reexamined the relationship between the biserial correlation coefficient and the discrimination parameter to investigate whether the biserial correlation coefficient estimator could be modified and whether biserial-based estimators could be used as alternate estimates of the item discrimination indices. Results show that the modified and alternative approaches work slightly better under certain circumstances (e.g., for small sample sizes or shorter tests), assuming normality of the latent ability distribution. Applications of these alternative estimators are presented in item scaling and weighted differential item functioning analyses. Recommendations and limitations are discussed for practical use of these proposed methods.
- Published
- 2022
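The classical relationship the report revisits can be sketched directly. Under the usual normality assumption, the biserial correlation is recovered from the point-biserial via r_bis = r_pb * sqrt(p(1-p)) / phi(Phi^-1(p)), and the normal-ogive discrimination is a = r_bis / sqrt(1 - r_bis^2). This is the textbook conversion only, not the modified estimator the report proposes, and the example values are hypothetical.

```python
import math
from statistics import NormalDist, mean, pstdev

def point_biserial(binary, scores):
    """Point-biserial correlation between a 0/1 item and a total score."""
    p = sum(binary) / len(binary)
    m1 = mean(s for b, s in zip(binary, scores) if b == 1)
    m0 = mean(s for b, s in zip(binary, scores) if b == 0)
    return (m1 - m0) / pstdev(scores) * math.sqrt(p * (1 - p))

def biserial_from_point_biserial(r_pb, p):
    """Classical conversion assuming the dichotomy cuts an underlying
    normal variable at the p-th quantile (phi = normal density,
    Phi^-1 = normal quantile function)."""
    z = NormalDist().inv_cdf(p)
    return r_pb * math.sqrt(p * (1 - p)) / NormalDist().pdf(z)

def discrimination_from_biserial(r_bis):
    """Normal-ogive item discrimination a = r_bis / sqrt(1 - r_bis^2)."""
    return r_bis / math.sqrt(1 - r_bis ** 2)

# Hypothetical item with 50% correct and a point-biserial of 0.40:
r_bis = biserial_from_point_biserial(0.40, 0.50)
a = discrimination_from_biserial(r_bis)
print(round(r_bis, 3), round(a, 3))
```

The instability of this chain for small samples (where r_pb itself is noisy) is exactly what motivates the modified estimators the report studies.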
47. Meta-Analyses of Partial Correlations Are Biased: Detection and Solutions
- Author
-
T. D. Stanley, Hristos Doucouliagos, and Tomas Havranek
- Abstract
We demonstrate that all meta-analyses of partial correlations are biased, and yet hundreds of meta-analyses of partial correlation coefficients (PCCs) are conducted each year widely across economics, business, education, psychology, and medical research. To address these biases, we offer a new weighted average, UWLS[subscript +3]. UWLS[subscript +3] is the unrestricted weighted least squares weighted average that makes an adjustment to the degrees of freedom that are used to calculate partial correlations and, by doing so, renders trivial any remaining meta-analysis bias. Our simulations also reveal that these meta-analysis biases are small-sample biases (n < 200), and a simple correction factor of (n - 2)/(n - 1) greatly reduces these small-sample biases along with Fisher's z. In many applications where primary studies typically have hundreds or more observations, partial correlations can be meta-analyzed in standard ways with only negligible bias. However, in other fields in the social and the medical sciences that are dominated by small samples, these meta-analysis biases are easily avoidable by our proposed methods.
- Published
- 2024
- Full Text
- View/download PDF
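A partial correlation is typically recovered from a reported regression t-statistic as r = t / sqrt(t^2 + df). The sketch below applies the (n - 2)/(n - 1) factor mentioned in the abstract to such a PCC; the exact placement of the correction inside the authors' UWLS+3 procedure is not specified here, so this is only a schematic reading of the abstract, with hypothetical t, n, and df values.

```python
import math

def pcc_from_t(t, df):
    """Partial correlation coefficient recovered from a regression
    t-statistic and its degrees of freedom: r = t / sqrt(t^2 + df)."""
    return t / math.sqrt(t * t + df)

def small_sample_correction(r, n):
    """The simple (n - 2)/(n - 1) factor described in the abstract,
    applied to a partial correlation from a primary study of size n.
    For n >= 200 the factor is essentially 1, matching the abstract's
    point that the bias is a small-sample phenomenon."""
    return r * (n - 2) / (n - 1)

# A hypothetical small-sample primary study: t = 2.5, n = 30, df = 27.
r = pcc_from_t(2.5, 27)
r_adj = small_sample_correction(r, 30)
print(round(r, 3), round(r_adj, 3))
```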
48. Statistical Power Analysis and Sample Size Planning for Moderated Mediation Models
- Author
-
Ziqian Xu, Fei Gao, Anqi Fa, Wen Qu, and Zhiyong Zhang
- Abstract
Conditional process models, including moderated mediation models and mediated moderation models, are widely used in behavioral science research. However, few studies have examined approaches to conduct statistical power analysis for such models and there is also a lack of software packages that provide such power analysis functionalities. In this paper, we introduce new simulation-based methods for power analysis of conditional process models with a focus on moderated mediation models. These simulation-based methods provide intuitive ways for sample-size planning based on regression coefficients in a moderated mediation model as well as selected variance and covariance components. We demonstrate how the methods can be applied to five commonly used moderated mediation models using a simulation study, and we also assess the performance of the methods through the five models. We implement our approaches in the WebPower R package and also in Web apps to ease their application. [This is the online version of an article published in "Behavior Research Methods."]
- Published
- 2024
- Full Text
- View/download PDF
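The simulation-based logic described above can be illustrated for the simplest case: an unmoderated mediation model tested by joint significance of the two paths. This is a deliberately reduced sketch, not the WebPower implementation; the Sobel and bootstrap tests and all moderation terms are omitted, the direct effect is fixed at zero so both paths reduce to simple regressions, and the effect sizes and n are hypothetical.

```python
import math
import random

random.seed(7)

def slope_and_se(x, y):
    """Simple-regression slope of y on x and its standard error."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    beta = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    resid = [yi - my - beta * (xi - mx) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - 2)
    return beta, math.sqrt(s2 / sxx)

def mediation_power(n, a, b, reps=300, z_crit=1.96):
    """Monte Carlo power of the joint-significance test for the
    indirect effect a*b in X -> M -> Y: simulate data, refit both
    paths, and count how often both are significant."""
    hits = 0
    for _ in range(reps):
        x = [random.gauss(0, 1) for _ in range(n)]
        m = [a * xi + random.gauss(0, 1) for xi in x]
        y = [b * mi + random.gauss(0, 1) for mi in m]
        a_hat, se_a = slope_and_se(x, m)
        b_hat, se_b = slope_and_se(m, y)
        if abs(a_hat / se_a) > z_crit and abs(b_hat / se_b) > z_crit:
            hits += 1
    return hits / reps

power = mediation_power(n=100, a=0.3, b=0.3)
print(power)
```

Sample-size planning then amounts to rerunning the simulation over a grid of n until the estimated power crosses the target (e.g., 0.80).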
49. Does Entrepreneurial Training Change Minds? A Case Study among Southeast Asian Business Students
- Author
-
Thi Minh Hang Le, Ha Hoang, and Son-Tung Nguyen
- Abstract
This study aimed to determine the relationship between entrepreneurial training and intention among Southeast Asian students who are influenced by Confucianism. The conceptual model was tested with a sample size of 281 students enrolled in a business administration program. The most significant findings from the study were: (i) students' entrepreneurial intention increased significantly; (ii) all research factors had a positive influence on entrepreneurial intention; (iii) students had lower confidence in their business abilities and capabilities; (iv) students' perceptions of peer and family support improved; and (v) there was no relationship between family characteristics (i.e. parents' occupation and students' place of residence) and entrepreneurial intention. These results provide a foundation for developing entrepreneurship training programs that promote students' entrepreneurial intention in Southeast Asian countries.
- Published
- 2024
- Full Text
- View/download PDF
50. An Experimental Study of Supervised Machine Learning Techniques for Minor Class Prediction Utilizing Kernel Density Estimation: Factors Impacting Model Performance
- Author
-
Abdullah Mana Alfarwan
- Abstract
This dissertation examined classification outcome differences among four popular individual supervised machine learning (ISML) models (logistic regression, decision tree, support vector machine, and multilayer perceptron) when predicting minor class membership within imbalanced datasets. The study context and the theoretical population sampled focus on one aspect of the larger problem of student retention and dropout prediction in higher education (HE): identifying which students are at risk. This study differs from the current literature by implementing an experimental design approach with simulated student data that closely mirrors HE situational and student data. Specifically, this study tested the predictive ability of the four ISML classification models under experimentally manipulated conditions: total sample size (TS), minor class proportion (MCP), training-to-testing sample size ratio (TTSS), and the application of bagging techniques during model training (BAG). Using this 4-between, 1-within mixed design, five outcome measures (precision, recall/sensitivity, specificity, F1-score, and AUC) were examined and analyzed individually. For each outcome measure, findings revealed multiple statistically significant interactions among classifier models and design variables. Simple effect analyses of these interactions highlighted how TS, MCP, TTSS, and BAG differentially affect the classification performance measures. For instance, the presence of interactions involving MCP underscores the importance of informed modeling of class distribution for enhancing overall model predictive capability and performance. Such insights into how the experimental variables can critically affect different measures of classification success advance our understanding of how these four ISML models might be optimized for predicting student-at-risk status within imbalanced datasets.
This dissertation provides a framework for using these or similar ISML models more effectively in HE. It points toward the development of predictive modeling methods that are more useful and perhaps equitable by demonstrating empirically the impact of one of the most challenging aspects of implementing machine learning in HE: maximizing the accurate identification of the minority class. This work contributes to the use of machine learning in HE and will help inform its use in smaller and larger educational research communities by providing strategies for improving the prediction of student dropout. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone (800) 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
- Published
- 2024
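Four of the five outcome measures analyzed in the dissertation (all but AUC) derive directly from the confusion-matrix cells. A small self-contained sketch with an invented 10% minor class shows why high accuracy on the majority can mask poor minor-class precision and recall:

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Precision, recall/sensitivity, specificity, and F1 for the
    minor (positive) class, from the confusion-matrix cells."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1}

# Invented 10% minor class: 90% overall accuracy in the majority,
# yet minor-class precision is only 0.40 and recall 0.60.
y_true = [1] * 10 + [0] * 90
y_pred = [1] * 6 + [0] * 4 + [1] * 9 + [0] * 81
m = classification_metrics(y_true, y_pred)
print(m)
```

Reporting all four (plus AUC) rather than accuracy alone is what makes the dissertation's per-measure interaction analyses possible.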