Descriptor: "Multivariate Analysis" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Multivariate Analysis"' showing total 545,973 results

1. Detection of Differential Item Functioning with Latent Class Analysis: PISA 2018 Mathematical Literacy Test

Author: Selim Dasçioglu and Tuncay Ögretmen
Abstract: The purpose of this research is to determine whether PISA 2018 mathematical literacy test items show a differential item functioning across countries. For this purpose, only the items in booklet number three were examined using the MIMIC method with Latent Class Analysis (LCA) approach. PISA 2018 tests are mostly developed in English. Therefore, in DIF analyses, the reference group is the UK, while the focal groups consist of the other countries examined in the research (Türkiye, Finland, Japan, and the USA). According to the results, of the 23 test items, statistically significant DIF was observed in eight items in the UK-Türkiye sample, in seven items in the UK-Finland sample, in eleven items in the UK-Japan sample, and in three items in the UK-USA sample. It is seen that the effect and size of DIF in non-homogeneous groups differ between groups and these effects can be examined in more detail with the LCA method.
Published: 2024

2. Student Self-Regulated Learning in Teacher Professional Vision: Results from Combining Student Self-Reports, Teacher Ratings, and Mobile Eye Tracking in the High School Classroom

Author: Kateryna Horlenko, Lina Kaminskiene, and Erno Lehtinen
Abstract: Teacher professional vision as a concept is gaining importance in research on teaching, and recently models for studying teacher professional vision and student self-regulated learning (SRL) have been proposed. There are interview and video intervention studies investigating teacher professional vision for SRL, but no real-life classroom research so far. This study investigated the role of student SRL behaviour, as it was reported by students themselves and teachers, in teacher attention distribution as part of teacher professional vision. Ten teachers and their 158 students at high school level in Lithuania took part in the research. The first step of the study resulted in identifying four student SRL-profiles, which differed based on student level of SRL and the extent to which teacher and student assessments coincided: mixed lower-regulated, mixed higher-regulated, systematic lower-regulated, systematic higher-regulated. The profiles demonstrated only a partial overlap in teacher and student judgement of student SRL. The second step of the study explored whether scores of students' SRL from student and teacher reports were related to teachers' distribution of visual attention in one lesson. The results showed that only one teacher rating scale of student information-seeking behaviour had a slight correlation with teacher attention. The results imply rather bottom-up trends in teacher attention to students in the classroom when it comes to SRL. Besides, the study results highlight the not directly observable nature of SRL processes and imply a difficulty for teachers to assess student SRL.
Published: 2024

3. A Proposed Categorization of Meta-Analysis, Their Respective Example Conceptual Frameworks, and Applicable R Packages for Education Research: A Review

Author: Gamon Savatsomboon, Phamornpun Yurayat, Ong-art Chanprasitchai, Warawut Narkbunnum, Jibon Kumar Sharma, and Surapol Svetsomboon
Abstract: The paper has three major objectives. The first objective of the paper is to synthesize and define common categories of meta-analysis. The second objective is to propose a way to comprehend these common categories of meta-analysis through learning from their respective generic conceptual frameworks. The third objective is to point out which R packages to use to conduct a meta-analysis for each category presented in this paper. In practice, research novices may not know where to begin to learn a topic of meta-analysis for their research. Furthermore, they may not know how the topic of meta-analysis progresses from basics to advanced. This makes it almost next to impossible to use meta-analysis in their research. Certainly, this is problematic. Thus, this paper intends to help research novices overcome this problem. To achieve this, the three objectives mentioned at the beginning were implemented. To propose a synthesis of common categories of meta-analysis along with their respective definitions, a literature review was used as a method to synthesize and define the common categories of meta-analysis presented in this paper. Here, synthesizing meta-analysis categories in one figure would make it much easier for research novices to see the breadth of the subject in question. To propose an easy way to understand the concepts of meta-analysis through learning from their respective generic conceptual frameworks, personal experience was used as a method to make that offering. Here, it is encouraged that research novices should go beyond the definitions of categories of meta-analysis. They can try to frame their research by using example conceptual frameworks presented in the paper as a starting point. To point out which R packages to use with what categories of meta-analysis, a literature review was used as a method to identify those appropriate R packages available. 
Here, it would be useful for research novices to know which R packages they may need for their meta-analysis research. In summary, the paper makes three contributions: it helps research novices see and understand the common categories of meta-analysis from the bottom to the top of the pyramid, it offers an easy way to understand those categories by illustrating how to develop conceptual frameworks for them, and it recommends the R packages to be used with each category. Ultimately, we hope that research novices will be able to use meta-analysis successfully in their research.
Published: 2024

4. Communicational Model: Typology of Academic Writing Assessment in Spanish in Higher Education

Author: Benito Ilich Suárez-Bedolla, Francisco Cervantes-Pérez, and Beatriz Feijoó-Fernández
Abstract: A common diagnosis in the literature is that the writing of Spanish speakers is generally a structural problem. The writing of 81 university students was analysed by classifying the teacher's comments based on 11 variables that were recorded once during a continuous evaluation that supported the assessment. The techniques used were Content Analysis, Cluster Analysis and Self-Study. Four Clusters were identified in which the dominant conception of communication is broken down, while one Cluster was identified with the alternative conception of communication. This expresses compliance with the task instructions, while the remaining Clusters show progressive non-compliance in various measures. The five Clusters form a typology and, with the theoretical assumption of the identification of the respective conceptions, correspond to a situated communication model. It could be applied in the planning of teaching workloads, as well as in the semi-automatic checking of written productions through the implementation of AI solutions.
Published: 2024

5. Using R for Multivariate Meta-Analysis on Educational Psychology Data: A Method Study

Author: Gamon Savatsomboon, Prasert Ruannakarn, Phamornpun Yurayat, Ong-art Chanprasitchai, and Jibon Kumar Sharma Leihaothabam
Abstract: Using R to conduct univariate meta-analyses is becoming common for publication. R can also conduct multivariate meta-analysis (MMA). However, newcomers to both R and MMA may find using R to conduct MMA daunting: R may not be easy for those unfamiliar with coding, and MMA is a topic of advanced statistics. Thus, it may be very challenging for most newcomers to conduct MMA using R. If this holds, this can be viewed as a practice gap. In other words, the practice gap is that researchers are not capable of using R to conduct MMA in practice. This is problematic. This paper alleviates this practice gap by illustrating how to use R (the metaSEM package) to conduct MMA on educational psychology data. Here, the metaSEM package is used to obtain the required MMA text outputs. However, the metaSEM package is not capable of producing the other required graphical outputs. As a result, the metafor package is also used as a complement to generate the required graphical outputs. Ultimately, we hope that our audience will be able to apply what they learn from this method paper to conduct MMA using R in their teaching, research, and publication.
Published: 2024

6. Sample Size Calculation and Optimal Design for Multivariate Regression-Based Norming

Author: Francesco Innocenti, Math J. J. M. Candel, Frans E. S. Tan, and Gerard J. P. van Breukelen
Abstract: Normative studies are needed to obtain norms for comparing individuals with the reference population on relevant clinical or educational measures. Norms can be obtained in an efficient way by regressing the test score on relevant predictors, such as age and sex. When several measures are normed with the same sample, a multivariate regression-based approach must be adopted for at least two reasons: (1) to take into account the correlations between the measures of the same subject, in order to test certain scientific hypotheses and to reduce misclassification of subjects in clinical practice, and (2) to reduce the number of significance tests involved in selecting predictors for the purpose of norming, thus preventing the inflation of the type I error rate. A new multivariate regression-based approach is proposed that combines all measures for an individual through the Mahalanobis distance, thus providing an indicator of the individual's overall performance. Furthermore, optimal designs for the normative study are derived under five multivariate polynomial regression models, assuming multivariate normality and homoscedasticity of the residuals, and efficient robust designs are presented in case of uncertainty about the correct model for the analysis of the normative sample. Sample size calculation formulas are provided for the new Mahalanobis distance-based approach. The results are illustrated with data from the Maastricht Aging Study (MAAS).
Published: 2024
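
The Mahalanobis-distance idea in the abstract above can be sketched roughly as follows. This is not the authors' procedure: the function name `mahalanobis_scores`, the simulated toy data, and the use of ordinary least squares with NumPy are all assumptions made for illustration. The sketch regresses several scores on the same predictors and then combines each individual's residuals into one squared distance.

```python
import numpy as np

def mahalanobis_scores(X, Y):
    """Squared Mahalanobis distance of each individual's residual vector.

    X: (n, p) predictors (e.g. age, sex); an intercept column is added here.
    Y: (n, k) matrix of k measures normed with the same sample.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]
    design = np.column_stack([np.ones(n), X])
    # One least-squares fit handles all k outcome columns at once.
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    resid = Y - design @ coef
    dof = n - design.shape[1]
    cov = resid.T @ resid / dof          # residual covariance estimate
    inv_cov = np.linalg.inv(cov)
    # Row-wise quadratic form r_i' S^{-1} r_i.
    return np.einsum("ij,jk,ik->i", resid, inv_cov, resid)

# Hypothetical toy data: 50 people, 2 predictors, 3 correlated measures.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
B = np.array([[1.0, 0.5, -0.3], [0.2, -0.4, 0.8]])
Y = X @ B + rng.normal(size=(50, 3))
d2 = mahalanobis_scores(X, Y)
```

A large value in `d2` flags an individual whose overall profile is unusual given the predictors, even when no single measure is extreme. Under the multivariate normality assumed in the abstract, such distances would typically be compared against a chi-square or F reference distribution; that step is omitted here.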

7. Design and Analysis of Cluster Randomized Trials

Author: Wei Li, Yanli Xie, Dung Pham, Nianbo Dong, Jessaca Spybrook, and Benjamin Kelcey
Abstract: Cluster randomized trials (CRTs) are commonly used to evaluate the causal effects of educational interventions, where the entire clusters (e.g., schools) are randomly assigned to treatment or control conditions. This study introduces statistical methods for designing and analyzing two-level (e.g., students nested within schools) and three-level (e.g., students nested within classrooms nested within schools) CRTs. Specifically, we utilize hierarchical linear models (HLMs) to account for the dependency of the intervention participants within the same clusters, estimating the average treatment effects (ATEs) of educational interventions and other effects of interest (e.g., moderator and mediator effects). We demonstrate methods and tools for sample size planning and statistical power analysis. Additionally, we discuss common challenges and potential solutions in the design and analysis phases, including the effects of omitting one level of clustering, non-compliance, heterogeneous variance, blocking, threats to external validity, and cost-effectiveness of the intervention. We conclude with some practical suggestions for CRT design and analysis, along with recommendations for further readings.
Published: 2024
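
For the sample size planning mentioned in the abstract above, a minimal sketch of a power calculation for a two-level CRT is given below. It is not the authors' tool. It assumes equal cluster sizes, no covariates, and a normal approximation to the t-based multiplier, and the function names `mdes_two_level` and `design_effect` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def design_effect(cluster_size, icc):
    """Variance inflation from clustering: 1 + (n - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc

def mdes_two_level(n_clusters, cluster_size, icc,
                   p_treated=0.5, alpha=0.05, power=0.80):
    """Approximate minimum detectable effect size (in SD units).

    Assumes a two-tailed test, balanced clusters, and no covariates.
    The multiplier uses normal quantiles, so it is slightly optimistic
    when the number of clusters is small.
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    denom = p_treated * (1 - p_treated) * n_clusters
    # Between-cluster part plus within-cluster part of the variance
    # of the standardized treatment effect estimate.
    variance = icc / denom + (1 - icc) / (denom * cluster_size)
    return multiplier * sqrt(variance)

# Hypothetical design: 40 schools, 20 students each, ICC = 0.20.
mdes = mdes_two_level(n_clusters=40, cluster_size=20, icc=0.20)
deff = design_effect(cluster_size=20, icc=0.20)
```

In this formula the number of clusters divides both variance terms while the cluster size divides only the within-cluster term, so adding schools usually reduces the detectable effect more than adding students per school.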

8. Theoretical Constructs that Explain and Enhance Learning: A Longitudinal Examination

Author: Phan, Huy P.
Abstract: One important line of inquiry in educational psychology involves the study of change of individuals' cognitive-motivational processes. The conjunctive use of longitudinal data with latent growth curve modeling procedures has, for example, allowed researchers to identify initial levels and to trace trajectories of theoretical variables such as self-efficacy over time. The study reported in this article proposed a conceptual model that depicted relations between a deep-learning approach, mastery goals and self-efficacy over time. A final sample of 195 second-year university students (100 females, 95 males) took part in this three-wave panel study. We used various inventories to test the initial states and rates of change of the three aforementioned constructs. As an a posteriori analysis, we included prior academic achievement as a possible predictor of change. The results ascertained from our analyses indicate an increase in growth of a deep-learning approach, mastery goals and self-efficacy across the two-year period. Importantly, a posteriori results accentuated the role of prior academic achievement as a predictor of the initial level of personal self-efficacy.
Published: 2024

9. Vocational Identity Statuses among Hong Kong Sub-Degree Students: Pattern Identification and Relationship to Career Development and Academic Performance

Author: Raysen Cheung, Qiuping Jin, Ka Kit Yeung, Hin Long Lau, and Wing Hong Chui
Abstract: The process model of vocational identity was well applied in various Western countries to study the vocational identity process and statuses of college students. However, such research is limited in Hong Kong. Moreover, the relation between vocational identity development and academic performance was inconclusive in the literature, and it was also not tested among Hong Kong students. In light of these, the current study aimed to empirically identify and validate vocational identity statuses among a sample of 576 sub-degree students in Hong Kong using the vocational identity process model. Relations of vocational identity processes and statuses with perceived academic performance were also tested. Six vocational identity statuses were empirically derived in the Hong Kong Chinese context. Vocational identity statuses also differentiate perceived academic performance. Moreover, we found that career flexibility and self-doubt were significantly related to perceived academic performance. Implications of the results for theory and practice are discussed.
Published: 2024

10. Evaluating the Classroom Environment: Multilevel Validation and Measurement Invariance of Classroom Behavioral Climate

Author: Demos Michael, Nikolaos Tsigilis, Victoria Michaelidou, Athanasios Gregoriadis, Vicky Charalambous, and Charalambos Vrasidas
Abstract: The present study contributes to the development of effective measures to evaluate classroom climate, especially in elementary education where these remain limited. In addition, it addresses a typical "flaw" of several studies by approaching classroom climate as a group-level construct rather than an individual characteristic. Following a multilevel approach, we attempted to establish validity evidence for an adapted version of the Classroom Behavioral Climate (CBC) questionnaire and examine its equivalence across two contexts. The participants included 2963 students from 176 fourth and fifth grade classes. Multilevel Confirmatory Factor Analyses resulted in a shorter version of the CBC with three correlated factors in both countries. Two-level models performed better than the single-level models. Configural, metric, and partial scalar invariance was also achieved. The study highlights the need for researchers and practitioners to use appropriate methods and instruments to evaluate the classroom environment based on their research aims and intended use.
Published: 2024

11. Using Summary Tables to Introduce Principal Component Analysis in an Elementary Data Science Course

Author: Jon-Paul Paolino
Abstract: This article presents a novel approach to introducing principal component analysis (PCA), using summary tables and descriptive statistics. Given its applicability across a variety of academic disciplines, this topic offers abundant opportunity for class discussion and activities. However, teaching PCA in an introductory class can be challenging due to the potential abstraction of multivariate datasets, and especially when students have a minimal background in statistics or data science. This method aims to help teachers bridge the gap between basic descriptive statistics and the more advanced concepts of PCA; this is done by disregarding mathematical optimization, while emphasizing the use of summary tables and the programming language R. The focus is on implementing this method in an introductory tertiary data science course; however, it may potentially be used in higher level courses, and across a variety of disciplines.
Published: 2024

12. Investigating the Interplay between Morphosyntax and Event Comprehension from the Perspective of Intersecting Object Histories

Author: Yanina Prystauka, Emma Wing, and Gerry T. M. Altmann
Abstract: In a series of sentence-picture verification studies we contrasted, for example, "… choose the balloon" with "… inflate the balloon" and "… the inflated balloon" to examine the degree to which different representational components of event representation (specifically, the different object states entailed by the inflating event; minimally, the balloon in its uninflated and inflated states) are jointly activated after state-change verbs and past participles derived from them. Experiments 1 and 2 showed that the initial and end states are both activated after state-change verbs, but that the initial state is considerably less accessible after participles. Experiment 3 showed that intensifier adverbs (e.g., completely) before both state-change verbs and participles further modulate the accessibility of the initial state. And in Experiment 4, we ruled out the possibility that the initial state is accessible only because of the semantic overlap. We conclude that although state-change verbs activate representations of both the initial and end states of their event participants, their accessibility is graded, modulated by the morphosyntactic devices used to describe the event.
Published: 2024

13. Analysis of Regression Discontinuity Designs with a Binary Moderating Variable

Author: Jason Schoeneberger and Christopher Rhoads
Abstract: Regression discontinuity (RD) designs are increasingly used for causal evaluations. For example, if a student's need for a literacy intervention is determined by a low score on a past performance indicator and that intervention is provided to all students who fall below a cutoff on that indicator, an RD study can determine the intervention's main effects on subsequent outcomes with strong causal validity. However, the literature contains little guidance for conducting a moderation analysis within an RD context. Such moderation analyses are crucial in schools with very diverse and needy student populations because it may be rare to find a program that is equally effective with each of these subpopulations. In diverse schools, evaluators are interested in understanding not just whether a program generally works, but also in learning for whom it doesn't work. The current article focuses on moderation with a single binary variable. A simulation study compares: (1) different bandwidth selectors and (2) local polynomial regressions with interactions to local regressions on subsets of the data defined by values of the moderating variable. We find that existing bandwidth selectors optimized for main effects will choose bandwidths that are too small for moderation analysis. Additionally, choosing an optimal bandwidth for the subset regression approach may not be feasible for small to moderate sample sizes unless the moderator prevalence is near 0.5 and correlation with the assignment variable is small. We conclude that when sample sizes are small a global regression approach is likely to be preferred to utilizing bandwidth selectors optimized for main effects. 
The results from the simulation study are then followed by an illustrative practical example of applying these approaches in detecting important moderation effects in an evaluation study of the Accelerating Literacy For Adolescents Laboratory (ALFA LAB), a semester-long "extra help elective" class given to 9th graders entering high school with very low reading achievement. [This is the in-press version of an article published in "American Journal of Evaluation."]
Published: 2024
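
The "local regression with interactions" approach described in the abstract above might look roughly like the sketch below. It is an illustration only, not the authors' code. It assumes a uniform kernel, a hand-picked bandwidth (no bandwidth selector), treatment assigned to scores below the cutoff, and noiseless toy data. The function name `rd_moderation` is hypothetical.

```python
import numpy as np

def rd_moderation(score, outcome, moderator, cutoff, bandwidth):
    """Local linear RD fit with a binary moderator (uniform kernel).

    Units scoring below the cutoff are treated. Returns the estimated
    jump at the cutoff for moderator = 0, for moderator = 1, and their
    difference (the moderation effect).
    """
    score = np.asarray(score, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    moderator = np.asarray(moderator, dtype=float)
    keep = np.abs(score - cutoff) <= bandwidth
    x = score[keep] - cutoff                 # centred assignment variable
    t = (score[keep] < cutoff).astype(float)
    m = moderator[keep]
    # Fully interacted model: separate slopes on each side, per subgroup.
    design = np.column_stack(
        [np.ones_like(x), t, m, t * m, x, t * x, m * x, t * m * x]
    )
    coef, *_ = np.linalg.lstsq(design, outcome[keep], rcond=None)
    return {
        "effect_m0": coef[1],
        "effect_m1": coef[1] + coef[3],
        "moderation": coef[3],
    }

# Hypothetical noiseless data with a known jump of 2.0 (moderator = 0)
# and an extra 1.5 for the moderator = 1 subgroup.
score = np.linspace(-1.0, 1.0, 200)
moderator = np.arange(200) % 2
treated = (score < 0.0).astype(float)
outcome = (1.0 + 0.5 * score + 0.3 * moderator
           + 2.0 * treated + 1.5 * treated * moderator)
est = rd_moderation(score, outcome, moderator, cutoff=0.0, bandwidth=0.5)
```

With real, noisy data the choice of `bandwidth` drives the bias-variance trade-off, which is the issue the abstract examines. A fixed value such as the 0.5 used here would not normally be defensible without a sensitivity check.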

14. Organizational Unlearning: A Bibliometric Study and Visualization Analysis via CiteSpace

Author: Jiang Chen, Zobo Ongono Emilienne Charlotte, and Yana Yuan
Abstract: Coping with evolution and the changes it brings to the workplace remains a major concern for organizational leaders. This study explores the hotspots, trends, and future directions of the field of organizational unlearning to complement the extant research. A bibliometric analysis based on the literature collected by the Web of Science database was used to categorize or cluster different authors, their countries, institutions and different keywords (cooperation among authors, co-citation, co-occurrence of keywords), to discover their uniqueness or determine the relationship between them, while CiteSpace software was used to draw knowledge graphs and present the results. This study advances the debate on sustainable knowledge acquisition in organizations and its interaction with organizational unlearning. It directly aids the process of radical change in workplace learning and training models and provides a clear view of the previous literature on organizational unlearning by laying a solid foundation for future research in the field of learning.
Published: 2024

15. Clustering-Based Knowledge Graphs and Entity-Relation Representation Improves the Detection of at Risk Students

Author: Balqis Albreiki, Tetiana Habuza, Nishi Palakkal, and Nazar Zaki
Abstract: The nature of education has been transformed by technological advances and online learning platforms, providing educational institutions with more options than ever to thrive in a complex and competitive environment. However, they still face challenges such as academic underachievement, graduation delays, and student dropouts. Fortunately, by harnessing student data from institution databases and online platforms, it becomes possible to predict the academic performance of individual students at an early stage. In this study, we utilized knowledge graphs (KG), clustering, and machine learning (ML) techniques on data related to students in the College of Information Technology at UAEU. To construct knowledge graphs and visualize students' performance at various checkpoints, we employed Neo4j, a high-performance NoSQL graph database. The findings demonstrate that incorporating clustered knowledge graphs with machine learning reduces predictive errors, enhances classification accuracy, and effectively identifies students at risk of course failure. Additionally, the utilization of visualization methods facilitates communication and decision-making within educational institutions. The combination of KGs and ML empowers course instructors to rank students and provide personalized learning interventions based on individual performance and capabilities, allowing them to develop tailored remedial actions for at-risk students according to their unique profiles.
Published: 2024

16. Development and Validation of a Short-Form Inventory to Identify Personality Types: The Personality Identity Estimator (PIE)

Author: Conti, Gary J.
Abstract: The use of personality inventories has been limited because of their cost and length. To overcome these limitations, this study created the Personality Identity Estimator (PIE), an easy-to-use inventory to estimate personality types that can be used at no cost. PIE is a categorical inventory containing 12 items with 3 items for each of the 4 personality-type dimensions in Jungian theory. A sample of 1,104 was used to create PIE. Validity was established through multivariate analyses using data from the administration of the Myers-Briggs Type Indicator to 553 of the sample. Reliability was established by test-retest. Consequently, PIE is a new short-form instrument to estimate personality type following Jung's concept of personality types. PIE can be completed quickly, is easy to score and interpret, and can be used for self-assessment of personality types. PIE can be used to address individual differences, to increase self-awareness, and as an interactive instructional tool. Thus, PIE is a new valid and reliable tool that can be used in both instructional and research settings confidently and at no charge. Permission is granted to use the Personality Identity Estimator in practice and research. A printable copy of PIE is appended.
Published: 2023

17. Factors Influencing Science and Environmental Education Learning of Blind Students: A Case of Primary School for the Blind in Thailand

Author: Prasertpong, Phanuwat, Charmondusit, Kitikorn, Taecharungroj, Viriya, Rawang, Wee, Suwan, Sumit, and Woraphong, Seree
Abstract: The main objective of this research was to study the factors influencing the science and environmental education program for blind students at the elementary level. This research used mixed methods (quantitative and qualitative approaches): a questionnaire survey was conducted to better understand the current situation of science and environmental education learning among 192 blind students in 14 primary schools for the blind in Thailand, together with in-depth interviews with directors, science and environmental education teachers, and students' parents (30 interviewees in total). The data were analyzed as frequencies and percentages, and with K-means clustering, using SPSS software. The findings from the 192 blind students illustrated the importance of facilitators who support teaching materials appropriate to science and environmental education for blind students (43%); 34% of students had a good relationship with their classmates and teachers, and 32% were provided with more Braille textbooks for science and environmental education. K-means cluster analysis showed four clusters of blind students learning science and environmental education. The study concluded that the factors influencing the effectiveness of science and environmental education learning of blind students consisted of facilitators, the creation of scientific learning processes, media and technology as the medium for communicating scientific knowledge, an appropriate curriculum for blind students, and the cognitive abilities of blind students in physical, mental, intellectual, and emotional terms.
Published: 2023

18. Onliners versus On-Grounders in Computer and Information Systems Courses in Higher Education: A Two-Step Cluster Analysis

Author: Alan Peslak, Lisa Kovalchick, Wenli Wang, and Paul Kovacs
Abstract: Are students who prefer online education different from those who prefer on-ground education, and how? This is an important question because educational institutions need to better understand student segmentations. This research examined 251 survey responses from students enrolled in Computer Information Systems courses at three universities over five years (2016-2021) and reviewed student attitudes, perceived skills, and their sociological characteristics. Through two-step cluster analysis unsupervised machine learning, two distinct clusters of students emerged, namely Onliners and On-grounders. The top nine out of the eleven group characteristics for Onliners are: select more online courses, regard online instruction as effective, work better without supervision, rely less on classroom interactions in learning, value convenience, can prioritize, are better organized, better prepared, and older. By understanding these group characteristics, educational institutions can make better decisions in policy making, resources allocation, and student recruitment/retention.
Published: 2023

19. Dimension-Grouped Mixed Membership Models for Multivariate Categorical Data

Author: Yuqi Gu, Elena A. Erosheva, Gongjun Xu, and David B. Dunson
Abstract: Mixed Membership Models (MMMs) are a popular family of latent structure models for complex multivariate data. Instead of forcing each subject to belong to a single cluster, MMMs incorporate a vector of subject-specific weights characterizing partial membership across clusters. With this flexibility come challenges in uniquely identifying, estimating, and interpreting the parameters. In this article, we propose a new class of "Dimension-Grouped" MMMs (Gro-M³s) for multivariate categorical data, which improve parsimony and interpretability. In Gro-M³s, observed variables are partitioned into groups such that the latent membership is constant for variables within a group but can differ across groups. Traditional latent class models are obtained when all variables are in one group, while traditional MMMs are obtained when each variable is in its own group. The new model corresponds to a novel decomposition of probability tensors. Theoretically, we derive transparent identifiability conditions for both the unknown grouping structure and model parameters in general settings. Methodologically, we propose a Bayesian approach for Dirichlet Gro-M³s to infer the variable grouping structure and estimate model parameters. Simulation results demonstrate good computational performance and empirically confirm the identifiability results. We illustrate the new methodology through applications to a functional disability survey dataset and a personality test dataset.
Published: 2023

20. MOOCs as a Research Agenda: Changes over Time

Author: Zhang, Shunan, Che, ShaoPeng, Nan, Dongyan, and Kim, Jang Hyun
Abstract: MOOCs (massive open online courses) have attracted considerable attention from researchers. Fueled by constant change and developments in educational technology, the trends of MOOCs have varied greatly over the years. To detect and visualize the developments and changes in MOOC research, 4,652 articles published between 2009 and 2021 were retrieved from Web of Science and Scopus with the aid of CiteSpace. This study sought to explore the number of publications, co-citation network, cluster analysis, timeline analysis, burstness analysis, and dual-map overlays based on co-citation relationships. The first finding was that the number of publications on MOOCs had increased consistently, and grew especially quickly between 2013 and 2015. Second, the main topic of the top 10 co-cited studies revolved around the problem of learner continuance. Third, blended programs, task-technology fit, and comparative analysis have emerged as popular subjects. Fourth, the development of MOOC research has followed distinct phases, with 2009 to 2012 the starting phase, 2013 to 2015 the high growth phase, 2016 to 2018 the plateau phase, and 2019 to 2021 another peak phase. Lastly, both cluster analysis and dual-map overlays provided empirical evidence of cross-disciplinary research. Our findings provided an in-depth and dynamic understanding of the development and evolution of MOOC research and also proposed novel ideas for future studies.
Published: 2022

21. Knowledge Map Construction Based on Association Rule Mining Extending with Interaction Frequencies and Knowledge Tracking for Rules Cleaning

Author: Jing Fang, Xiong Xiao, Xiuling He, Yangyang Li, Huanhuan Yuan, and Xiaomin Jiao
Abstract: Knowledge maps are teaching tools that can promote deep learning and prevent knowledge loss by helping students plan learning paths. Mining potential association rules between concepts from student exercise data is a common method for constructing knowledge maps automatically. However, manual conditions must be set to further filter the association rules and improve the accuracy of the knowledge maps, so the construction cannot be fully automatic. This study therefore proposes a knowledge map construction method that combines knowledge tracking with association rule mining extended with interaction frequencies, based on exercise data, to clean rules automatically. The method first predicts students' knowledge mastery sequences with a deep knowledge tracking model and discovers clustering relations that represent potential structures between concepts through fuzzy cluster analysis. Meanwhile, the method applies association rule mining extended with interaction frequencies to discover association rules between concepts. Finally, the clustering relations are used to filter the mined association rules automatically. To verify the effectiveness of the method, we constructed a knowledge map based on 34,350 online exercise records from 117 students in a computer programming course. Experimental results proved that the map was valid. Our implementations are available at https://github.com/xxdmw/FPGF-master.
Published: 2024
Full Text: View/download PDF

22. Utilizing Visuals and Information Technology in Mathematics Classrooms. Advances in Educational Technologies and Instructional Design (AETID) Book Series

Author: Hiroto Namihira
Abstract: Academic scholars face a difficult challenge when attempting to grasp the intricate world of mathematics. The complexity of mathematical concepts often lies hidden beneath layers of formulas and procedures, obscuring their true essence. Traditional educational resources often fall short in conveying the profound meaning behind these concepts, leaving students and scholars feeling overwhelmed and irritated. Furthermore, the integration of information technology (IT) with mathematics remains an underexplored frontier, preventing the development of logical insights from arbitrary initial conditions. As a result, there is an urgent need for a solution that can bridge these gaps and offer an innovative approach to learning mathematics. "Utilizing Visuals and Information Technology in Mathematics Classrooms" is a comprehensive and innovative solution to the challenges faced by academic scholars in the field of mathematics. This book takes a bold step in addressing these issues by offering a unique approach: visualization. By harnessing the power of visual representation, we transform complex mathematical concepts into easily understandable images, making the transition from initial states to final states of these crucial ideas visually intuitive. "Utilizing Visuals and Information Technology in Mathematics Classrooms" not only simplifies the learning process but also sets the stage for a paradigm shift by effectively merging education and IT, creating a forward-thinking approach that is poised to reshape the world of academia.
Published: 2024
Full Text: View/download PDF

23. A Scalable Parallel Processing Design for the Data Washing Machine: An Unsupervised Entity Resolution System

Author: Nicholas Kofi Akortia Hagan
Abstract: Entity Resolution (ER) has been one of the bedrocks in the creation of information systems by ensuring ambiguous entities are identified and resolved by linking. One common design approach of traditional ER systems is to run in single-threaded mode, which makes the system prone to out-of-memory errors when processing larger datasets. The Data Washing Machine (DWM), a proof-of-concept of an unsupervised cluster ER system, is no exception to this common design bottleneck. The original prototype design of the DWM requires shared memory tables and dictionaries of tokens, and its single-threaded nature makes it not scalable, hence not viable for real-world application. Distributed and parallel programming frameworks such as Hadoop MapReduce (MR) and Apache Spark's Resilient Distributed Datasets (RDD) are great fits for scaling ER systems since the comparison of equivalent pairs is independent and can occur in parallel. This dissertation aims at designing and developing a Distributed DWM by adopting the parallel and distributed capability of Hadoop MR and RDD. An initial prototype (HadoopDWM) was developed using Hadoop MR, which was further refactored into SparkDWM using RDD. Experimental results show that HadoopDWM and SparkDWM get the same results as the legacy DWM using optimal starting parameters. A scalability test conducted using 203 million records confirms that HadoopDWM and SparkDWM are scalable, with a total execution time of 7 and 3 hours, respectively. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone (800) 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
Published: 2024

24. Assessment of Students' Academic Performance in Clothing and Textile in Tertiary Institutions Using ANN and ANOVA Techniques

Author: Juliana Ego Azonuche, Juliet Obiageli Okoruwa, Comfort Ukrajit Sonye, and Gbenga Samuel Oladosu
Abstract: The performance history of 277 students in clothing and textile from two tertiary institutions in southern Nigeria was studied by artificial neural networks (ANN) and analysis of variance (ANOVA) in terms of institution, gender, ordinary level (O-level) qualification, marital status, and age. The study was guided by five research questions and five hypotheses tested at the 0.05 level of significance. ANOVA is utilised to identify significant differences in academic performance among groups formed by the aforementioned factors. The most significant factors identified through ANOVA are used as input features for the ANN model. The dataset for the ANN model development was randomly distributed into three groups: training (80%), validation (10%), and testing (10%). Hypothesis testing indicates significant differences in students' academic performance between institutions and based on O-level qualifications. Further research can build upon these findings to enhance the quality of education in the field of clothing and textiles.
Published: 2024
Full Text: View/download PDF

25. The Application of Big Data in the Management of Ideological and Political Education in the Development of Education Network

Author: Huichao Li and Dan Li
Abstract: Based on a brief analysis of the current situation of university education management and research on intelligent algorithms, this article constructs a university education management system based on big data. For the clustering and prediction modules in higher education management, corresponding algorithms are used for optimization design. A fuzzy clustering algorithm based on entropy weight is proposed to address the shortcomings of the C-means clustering algorithm. This algorithm adds weighting coefficients on the basis of improvement and continuously updates the clustering centers. The prediction module uses Apriori algorithm to map and compress the target transaction database, reduces the number of candidate item sets through pruning process, and designs simulation experiments to measure the performance of the algorithm. The simulation results show that the clustering results of this algorithm are closer to the actual clustering situation, with shorter running time and better algorithm performance.
Published: 2024
Full Text: View/download PDF

26. Shared Shrinkage Horseshoe Priors for Dirichlet-Tree Multinomial Regression

Author: Erin W. Post
Abstract: Multivariate count data is ubiquitous in many areas of research including the physical, biological, and social sciences. These data are traditionally modeled with the Dirichlet Multinomial distribution (DM). A new, more flexible Dirichlet-Tree Multinomial (DTM) model is gaining in popularity. Here, we consider Bayesian DTM regression models. Our contribution is to introduce a novel shared shrinkage prior for use on the regression coefficients. The proposed prior enables branches in the Dirichlet trees to borrow information from one another, which encourages similar levels of shrinkage on covariates throughout the model. We focus on modeling multivariate count data in social and community settings like educational outcomes, crime rates, and voting data. In these settings, the interpretation of the regression coefficients is of particular interest. With that in mind, we pay special attention to both the interpretation of model parameters and the process of selecting a tree structure. A simulation study demonstrates the benefits of our proposed shared shrinkage prior against existing alternatives. We show the usefulness of the proposed prior in the analyses of two real datasets. The first dataset examines connections between household conditions and post-graduation intentions for high school seniors in Iowa's public school districts. The second dataset looks at the relationship between community characteristics and instances of different categories of crimes as reported by the FBI. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone (800) 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
Published: 2024

27. Nonparametric Identification of Causal Effects in Clustered Observational Studies with Differential Selection

Author: Ting Ye, Ted Westling, Lindsay Page, and Luke Keele
Abstract: The clustered observational study (COS) design is the observational study counterpart to the clustered randomized trial. In a COS, a treatment is assigned to intact groups, and all units within the group are exposed to the treatment. However, the treatment is non-randomly assigned. COSs are common in both education and health services research. In education, treatments may be given to all students within some schools but withheld from all students in other schools. In health studies, treatments may be applied to clusters such as hospitals or groups of patients treated by the same physician. In this manuscript, we study the identification of causal effects in clustered observational study designs. We focus on the prospect of differential selection of units to clusters, which occurs when the units' cluster selections depend on the clusters' treatment assignments. Extant work on COSs has made an implicit assumption that rules out the presence of differential selection. We derive the identification results for designs with differential selection and show that contexts with differential cluster selection require different adjustment sets than standard designs. We outline estimators for designs with and without differential selection. Using a series of simulations, we outline the magnitude of the bias that can occur with differential selection. We then present two empirical applications focusing on the likelihood of differential selection. [This is the online version of an article published in "Journal of the Royal Statistical Society, Series A: Statistics in Society."]
Published: 2024
Full Text: View/download PDF

28. Approximate Balancing Weights for Clustered Observational Study Designs

Author: Eli Ben-Michael, Lindsay Page, and Luke Keele
Abstract: In a clustered observational study, a treatment is assigned to groups and all units within the group are exposed to the treatment. We develop a new method for statistical adjustment in clustered observational studies using approximate balancing weights, a generalization of inverse propensity score weights that solve a convex optimization problem to find a set of weights that directly minimize a measure of covariate imbalance, subject to an additional penalty on the variance of the weights. We tailor the approximate balancing weights optimization problem to the clustered observational study setting by deriving an upper bound on the mean square error and finding weights that minimize this upper bound, linking the level of covariate balance to a bound on the bias. We implement the procedure by specializing the bound to a random cluster-level effects model, leading to a variance penalty that incorporates the signal-to-noise ratio and penalizes the weight on individuals and the total weight on groups differently according to the intra-class correlation. [This is the online first version of an article published in "Statistics in Medicine."]
Published: 2024
Full Text: View/download PDF

29. Multivariate Knowledge Tracking Based on Graph Neural Network in ASSISTments

Author: Zhenchang Xia, Nan Dong, Jia Wu, and Chuanguo Ma
Abstract: As an excellent means of improving students' effective learning, knowledge tracking can assess the level of knowledge mastery and discover latent learning patterns based on students' historical learning evaluation of related questions. The advantage of knowledge tracking is that it can better organize and adjust students' learning plans, provide personalized guidance, and thus achieve the purpose of artificial intelligence-assisted education. However, existing methods, such as convolutional knowledge tracing, lack consideration of graph structure and multivariate time-series prediction, resulting in poor prediction accuracy. Inspired by recent successes of the graph neural network (GNN), we present a novel multivariate graph knowledge tracking (MVGKT) framework to address these limitations. Specifically, a multivariate time-series knowledge tracking system based on spatio-temporal GNN is designed to model student learning trajectories in different spatial and temporal dimensions and capture both temporal dependencies and interstudent correlations. MVGKT incorporates a gated recurrent unit attentive mechanism and graph Fourier transform, discrete Fourier transform, and graph convolution network to increase the predictive accuracy of student performance. In addition, we design a question difficulty extraction system to obtain information on the difficulty of the questions and thus enhance the data features. Numerous experiments on the ASSISTments dataset have demonstrated that MVGKT is superior to existing knowledge-tracking methods on four metrics and have shown that our approach can enhance the predictive accuracy of student performance.
Published: 2024
Full Text: View/download PDF

30. From Aggregations to Multimethod Case Configurations: Case Diversity in Quantitative Analysis When Explaining COVID-19 Fatalities

Author: Philip Haynes and David Alemna
Abstract: Three quantitative methods are compared for their ability to understand different COVID-19 fatality ratios in 33 OECD countries. Linear regression provides a limited overview without sensitivity to the diversity of cases. Cluster Analysis and Dynamic Patterns Synthesis (DPS) gives scrutiny to the granularity of case similarities and differences, and reveals case exceptions. Qualitative Comparative Analysis (QCA) develops causal theory about what conditions are sufficient for explaining outcomes by using robust and transparent conventions. Configurational case-based methods offer important advantages over inferential statistics when there is a need to focus on diversity in small n. These techniques can be combined as multi-methods. DPS and QCA can be used concurrently to aid research insights. These methods are also strengthened by additional qualitative evidence about the cases.
Published: 2024
Full Text: View/download PDF

31. Item Response Theory Models for Difference-in-Difference Estimates (And Whether They Are Worth the Trouble)

Author: James Soland
Abstract: When randomized control trials are not possible, quasi-experimental methods often represent the gold standard. One quasi-experimental method is difference-in-difference (DiD), which compares changes in outcomes before and after treatment across groups to estimate a causal effect. DiD researchers often use fairly exhaustive robustness checks to make sure the assumptions of the DiD are met. However, less thought is often put into the approach to score item responses from the outcome measure used. For example, surveys are often scored by adding up the item responses to produce sum scores, and achievement tests often rely on scores produced by test vendors, which frequently employ a unidimensional item response theory (IRT) scoring model that implicitly assumes control and treatment participants are exchangeable (i.e., that they come from the same distribution). In this study, several IRT models that parallel the DiD design in terms of groups and timepoints are presented, and their performance is examined. Results indicate that using a scoring approach that parallels the DiD study design reduces bias and improves power, though these approaches can also lead to increased Type I error rates.
Published: 2024
Full Text: View/download PDF

32. Active-Learning Class Activities and Shiny Applications for Teaching Support Vector Classifiers

Author: Qing Wang and Xizhen Cai
Abstract: Support vector classifiers are one of the most popular linear classification techniques for binary classification. Different from some commonly seen model fitting criteria in statistics, such as the ordinary least squares criterion and the maximum likelihood method, its algorithm depends on an optimization problem under constraints, which is unconventional to many students in a second or third course in statistics or data science. As a result, this topic is often not as intuitive to students as some of the more traditional statistical modeling tools. In order to facilitate students' mastery of the topic and promote active learning, we developed some in-class activities and their accompanying Shiny applications for teaching support vector classifiers. The designed course materials aim at engaging students through group work and solidifying students' understanding of the algorithm via hands-on explorations. The Shiny applications offer interactive demonstration of the changes of the components of a support vector classifier when altering its determining parameters. With the goal of benefiting the broader statistics and data science education community, we have made the developed Shiny applications publicly available. In addition, a detailed in-class activity worksheet and a real data example are also provided in the online supplementary materials.
Published: 2024
Full Text: View/download PDF

33. Closing 'Reporting' Gaps: A Comparison of Methods for Estimating Unreported Subgroup Achievement on NAEP

Author: David Bamat
Abstract: The National Assessment of Educational Progress (NAEP) program only reports state-level subgroup results if it samples at least 62 students identifying with the subgroup. Since some subgroups constitute small proportions of many states' general student populations, these minority subgroups are seldom sufficiently sampled to meet this sample size requirement. Consequently, education researchers and policymakers are regularly left without a comprehensive understanding of how states are supporting the learning and achievement of different subgroups of students, including underserved subgroups. Using grade 8 mathematics results in 2015, this study compares the performance of two separate techniques in predicting mean subgroup achievement on NAEP. Results demonstrate that a small area estimation technique, known as the Fay-Herriot (FH) model, represents a promising approach. Follow-up research should employ the FH technique with math achievement data from other testing years, as well as data from NAEP reading to evaluate the technique's generalizability.
Published: 2024
Full Text: View/download PDF

34. Bayesian Structural Equation Models of Correlation Matrices

Author: James Ohisei Uanhoro
Abstract: We present a method for Bayesian structural equation modeling of sample correlation matrices as correlation structures. The method transforms the sample correlation matrix to an unbounded vector using the matrix logarithm function. Bayesian inference about the unbounded vector is performed assuming a multivariate-normal likelihood, with a mean based on the transformed model-implied correlation matrix, and a covariance assumed to be of known form. Using Monte Carlo simulation, we examine the performance of the method with normal and ordinal indicators, as well as the capacity of the method to estimate models that account for misspecification. The performance of the approach is often adequate suggesting that the proposed method can be used for Bayesian analysis of correlation structures. We conclude with a discussion of potential applications of the approach, as well as future directions needed to further develop the method.
Published: 2024
Full Text: View/download PDF

35. Circumplex Models with Multivariate Time Series: An Idiographic Approach

Author: Dayoung Lee, Guangjian Zhang, and Shanhong Luo
Abstract: The circumplex model posits a circular representation of affect and some personality traits. There is an increasing need to examine the viability of the circumplex model with multivariate time series data collected on the same individuals due to the development of new data collection methods such as smartphone applications and wearable sensors. Estimating the circumplex model with time series data is more complex than with cross-sectional data because scores at nearby time points tend to be correlated. We adapt Browne's circumplex model to accommodate time series data. We illustrate the proposed method with an empirical data set of daily affect ratings of an individual over 70 days. We conducted a simulation study to explore the statistical properties of the proposed method. The results show that the method provides more satisfactory confidence intervals and test statistics than a method that treats time series data as if they were cross-sectional data.
Published: 2024
Full Text: View/download PDF

36. Benefits of Doing Generalizability Theory Analyses within Structural Equation Modeling Frameworks: Illustrations Using the Rosenberg Self-Esteem Scale

Author: Walter P. Vispoel, Hyeri Hong, and Hyeryung Lee
Abstract: Although generalizability theory (GT) designs typically are analyzed using analysis of variance (ANOVA) procedures, they also can be integrated into structural equation models (SEMs). In this tutorial, we review basic concepts for conducting univariate and multivariate GT analyses and demonstrate advantages of doing such analyses within SEM frameworks using multi-occasion data from the Rosenberg Self-Esteem Scale. We show how GT-SEMs can reproduce variance components for both relative and absolute error obtained from ANOVA models, estimate effects of changes made to measurement procedures and universes of generalization, incorporate estimation methods to correct for scale coarseness, represent essential tau-equivalent or congeneric relationships, include additional method factors for negatively and positively worded items, incorporate bifactor designs, allow for formal tests of model fit when warranted, and derive Monte Carlo confidence intervals for key parameters of interest. We provide code for conducting the demonstrated analyses using several statistical packages in extended online Supplemental Material.
Published: 2024
Full Text: View/download PDF

37. Analyzing Multivariate Generalizability Theory Designs within Structural Equation Modeling Frameworks

Author: Walter P. Vispoel, Hyeryung Lee, and Hyeri Hong
Abstract: We demonstrate how to analyze complete multivariate generalizability theory (GT) designs within structural equation modeling frameworks that encompass both individual subscale scores and composites formed from those scores. Results from numerous analyses of observed scores obtained from respondents who completed the recently updated form of the Big Five Inventory (BFI-2) revealed that the "lavaan" SEM package in R produced results virtually identical to those obtained from the "mGENOVA" package, which historically has served as the gold standard for conducting multivariate GT analyses. We further extended "lavaan" analyses beyond what "mGENOVA" allows to produce Monte Carlo based confidence intervals for key GT parameters and correct score consistency and correlational indices for effects of scale coarseness characteristic of binary and ordinal data. Our comprehensive online Supplemental Material includes code for performing all illustrated analyses using "lavaan" and "mGENOVA."
Published: 2024
Full Text: View/download PDF

38. How to Evaluate Causal Dominance Hypotheses in Lagged Effects Models

Author: Chuenjai Sukpan and Rebecca M. Kuiper
Abstract: The (Random Intercept) Cross-Lagged Panel Model ((RI-)CLPM) is increasingly used in psychology and related fields to assess the longitudinal relationship of two or more variables on each other. The question of which of the lagged effects is causally dominant receives considerable attention. However, currently used methods do not allow for the evaluation of causal dominance hypotheses. This paper will show the performance of the Generalized Order-Restricted Information Criterion Approximation (GORICA), an extension of Akaike's Information Criterion (AIC), in the context of causal dominance hypotheses using a simulation study. The GORICA proves to be an adequate method to evaluate causal dominance in lagged effects models.
Published: 2024
Full Text: View/download PDF

39. Recommended Practices in Latent Class Analysis Using the Open-Source R-Package tidySEM

Author: C. J. Van Lissa, M. Garnier-Villarreal, and D. Anadria
Abstract: Latent class analysis (LCA) refers to techniques for identifying groups in data based on a parametric model. Examples include mixture models, LCA with ordinal indicators, and latent class growth analysis. Despite its popularity, there is limited guidance with respect to decisions that must be made when conducting and reporting LCA. Moreover, there is a lack of user-friendly open-source implementations. Based on contemporary academic discourse, this paper introduces recommendations for LCA which are summarized in the SMART-LCA checklist: Standards for More Accuracy in Reporting of different Types of Latent Class Analysis. The free open-source R-package "tidySEM" implements the practices recommended here. It is easy for beginners to adopt thanks to user-friendly wrapper functions, and yet remains relevant for expert users as its models are integrated within the "OpenMx" structural equation modeling framework and remain fully customizable. The Appendices and "tidySEM" package vignettes include tutorial examples of common applications of LCA.
Published: 2024
Full Text: View/download PDF

40. The Influence of Internet Environment Health on College Pupils' Ideological and Moral Education and Its Promotion

Author: Juanjuan Niu
Abstract: The internet, which is constantly advancing in technology, together with the rapidly changing internet communication technology terminals, has formed a new internet media, which has penetrated into all fields of human material life and spiritual life. This article proposes a design scheme for optimizing the impact of internet environment health on college pupils' ideological and moral education and the promotion path. It summarizes the influencing factors of contemporary college pupils' ideological and moral education through cluster analysis, optimizes the factors using the Apriori algorithm in data mining, and realizes the promotion of the path to solve problems. Finally, it carries out simulation testing and analysis. In order to promote the effective development of college pupils' ideological and political education, we should strengthen internet management, purify the internet environment, strengthen the construction of "red websites," and enhance their attractiveness.
Published: 2024
Full Text: View/download PDF

41. Establishment and Practice of Physical Education Evaluation Using Grey Cluster Analysis under the Data Background

Author: Jinxin Jiang and Sang Keon Yoo
Abstract: This article uses scientific methods and means to evaluate the value, elements, and processes of physical education, consistent with preset evaluation indicators through sample calculation, and then derives the characteristics of decision-making, the objectivity of indicators, the order, and other characteristics of the process. The authors have analyzed the main problems in current physical-education evaluation and its future reform and development trends under the requirements of quality education.
Published: 2024
Full Text: View/download PDF

42. The Use of Latent Class Analysis (LCA) to Assess Children's Movement Behaviours Measured by Accelerometer and Self-Report

Author: Isabella Toledo Caetano, Valter Paulo Neves Miranda, Fernanda Rocha de Faria, Cheryl Anne Howe, and Paulo Roberto dos Santos Amorim
Abstract: Latent Class Analysis (LCA) is a statistical method that can help researchers interested in better understanding the movement behaviors (MB) of children, based on the analysis of the level of physical activity (PA) and sedentary time (ST). This study aimed to evaluate and compare two LCA models (one for the accelerometer and one for the 24-hour recall) that represent the MB of children. This cross-sectional study involved 101 10-year-old Brazilian children. The classes were based on vigorous PA (VPA), moderate PA (MPA), light PA (LPA), and sedentary time (ST). To assess these behaviors, a 24-hour recall and an accelerometer were used. The accelerometer was used for four days. The time spent on each of the MB was categorized dichotomously, based on the 25th and 75th percentiles. Thus, "adequate" times were considered when the ST was below 25thP, the LPA and MPA were above 25thP and the VPA was above 75thP. LCA was used to model the variable "MB." For each latent class model (accelerometer and recall), two classes were found: "Adequate MB" and "Inadequate MB." Regardless of the method (accelerometer or self-report), the values of the "Inadequate MB" class had higher prevalence. Self-report predicted higher PA and lower ST compared to the accelerometer. The model based on accelerometry revealed that girls were 2.11 times more likely to belong to the "Inadequate MB" class when compared to boys. LCA was a multivariate statistical method that allowed the integrated evaluation of parameters that represent the MB analyzed by device-based and self-report methods.
Published: 2024
Full Text: View/download PDF

43. Using Functional Clustering to Diagnose Person Misfit

Author: Kyle T. Turner and George Engelhard
Abstract: The purpose of this study is to demonstrate clustering methods within a functional data analysis (FDA) framework for identifying subgroups of individuals that may be exhibiting categories of misfit. Person response functions (PRFs) estimated within a FDA framework (FDA-PRFs) provide graphical displays that can aid in the identification of persons that have responded unexpectedly to items comprising an achievement test. Typical person fit statistics are also useful for detecting unexpected response patterns, but they do not provide insight into the underlying behaviors responsible for those responses. However, different responding behaviors tend to produce FDA-PRFs of different shapes, and may provide additional information regarding the reasons for misfit. Functional clustering methods are useful for categorizing respondents into subgroups based on the shapes of their FDA-PRFs. In this study, a small simulation illustrates the potential of clustering FDA-PRFs for identifying persons displaying common types of responding behaviors. The methodology is also applied to data from a high school biology assessment and a mathematics achievement test. Clustering FDA-PRFs offers a promising methodology for operationalizing person fit evaluations in large-scale assessments, and may be a valuable step in person fit assessment when used in conjunction with traditional indices of psychometric quality.
Published: 2024
Full Text: View/download PDF

44. Fair Multivariate Adaptive Regression Splines for Ensuring Equity and Transparency

Author: Parian Haghighat, Denisa Gandara, Lulu Kang, and Hadis Anahideh
Abstract: Predictive analytics is widely used in various domains, including education, to inform decision-making and improve outcomes. However, many predictive models are proprietary and inaccessible for evaluation or modification by researchers and practitioners, limiting their accountability and ethical design. Moreover, predictive models are often opaque and incomprehensible to the officials who use them, reducing their trust and utility. Furthermore, predictive models may introduce or exacerbate bias and inequity, as they have done in many sectors of society. Therefore, there is a need for transparent, interpretable, and fair predictive models that can be easily adopted and adapted by different stakeholders. In this paper, we propose a fair predictive model based on multivariate adaptive regression splines (MARS) that incorporates fairness measures in the learning process. MARS is a non-parametric regression model that performs feature selection, handles non-linear relationships, generates interpretable decision rules, and derives optimal splitting criteria on the variables. Specifically, we integrate fairness into the knot optimization algorithm and provide theoretical and empirical evidence of how it results in a fair knot placement. We apply our "fair"MARS model to real-world data and demonstrate its effectiveness in terms of accuracy and equity. Our paper contributes to the advancement of responsible and ethical predictive analytics for social good. [This paper was presented at an Association for the Advancement of Artificial Intelligence conference.]
Published: 2024

45. Building Intrapersonal Competencies in the First-Year Experience: Utilizing Random Forest, Cluster Analysis, and Linear Regression to Identify Students' Strengths and Opportunities for Institutional Improvement

Author: Bresciani Ludvik, Marilee, Zhang, Shiming, Kahn, Sandra, Potter, Nina, Richardson-Gates, Lisa, Schellenberg, Stephen, Saiki, Robyn, Subedi, Nasima, Harmata, Rebecca, Monzon, Rey, Timm, Randy, Stronach, Jeanne, and Jost, Anna
Abstract: In seeking to close equity gaps within a first-year student seminar course, course designers leveraged emerging research on intrapersonal competency cultivation, known to significantly predict student success across diverse students (NAS, 2018). After re-designing the course to intentionally cultivate specific intrapersonal competencies, researchers set out to explore how well the course closed historical institutional equity gaps as measured by end-of-term GPA. Over four years of data collection and course refinement, traditional regression analyses were useful for informing course improvements that resulted in the closing of some equity gaps. However, students were still being placed on academic probation and certain identities of students were over-represented in academic probation numbers. As such, the team utilized random forest, cluster analysis, and then regression analysis that allowed them to focus improvement efforts on a cluster of students that would have otherwise remained unidentified through traditional analysis measures.
Published: 2022

46. Using the Model of Benchmarking of Educational Services in a Socially Responsible Education-Innovation Cluster during the COVID-19 Pandemic

Author: Shcherbak, Valeriia, Ganushchak-Yefimenko, Liudmyla, Nifatova, Olena, Shatska, Zoryna, Radionova, Natalia, Danko, Yuriy, and Yatsenko, Valentyna
Abstract: The purpose of the study is to substantiate the feasibility of using the model of benchmarking of educational services in the socially responsible educational-innovative cluster in the context of the COVID-19 pandemic. Specifically, the authors focus on justifying a "nuclear" approach to cluster formation. The process-oriented benchmarking model was used to apply best practices of providing higher educational services in the context of the COVID-19 pandemic. The method of factor analysis was used to determine the impact of each of the 4P subsystems of benchmarking. Using the method of benchmarking makes it possible to develop a final competitive product in the context of the COVID-19 pandemic: an educational service for all industry and territorial stakeholders. In the light of the COVID-19 pandemic, the formation of socially responsible educational-innovative clusters in Ukraine is one of the most promising and effective trends to modernize the provision of educational services. The uniqueness of the educational service is created by strengthening and synergizing the competitive advantages and competencies of all participants in the educational cluster: teachers, students, employers, research staff, the local community.
Published: 2022

47. Investigating Online Learners' Knowledge Structure Patterns by Concept Maps: A Clustering Analysis Approach

Author: He, Xiuling, Fang, Jing, Cheng, Hercy N. H., Men, Qibin, and Li, Yangyang
Abstract: A deep understanding of the learning level of online learners is a critical factor in promoting the success of online learning. Using knowledge structures as a way to understand learning can help analyze online students' learning levels. The study used concept maps and clustering analysis to investigate online learners' knowledge structures in a flipped classroom's online learning environment. Concept maps (n = 359) constructed by 36 students during one semester (11 weeks) through the online learning platform were collected as analysis objects of learners' knowledge structures. Clustering analysis was used to identify online learners' knowledge structure patterns and learner types, and a non-parametric test was used to analyze the differences in learning achievement among learner types. The results showed that (1) there were three online learners' knowledge structure patterns of increasing complexity, namely, spoke, small-network, and large-network patterns. Moreover, online learners with novice status mostly had spoke patterns in the context of flipped classrooms' online learning. (2) Two types of online learners were found to have different distributions of knowledge structure patterns, and the complex knowledge structure type of learners exhibited better learning achievement. The study explored a new way for educators to analyze knowledge structures by data mining automatically. The findings provide evidence in the online learning context for the relationship between complex knowledge structures and better learning achievement while suggesting the existence of inadequate knowledge preparedness for flipped classroom learners without a special instructional design.
Published: 2023
Full Text: View/download PDF

48. Explained: Artificial Intelligence for Propensity Score Estimation in Multilevel Educational Settings

Author: Collier, Zachary K., Zhang, Haobai, and Liu, Liu
Abstract: Although educational research and evaluation generally occur in multilevel settings, many analyses ignore cluster effects. Neglecting the nature of data from educational settings, especially in non-randomized experiments, can result in biased estimates with long-term consequences. Our manuscript improves the availability and understanding of artificial neural networks, an underutilized method trending in other disciplines. This method also shows promise for dealing with challenges faced by educational researchers, such as analyzing clustered data. Therefore, we simulated data to generalize the potential benefits of artificial neural networks to different data types. We also compared artificial neural networks to more familiar methods and investigated the time it demanded to perform each technique. Hence, readers can decide when it may be more appropriate to use one method instead of another.
Published: 2022

49. Differential Item Functioning across Gender with MIMIC Modeling: PISA 2018 Financial Literacy Items

Author: Saatcioglu, Fatima Munevver
Abstract: The aim of this study is to investigate the presence of DIF over the gender variable with the latent class modeling approach. The data were collected from 953 students who participated in the PISA 2018 8th-grade financial literacy assessment in the USA. The Latent Class Analysis (LCA) approach was used to identify the latent classes, and the data fit the three-class model better in line with fit indices. In order to obtain more information about the characteristics of the emerging classes, uniform and non-uniform DIF sources were identified by using the Multiple Indicator Multiple Causes (MIMIC) model. The findings are very important in terms of contributing to the interpretation of latent classes. According to the results, the gender variable was a source of DIF for latent classes. It is important to include direct effects by gathering unbiased estimates for the measurement and structural parameters. Disregarding these effects can lead to incorrect identification of implicit classes. A sample application of the MIMIC model was performed in a latent class framework with a stepwise approach in this study.
Published: 2022

50. An Examination of the Studies on STEM in Education: A Bibliometric Mapping Analysis

Author: Tas, Nurullah and Bolat, Yusuf Islam
Abstract: This research aims to propose a bibliometric map of studies on the use of STEM in education. This study used publication co-citation analysis, author co-citation analysis, and word frequency analysis methods to reveal the structure and transformation of STEM literature. Descriptive data such as the distribution of studies in the field by country, institution, and time were obtained from the Web of Science (WoS) database. We used the RStudio program for bibliometric analysis. The International Journal of STEM Education is the most widely published. Journal of Science Education and Technology received the most citations. Guzey S.S. is the author with the most publications on the subject. Capraro M.M. is the most cited author. The university that publishes the most is Purdue University. The USA is the country with the highest number of publications. The paper by Blickenstaff titled Gender and Education in 2005 has been cited the most worldwide. The research by Maltese, which was published in 2019, had the highest local citation rate. According to the results of cluster analysis, four clusters were formed. The term "STEM" appears to be present in all clusters. "STEM Education" was also included in three clusters.
Published: 2022
