178,276 results on '"MEASUREMENT"'
Search Results
152. The Examination of 8th Grade African American Students' Performance on the SC Ready State Assessment Based on Predominantly African American Schools versus Predominantly White Schools
- Author
-
Tyrone Cummings
- Abstract
The causal-comparative research study examined whether academic achievement differs between African American 8th-grade students in schools where they constitute the majority versus the minority in Clarendon County, South Carolina. The research questions investigated 8th-grade African American students' academic performance in English Language Arts and Mathematics, analyzing SC Ready test scores across schools where they make up either the majority or minority of the population. The study utilized a causal-comparative design across schools with predominantly African American and predominantly White student populations, examining SC Ready scores in English Language Arts and Mathematics. The study found no statistically significant differences in SC Ready scores between African American students attending predominantly African American schools and those in predominantly White schools for both ELA and Mathematics. These results suggest that school demographic composition alone does not significantly influence the academic performance of African American 8th-grade students in Clarendon County. The study highlights the importance of exploring additional factors beyond racial composition that may impact educational outcomes for these students. Future research could investigate the potential effects of diverse teaching and administrative staff, examining whether exposure to educators from various racial backgrounds contributes to improved academic achievement among African American students. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone (800) 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
- Published
- 2024
153. An Examination of Variability: Using Construct Measurement to Develop an Interdisciplinary Assessment
- Author
-
Fonya Crockett Scott
- Abstract
In an increasingly data-driven society, making decisions with data is a requirement not only for research scientists but also for navigating life more generally (Kjelvik & Schultheis, 2019). An informed citizenry must be prepared to discover patterns, predict outcomes, make decisions with data, and evaluate data-based claims for legitimacy and applicability (Franklin & Bargagliotti, 2020). In United States K-12 classrooms, the expectation of using data to support claims is an essential part of research methods at each level of science instruction. In fact, NGSS standards include analyzing and working with data at every grade level. Usiskin and Hall suggested that for most K-12 students, making decisions based on statistics is discussed more often in science than in mathematics (2015). Statistics instruction, however, has been primarily the responsibility of mathematics teachers. Consequently, conventional instruction of statistics in K-12 mathematics often presents computational aspects of statistics and stops short of experiences that allow students to explore and apply statistical ideas of measuring variability in context. Independent of instructional support, it can be difficult for students to knit together these ideas of calculations and context across disciplinary boundaries of math and science (Lehrer & Schauble, 2004; Makar & Confrey, 2003). Assessments are a tool that could support developing statistical literacy, revealing students' construction and integration of knowledge and skills acquired across multiple subjects when focused on interdisciplinary ideas, such as variability. Using the four building blocks of Wilson's (2005) framework for constructing measures, I developed assessment instruments to assess patterns of student thinking about variability for 6th-grade mathematics and science teachers. 
The assessment instruments were designed to collect evidence of connections students make between quantitative and qualitative descriptions of variation by prompting students to utilize statistical and scientific ideas. These assessments encouraged student thinking rooted in both mathematics and science and provided teachers in both disciplines with a more complete representation of students' growing understanding. Thoughtful completion of the items required integration between science content knowledge and the understanding of the phenomenon arising from the data as students interpret and explain based on the contextualized data rather than simply focusing on computation or data manipulation (Ben-Zvi et al., 2012).
- Published
- 2024
154. Exploring Psychometric Properties and Determinants of PLAAFP Quality Scores
- Author
-
Christopher M. Claude
- Abstract
This dissertation comprises three complementary studies that aim to advance the understanding and practice of Individualized Education Programs (IEP) and Present Levels of Academic Achievement and Functional Performance (PLAAFP) development in special education. In the first study, we systematically reviewed empirical research measuring IEP quality, revealing heterogeneity in instruments used and a lack of standardized, replicated measures. In the second study, we established the reliability of scores obtained from a PLAAFP quality rubric, demonstrating strong interrater consistency (e.g., rank ICC = 0.91) and internal reliability (e.g., McDonald's ω = 0.82) through parametric and nonparametric analyses. In the third study, we investigated predictors of PLAAFP statement quality among graduate students. Results showed no significant impact of perceived school protocols but revealed improvements in PLAAFP scores after instruction (d = 0.59) and moderate social validity. Substantial variability across participants (45.1% ICC) and time points (10.1% ICC) highlighted individual differences and temporal dynamics influencing PLAAFP quality. Synthesizing findings across studies, key recommendations include: (a) replicating IEP quality measurement studies to accumulate reliability evidence; (b) establishing guidelines on sufficient reliability before implementing rubrics at scale; (c) emphasizing data skills in teacher preparation for composing high-quality IEPs; (d) exercising caution when standardizing IEP processes without reliability evidence. This dissertation provides a comprehensive examination of IEP and PLAAFP quality assessment, reliability, and determinants, informing teacher preparation practices and evidence-based policymaking to enhance educational outcomes for students with disabilities.
- Published
- 2024
155. Measuring the Impact of a Race-Based Student Empowerment Program on Student Retention and Degree Attainment
- Author
-
Jennie Towner
- Abstract
A concerning reality in higher education is the persistent disparity in retention and graduation rates between students of color and White students. These persistent outcome differences are commonly referred to as opportunity or achievement gaps. To address these gaps, higher education institutions are enacting targeted approaches to mitigate or eliminate opportunity gaps. Harford Community College (HCC), the institution included in this study, created a race-based student empowerment program in the fall of 2014, known as the My College Success Network (MCSN), to meet the College's strategic plan goal of eradicating achievement gaps due to race, income, gender, and ethnicity. To date, the effectiveness of the MCSN program in meeting the original goal of eradicating achievement gaps has not been determined. This study evaluates whether MCSN predicts improved retention and graduation rates for Black/African American students, the population of students at the institution where the largest opportunity gaps are observed. Retention and graduation were tracked among unique student cohorts over time, which eliminated the availability of a true control group when analyzing the impact of the program. Therefore, Comparative Interrupted Time Series (CITS) was used to compare the before-and-after changes in the outcomes for treatment and control groups to estimate the overall impact of the program. The study compared new Black/African American and White students from cohorts entering HCC between 2009 and 2018. Findings showed Black/African American participants in the MCSN academic coaching program were retained from fall-to-fall and graduated in three years at higher rates than Black/African American non-academic coaching participants. The graduation rate gap between Black/African American and White students decreased after MCSN program implementation in 2014.
- Published
- 2024
156. Modeling Computational Thinking Using Multidimensional Item Response Theory: Investigation into Model Fit and Measurement Invariance
- Author
-
Emily A. Brown
- Abstract
Previous research has been limited regarding the measurement of computational thinking, particularly as a learning progression in K-12. This study proposes to apply a multidimensional item response theory (IRT) model to a newly developed measure of computational thinking utilizing both selected response and open-ended polytomous items in order to establish the factorial structure of the construct, to apply the recently introduced composite and structured constructs models, and to investigate the measurement invariance of the assessment between males and females using the means and covariance structures (MACS) approach.
- Published
- 2024
157. Three Papers on Measuring Teaching Practice
- Author
-
Ying Chen
- Abstract
This three-paper dissertation systematically investigates the methodology issues related to using observational systems to observe, analyze, and interpret teachers' reformed teaching practices within the K-12 classroom context. Each paper contributes a distinct perspective to evaluate instructional practices accurately, consistently, and effectively in science classrooms. The first paper offers a systematic literature review on classroom observation protocols (OPs) used for evaluating science teaching practices. In this study, I proposed an analytical framework, underpinning four crucial aspects of OP studies: the research objectives, design, data collection strategies, and data analysis and interpretation. This framework serves as the cornerstone for analyzing 37 distinct studies that have used OPs to evaluate the instructional practices of K-12 science teachers. The study underscores the need for transparent rater training procedure, sampling choices, and the advanced statistical techniques to address rater-associated variances and eventually enhance the reliability and validity of the teacher measures. The study also suggests continued development of OPs that are aligned with current and future science education standards and the application of advanced statistical methods to examine rater effects in classroom observation studies. In the second paper, we developed a comprehensive classroom observational system, utilizing the Rasch model to validate the instrument, guide rater training, and analyze observational data. We implemented this system with 321 full-length high school chemistry classroom observation videos. We proposed this system, because the exclusive reliance on developing new OPs may not lead to valid and reliable measurements of instructional practices, as ratings are often affected by construct-irrelevant variances such as rater bias and data analysis strategies. 
This study shifts the focus from the OPs alone to a classroom observation system that includes instrument validation, rating training, and data analytical strategies. The psychometric evidence in this study supported this system's feasibility of yielding reliable measures of instructional practices. The third paper examines rater effects by applying a Partial Credit Many-Facet Rasch Measurement (PC-MFRM) in a science classroom observation context. As shown in the first two papers, classroom observation studies employing OPs are mediated by human raters, who are susceptible to errors. These errors introduce construct-irrelevant variance, which can adversely impact the validity and reliability of measurement of instructional practices. This study aims to identify and control these variances, particularly focusing on three rater effects: rater severity, central tendency, and the halo effect. I applied PC-MFRM to concurrently examine the three rater effects and their impacts on measurement of instructional practices. MFRM results indicated significant discrepancies in rater severity and identified the specific raters who showed central tendency and halo effects. The finding suggests that researchers can incorporate MFRM diagnostic information throughout the rating training and calibration process to reduce any potential halo and central tendency. To address rater severity, researchers should focus on intra-rater consistency as well as inter-rater reliability, rather than on inter-rater reliability alone. This study extended the application of MFRM to classroom observation by offering a multi-dimensional understanding of rater effects; thereby it contributed to the improvement of the rater training process, which can consequently enhance the reliability and validity in measuring teaching practice. The validity and reliability of classroom observations for evaluating teacher practices hinge considerably on rater effects and the systematic evaluation of the impact of rater-related effects.
While Observation Protocols (OPs) provide a structured framework for such evaluations, the literature indicates that rater behavior introduces construct-irrelevant variances such as rater severity, halo effect, and central tendency. Previous studies have employed the Many-Facet Rasch Model (MFRM) to examine some of the effects, but these efforts have predominantly focused on single-dimensional analyses. Addressing this research gap, the current study applies MFRM to concurrently investigate three critical rater effects (rater severity, central tendency, and halo effect) in the context of a longitudinal efficacy study on the Connected Chemistry Curriculum (CCC). Specifically, we aim to (1) assess the degree to which MFRM can identify the presence of these rater effects, and (2) evaluate the implications of using MFRM on enhancing rater training programs for classroom observations.
- Published
- 2024
158. Self-Directed Learning Research: A Systematic Review
- Author
-
Sara Nicole Reynolds
- Abstract
This study serves to collate and evaluate measures of self-directed learning (SDL), with the goal of guiding the measurement and discussion of SDL. Used in a variety of settings, many applications of SDL have been proposed, but a consistent definition has yet to be formulated. Despite the lack of a cohesive definition, several tools exist to measure SDL. Within this study, which implemented the preferred reporting items for systematic reviews and meta-analyses (PRISMA) and Consensus-Based Standards for the Selection of Health Status Measurement Instruments (COSMIN) protocols, 157 articles were analyzed for content and themes were identified. An important finding of this study was a definite lack of cohesion in application and understanding of SDL as a framework. While some regard it as a stand-alone learning intervention, others address it as a personality trait. Close examination of the instruments used to measure SDL led to the conclusion that it is both inappropriate and ineffective to continue using them, as they broadly lack construct validity and generalizability. Limitations of this study are single subject research, the number of studies available within the databases used, and the lack of raw data from the studies covered. Future research surrounding the conceptual framework and instrumentation is indicated to further develop the field's understanding of SDL's value and implications.
- Published
- 2024
159. Improving Cross-Cultural Comparability: Does School Leadership Mean the Same in Different Countries?
- Author
-
Nurullah Eryilmaz and Andres Sandoval Hernandez
- Abstract
Recently, there has been increasing interest in comparing educational leadership measures, such as principal school leadership, using International Large-Scale Assessments (ILSAs) data. However, there are doubts about the uniformity of measurement across countries participating in the ILSAs. There are concerns that the robustness and psychometric characteristics of measures are adversely affected by socio-cultural, economic, political, and linguistic diversity across countries. The current study examines the uniformity of cross-cultural model data for the "principal instructional leadership scale". The framework and data supplied by the Organization for Economic Cooperation and Development (OECD)'s Teaching and Learning International Survey are employed to estimate the conceptual measurement model and test measurement invariance across forty-eight countries. Countries are then divided into more homogenous groups, based on their socio-demographic characteristics, to test measurement invariance within these sub-groups. The results of this study reveal that, when testing for the forty-eight countries together, the scale measuring principals' school leadership is invariant across all countries only at an intermediate level (i.e. metric). This means the factor structures and the factor loadings are equivalent across countries, but the item intercepts are not. However, when testing within sub-groups, improvements in cross-cultural comparability are found. This paper concludes by making suggestions on scale improvement, discussing the implications of this study for policymaking and making recommendations for future research.
- Published
- 2024
160. A Geometric Journey toward Genuine Multipartite Entanglement
- Author
-
Songbo Xie
- Abstract
This thesis focuses on the challenge of characterizing multipartite entanglement. While the study of bipartite entanglement is well-documented in scientific literature, recognizing that entanglement can involve more than two parties (i.e., three or more parties) is crucial, as multipartite entanglement enables the completion of more complicated tasks in quantum information science. Previous discussions on entanglement, especially within scenarios such as information scrambling, primarily concentrated on bipartite entanglement, thus overlooking the rich landscape of multipartite entanglement. By involving more parties, multipartite entanglement exhibits a larger degree of nonlocality, significantly deepening our insights into the dynamical properties of quantum many-body systems, going far beyond what has been revealed through bipartite entanglement. Despite its long-recognized importance, a proper quantification of multipartite entanglement, along with the understanding of the "genuine multipartite entanglement" criterion, continues to pose substantial challenges. The work in this thesis reveals an unexpected connection between multipartite entanglement and the geometry of simplices. Specifically, we demonstrate that every three-qubit state can be associated with a triangle, with its area measuring the genuine tripartite entanglement within that state. Similarly, every four-qubit state can be associated with a tetrahedron, with its volume measuring the genuine quadripartite entanglement within that state. With these results, we embark on a geometric journey toward addressing the quantification problem of genuine multipartite entanglement, offering new perspectives on the complexity of even larger quantum many-body systems.
- Published
- 2024
161. Mathematical Connections Involved in Area Measurement Processes
- Author
-
S. Caviedes, G. De Gamboa, and E. Badillo
- Abstract
The present study seeks to explore the mathematical connections that 13-14-year-old secondary school students establish when solving area tasks. Emphasis is placed on different mathematical objects, and the connections between them, that allow students to successfully solve the tasks. The study follows a mixed methodology using qualitative and quantitative methods of analysis. The results show that representations play a key role in solving area tasks, as they condition the use of alternative procedures to the use of formulas. Likewise, the properties involved in area measurement processes may condition the use of geometric procedures, such as surface decomposition and reorganisation. Finally, results show that to accurately carry out area measurement processes, it is necessary to bring different mathematical objects into play simultaneously. If these connections between mathematical objects do not occur, there is a risk of using the formulas in a mechanical way.
- Published
- 2024
162. Pathways to Performance: The Experimental Impacts of Learning Trajectory-Oriented Formative Assessment in Mathematics
- Author
-
Jonathan A. Supovitz, Caroline B. Ebby, and Gregory Collins
- Abstract
Purpose: A growing trend in instructional improvement efforts is the use of formative assessment informed by research-based developmental trajectories of how students gain deeper understanding of subject matter content over time. This article reports the findings of a large-scale experimental study of an innovative mathematics professional development program called the Ongoing Assessment Project (OGAP) in a large urban school district that is based on the theory of learning trajectory-oriented formative assessment. Research Methods: This research employs a randomized controlled trial and structural equation modeling. Findings: The study provides causal evidence of the impact of OGAP on teachers' knowledge of student thinking and student learning outcomes. The results further demonstrate that OGAP professional development and ongoing supports significantly enhanced teachers' knowledge of formative assessment and improved grades 3-5 student outcomes on the Pennsylvania State high-stakes standardized tests in mathematics. Structural equation modeling of within-treatment variation demonstrates the relationship between teacher implementation and student performance, identifies both individual teacher characteristics and program supports related to implementation and performance, and raises additional areas for further investigation. Implications: Although formative assessment has long been viewed as a potentially powerful way to provide teachers with feedback on student understanding, the integration of learning trajectories helps address the "Now what?" question by providing teachers with more specific and actionable guidance to make more informed instructional responses. This study joins the small set of causal research on effective mathematics professional development programs that utilize learning trajectory-oriented formative assessment that point the way to promising pathways to improved student performance.
- Published
- 2024
163. Considerations for the Use of Plausible Values in Large-Scale Assessments
- Author
-
Paul A. Jewsbury, Yue Jia, and Eugenio J. Gonzalez
- Abstract
Large-scale assessments are rich sources of data that can inform a diverse range of research questions related to educational policy and practice. For this reason, datasets from large-scale assessments are available to enable secondary analysts to replicate and extend published reports of assessment results. These datasets include multiple imputed values for proficiency, known as "plausible values." Plausible values enable the analysis of achievement in large-scale assessment data with complete-case statistical methods such as t-tests implemented in readily-available statistical software. However, researchers are often challenged by the complex and unfamiliar nature of plausible values, large-scale assessments, and their datasets. Misunderstandings and misuses of plausible values may therefore arise. The aims of this paper are to explain what plausible values are, why plausible values are used in large-scale assessments, and how plausible values should be used in secondary analysis of the data. Also provided are answers to secondary researchers' frequently asked questions about the use of plausible values in analysis gathered by the authors during their experience advising secondary users of these databases.
- Published
- 2024
164. A Network Ethnography of International Large-Scale Assessment Contracting: Scientific Knowledge as Messy, Provisional, Complex, and Subjective
- Author
-
Chloe O'Connor and Camilla Addey
- Abstract
While methodologies are often presented as standardised procedures which, under specified conditions, should lead to the same conclusions, this case study presents the complex and deeply personal process of research (Addey and Piattoeva 2022). We analyse our application of network ethnography -- an approach presented by Ball (2016) -- to the study of contractors developing international large-scale assessments, exploring how we, as scholars, become with our methodology and navigate the 'messiness' of research (Law 2004). Drawing on Science and Technology Studies (STS) to understand the constitutive role of methodology and performativity of knowledge-making (Law and Singleton 2013, Rimpiläinen 2015), we show how methodological decisions construct what is studied and ourselves. Finally, we discuss challenges of visual representation, applying Galloway's (2011) 'conversion rules' to examine what was unrepresented -- or unrepresentable. This paper shows the complex, subjective, and provisional nature of knowledge, theorising 'heterogeneity and variation' (Law 2004) as an inherent part of methodological application.
- Published
- 2024
165. Essays on Statistics and Data Science Education
- Author
-
Emma Mary Klugman
- Abstract
Statistics & data science are growing, rapidly evolving, and increasingly important for an informed citizenry in a data-saturated world. In this dissertation, I address two central questions: (1) who is taking statistics? and (2) what are statistics courses teaching? I estimate that 920,000 US students take statistics in high school each year, but this population has not yet been well studied. Using a rich set of survey responses describing 15,727 students' demographics, career interests and values, STEM identity, grades, and test scores, my first study compares four groups of high-school course-takers: those who take statistics, calculus, both, and neither. I then employ latent profile analysis to shed light on who these students are, showing that students with different profiles take statistics at surprisingly similar rates: statistics is an important part of the academic pathway for a wide range of students and serves a demographically diverse population. In my second study, I build upon tools from natural language processing and psychometric measurement to develop a human-in-the-loop methodology for measuring latent constructs in large text corpora, and present a framework for doing so. I construct a lexicon-based instrument to measure the extent to which syllabi from college statistics and data science courses align with a vision for modernizing instruction set forth in the Guidelines for Assessment and Instruction in Statistics Education (GAISE) project and across 145 journal articles spanning almost a century. In so doing, I illustrate an approach that researchers can take in bringing measurement questions to text data, a method that I believe strikes a useful balance between interpretability, communicability, validity, and scalability. My final study applies these instruments to 32,483 syllabi from US statistics and data science courses taught between 2010 and 2018. I find a modest overall increase in modern approaches over this decade.
Finally, I explore differences between institution types using multilevel models, finding that private and four-year institutions, as well as those with higher admissions rates and Pell-recipient populations, have more modern syllabi, though two-year institutions and schools serving fewer Pell recipients seem to be gaining ground.
- Published
- 2024
166. From Gaze to Grades: How Multimedia Signaling Influences Attention and Learning Outcomes
- Author
-
Berj Akian
- Abstract
This dissertation investigates the problem of how variations in the intensity of three selected constructs--multimedia signaling, speed and pacing, and cognitive engagement prompts--affect attention and learning outcomes in online learning environments. The study explores the intersecting cognitive theories of cognitive load, higher order thinking skills, and Mayer's principles of multimedia learning, with specific focus on his signaling principle. Through the use of eye-tracking technology, the study measures focal attention, while immediate and delayed knowledge retention tests assess learning outcomes. Employing a robust experimental design, the research utilizes eye-tracking technology to directly measure focal attention, alongside both immediate and delayed knowledge retention tests to evaluate learning outcomes. The methodological framework modulates the intensity of selected constructs across low, optimal, and high conditions, enabling a comprehensive assessment of their impacts. The findings reveal statistically significant effects for multimedia richness and speed, indicating optimal levels that enhance learner engagement. This research concludes that carefully calibrated multimedia signals can substantially benefit online learning environments, offering educators and content creators actionable insights for designing more effective and engaging educational experiences.
- Published
- 2024
167. On New and Improved Measures for Item Analysis from Signal Detection Theory
- Author
-
Rachel Lee
- Abstract
Classical item analysis (CIA) entails summarizing items based on two key attributes: item difficulty and item discrimination, defined respectively as the proportion of examinees answering correctly and the difference in correctness between high and low scorers. Recent insights reveal a direct link between these measures and aspects of signal detection theory (SDT) in item analysis, offering modifications to traditional metrics and introducing new ones to identify problematic items (DeCarlo, 2023). The SDT approach involves extending Luce's choice model (1959) using a mixture framework, with mixing occurring within examinees rather than across them, reflecting varying latent knowledge states (know or don't know) across items. This implies a 'true' split (know/don't know) enabling straightforward discrimination and difficulty measures, lending theoretical support to the conventional item splitting approach. DeCarlo (2023) demonstrated improved measures and item screening using simple median splits, motivating this study to explore enhanced measures via refined splits. This study builds on these findings, refining CIA and SDT measures by integrating additional information like response time and item scores using latent class and cluster models. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone (800) 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
- Published
- 2024
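The two classical measures named in the abstract of entry 167 can be illustrated with a minimal sketch. It assumes dichotomous (0/1) item scores and a simple median split on total score, with examinees at or below the median placed in the low group; the function name, the tie-handling rule, and the toy data are illustrative assumptions, not taken from the dissertation.

```python
from statistics import median


def classical_item_analysis(responses):
    """Return (difficulty, discrimination) lists, one value per item.

    responses: list of examinee rows, each a list of 0/1 item scores.
    Difficulty is the proportion answering correctly; discrimination is the
    difference in proportion correct between high and low scorers.
    """
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    cut = median(totals)

    # Assumed tie rule: totals equal to the median fall in the low group.
    high = [row for row, t in zip(responses, totals) if t > cut]
    low = [row for row, t in zip(responses, totals) if t <= cut]
    if not high or not low:
        raise ValueError("median split produced an empty group")

    def proportion_correct(rows, item):
        return sum(row[item] for row in rows) / len(rows)

    difficulty = [proportion_correct(responses, j) for j in range(n_items)]
    discrimination = [
        proportion_correct(high, j) - proportion_correct(low, j)
        for j in range(n_items)
    ]
    return difficulty, discrimination


# Hypothetical data: six examinees, three items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 0, 0],
    [1, 0, 0],
]
difficulty, discrimination = classical_item_analysis(responses)
print(difficulty)
print(discrimination)
```

In this toy data the first item is answered correctly by most examinees in both groups, so it separates high from low scorers less sharply than the other two items.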
168. Schools, Data and Teachers' Learning: Insights of an Experienced Educator
- Author
-
Ian Hardy
- Abstract
Schooling in Australia has become subject to increased processes of data-based governance. This article draws upon the insights of an experienced teacher, 'Meriam', who, having taught more than 34 years over almost a 50-year span, reflected upon the nature of such changes. Utilising theorising in relation to datafication processes and accountability logics, the article elaborates Meriam's efforts to engage productively with increased attention to measuring and monitoring students' performance via various modes of data. This is in the context of increased pressure in the state of Queensland to justify teachers' practices, and to show how their students' learning had improved on these data over short periods of time. In this sense, the article challenges a more post-performative conception of professionalism as Meriam struggled to comprehend how the increased attention to a plethora of data could serve as a useful vehicle to inform teachers' work and learning; this was the case even as she found some benefits in increased attention to her own practice and learning. The article delineates the constraints associated with increased attention to data, and how current managerial and neoliberal policy conditions in schools may contribute to potentially significant and harmful effects for both teacher and student learning.
- Published
- 2024
169. How Building Knowledge Boosts Literacy and Learning: First Causal Study Finds Outsized Impacts at 'Core Knowledge' Schools
- Author
-
David Grissmer, Mark Berends, Daniel T. Willingham, Chelsea A. K. Duran, William M. Murrah, Tanya Evans, Chris S. Hulleman, Jamie Decoster, Thomas G. White, and Richard Buddin
- Abstract
Educators and researchers have been fighting the reading wars for the last century, with battles see-sawing literacy instruction in American schools from phonics to whole language and, most recently, back to phonics again. Over the last decade, 32 states and the District of Columbia have adopted new "science of reading" laws that require schools to use curricula and instructional techniques that are deemed "evidence-based." Such reading programs include direct instruction in phonics and reading comprehension skills, such as finding the main idea of a paragraph, and efforts to accelerate learning tend to double down on more of the same skill-building practice. The authors conduct the first-ever experimental study of this topic, based on randomized kindergarten-enrollment lotteries in nine Colorado charter schools that use an interdisciplinary knowledge-based curriculum called Core Knowledge.
- Published
- 2024
170. Knowledge Utilisation Analysis: Measuring the Utilisation of Knowledge Sources in Policy Decisions
- Author
-
Jonas Videbaek Jørgensen
- Abstract
Background: Understanding knowledge utilisation in policymaking is a core task for the social and political sciences. However, limitations and biases abound in commonplace approaches to measuring such use. Consequently, we have little systematic evidence of the extent to which knowledge sources are used in policy decisions. Aims and objectives: This article discusses existing approaches to studying knowledge utilisation and introduces the analytical approach, Knowledge Utilisation Analysis (KUA), which harnesses the growing quantities of documents available online. Methods: KUA offers a four-step procedure that enables researchers to systematically compare policy documents with knowledge sources and measure the degree to which policy decisions follow or contradict relevant knowledge. Findings: The article showcases KUA in a study of Danish primary education and active labour market policies from 2016 to 2021. By analysing 1,159 documents, KUA is leveraged to study levels of knowledge utilisation across policy areas, research methods, and provider types. Discussion and conclusion: KUA contributes methodological innovation to measuring knowledge utilisation by systematically matching knowledge sources with policy decisions. KUA can, thereby, enhance empirical research on the relationship between knowledge and policy.
- Published
- 2024
171. Laboratory Learning Objectives: Ranking Objectives across the Cognitive, Psychomotor and Affective Domains within Engineering
- Author
-
Sasha Nikolic, Thomas F. Suesse, Sarah Grundy, Rezwanul Haque, Sarah Lyden, Ghulam M. Hassan, Scott Daniel, Marina Belkina, and Sulakshana Lal
- Abstract
The literature on laboratory objectives in engineering education research is scattered and inconsistent. Systematic literature reviews identified the need for better understanding. This paper ranks the laboratory learning objectives across the cognitive, psychomotor and affective domains to improve scaffolding. It provides an opportunity for reflection, a pathway to confirm assessment alignment, and opens future research areas. To accomplish this, the Laboratory Learning Objectives Measurement (LLOM) instrument is used to survey 160 academics from around the world representing 18 engineering disciplines. The results suggest that the collective ranking order does represent a framework that can be used broadly. However, for greater alignment with consensus thinking, discipline rankings should be used. The cognitive domain was deemed the most important. These results provide the community's opinion and may not necessarily be best practice, providing an opportunity for reflection.
- Published
- 2024
172. Investigating Volume Estimation Performance and Strategies of 6th-Grade Children and Adults
- Author
-
Despina Desli and Panagiotis Dimitropoulos
- Abstract
The present study aimed to examine children's and adults' performance and strategies in volume estimations. Three tasks were designed and presented to 40 adults and 40 6th grade children who were asked to estimate: (a) the number of unit measures (bricks/rice spoons) that may be put in a container, (b) the quantity of bricks/rice, between two different quantities of bricks/rice, that could fit in a container, and (c) the container, between two containers, that may hold a given quantity. Participants were -- randomly and in equal numbers from each age group -- assigned to four groups differing in their opportunity for trial condition and the presence of the half reference point in the containers. Although the majority of participants showed quite low performance rates in estimating volume, their estimations were more successful when the trial condition was available to them as well as when the trials referred to small containers and discrete quantities. The mental counting strategy was mainly used by both age groups. Adults, however, often used the algorithm and processed precise volume measurement. These results are further discussed in relation to educational implications as well as to the prospect of supporting children's estimation abilities early in primary school.
- Published
- 2024
173. Early Mathematical Performance of Deaf and Hard of Hearing Toddlers in Family-Centred Early Intervention Programmes
- Author
-
Loes Wauters, Claudia M. Pagliaro, Karen L. Kritzer, and Evelien Dirks
- Abstract
Research indicates that establishing a strong foundation in early mathematics is essential for later academic learning. Previous research with students who are deaf or hard of hearing (DHH) has shown varying differences in performance and achievement when compared to typically hearing (TH) students. While the majority of research in this area has been conducted in the United States, studies in other countries suggest that these differences may be global. The present study investigated the early mathematics abilities of 3-year-old DHH children enrolled in family-centred early intervention in the Netherlands. Fifty-three DHH and TH children were given an adapted version of the Early Mathematics Performance Diagnostic. Results showed that on average, the DHH and the TH children performed similarly on all domains, except for Measurement. Likewise, both groups showed similar mathematical knowledge in most early mathematics tasks measuring sub-concepts such as counting objects, shape matching, or measuring weight. Differences were identified in some basic tasks measuring the sub-concepts (e.g. rote counting, measuring time, solving puzzles), but not in the more advanced tasks measuring these same sub-concepts. These findings are important for parents, teachers, and early interventionists.
- Published
- 2024
174. The Effect of the Mathematics Bag Early Education Program
- Author
-
Abdülbaki Ergel and Yasemin Aydogan
- Abstract
In this study, the effect of the Mathematics Bag Early Education Program (MAÇEP) on the mathematics skills (number/counting, geometry, measurement) of 57-69-month-old preschool children was investigated. A quasi-experimental design with a pretest, posttest, follow-up test, and control group was used in the study. The study group consisted of 22 children attending preschool education and their parents. In the study, MAÇEP was applied to the experimental group in the form of 50 activities for 12 weeks outside the preschool education program. Data were collected using the Early Mathematics Test (EMAT) and Parent Focus Group Interview Form. Mann Whitney U Test, Wilcoxon Signed Rank Test, Friedman Test and content analysis were used to analyze the data. At the end of the study, it was determined that MAÇEP effectively improved the mathematics skills (number/counting, geometry, measurement) of 57-69-month-old children in the experimental group and the retention continued after the experimental period.
- Published
- 2024
175. Item Response Theory: A Modern Measurement Approach to Reliability and Precision for Counseling Researchers
- Author
-
Ryan M. Cook and Stefanie A. Wind
- Abstract
The purpose of this article is to discuss reliability and precision through the lens of a modern measurement approach, item response theory (IRT). Reliability evidence in the field of counseling is primarily generated using Classical Test Theory (CTT) approaches, although recent studies in the field of counseling have shown the benefits of using IRT approaches to explore measurement precision. We discuss the theoretical foundations and assumptions of CTT and IRT, and examine how modern measurement theory (i.e. IRT) poses advantages for capturing measurement precision. We use an example analysis to demonstrate the indices of measurement precision in IRT approaches (e.g. standard errors, person-fit, item-fit, targeting, residual-based fit statistics). Finally, we discuss practical and clinical implications for counseling researchers for using IRT approaches to measure precision, including insights into precision for persons, items, and measurement invariance as well as the utility of brief and adaptive scales.
- Published
- 2024
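One of the precision indices mentioned in the abstract of entry 175, the standard error of a person estimate, can be sketched under a specific assumption. The abstract does not say which IRT model the authors use; the sketch below assumes the dichotomous Rasch model, where the test information at ability theta is the sum of p(1 - p) over items and the standard error is its inverse square root. The function names and item difficulties are hypothetical.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct/endorsed response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def standard_error(theta, difficulties):
    """Conditional standard error of theta: 1 / sqrt(test information)."""
    information = 0.0
    for b in difficulties:
        p = rasch_probability(theta, b)
        information += p * (1.0 - p)  # item information under the Rasch model
    return 1.0 / math.sqrt(information)


# Hypothetical three-item scale with difficulties centred on 0 logits.
difficulties = [-1.0, 0.0, 1.0]
se_on_target = standard_error(0.0, difficulties)
se_off_target = standard_error(3.0, difficulties)
print(se_on_target, se_off_target)
```

Unlike a single reliability coefficient, this standard error varies by person: it is smaller where item difficulties are well targeted to the person's location and larger for persons far from the items.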
176. Measuring Innovation: Perspectives from Engineering Education and Clean Energy
- Author
-
Shruti Misra
- Abstract
Measuring innovation is key to realizing innovation in practice. One of the primary reasons why measurement is important stems from the fundamental principle that what is measured is, in turn, what garners attention and action. Systematic measurement of innovation can enable researchers and practitioners to propose and undertake strategic interventions that enable effective action and use of resources. However, a large proportion of innovation measures developed in the literature cater to researchers, economists and analysts who are removed from the innovation process. These measures do not address the needs of the practitioners and actors who are actively involved in innovative activities. Therefore, the foundational question that this dissertation seeks to answer is: How can innovation be measured in a way that bridges the study of the innovation process with its practice? To answer this research question, I weave together three studies where I produced contributions to diverse domains. I leverage the robustness of the technological innovation system (TIS) framework to ground my research. The first study looked at industry sponsored engineering design capstones as a "laboratory" for small-scale innovation. Through this study the first framework of measurement emerged, one that categorized measures of innovation as evaluative and actor-centric. The second study took a broader approach to investigate perceptions of innovation measures across diverse actors in an innovation system. This study resulted in the second measurement framework that placed measures on an availability-influence framework that balanced the availability of different measures with their decision-making influence. The final study validated this framework by applying it to assess the long duration energy storage (LDES) innovation system.
The study not only yielded strategic interventions that could enhance LDES innovation, but also validated the process of using the frameworks that emerged in previous studies to assess innovation systems. Together, these three studies yielded two novel frameworks that are born out of studying innovation and the practices of innovation in diverse, interdisciplinary settings. This dissertation provides an expansive, cross-disciplinary view of how measures that evaluate the process of innovation can be combined with measures that encompass the diverse practices of innovation. Through this work, I argue that combining the study of the system of innovation with the practice of innovation can help bridge the gap between both realms and can ultimately enhance both. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by Telephone (800) 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
- Published
- 2024
177. ChatGPT and the Course Vulnerability Index
- Author
-
Nodir Adilov, Jeffrey W. Cline, Hui Hanke, Kent Kauffman, Lisa Meneau, Elva Resendez, Shubham Singh, Mike Slaubaugh, and Nichaya Suntornpithug
- Abstract
This article develops an index to measure the level of susceptibility of courses to cheating using ChatGPT (Chat Generative Pre-trained Transformer), an advanced text-based artificial intelligence (AI) language model. It demonstrates the application of the index to a sample of business courses in a mid-sized university. The study finds that the vulnerability index varies across disciplines and teaching modalities. As advanced language models become more common in academic settings and create new educational challenges, the study provides an intuitive and practical mechanism for instructors and academic units to measure and assess the vulnerability of their courses to various language-based predictive models.
- Published
- 2024
178. A Systematic Review of Academic Resilience in East Asia: Evidence from the Large-Scale Assessment Research
- Author
-
Jia-qi Zheng, Kwok-cheung Cheung, and Pou-seong Sit
- Abstract
Although academic resilience is of great concern to contemporary educational practitioners, there is no consensus on its measurement. Furthermore, protective factors characterized by East Asian societal contexts remain ambiguous. This systematic review aims to offer an overview of the operational definitions, statistical methodology, and protective factors of academic resilience identified in East Asian countries/economies. With a focus on large-scale assessment (LSA) research, three databases (i.e., Web of Science, CNKI, and Airiti Library) were searched and returned 31 peer-reviewed studies over the last decade. Results indicated that the definition-driven method was commonly adopted in international LSA studies (e.g., Programme for the International Student Assessment) to measure academic resilience, and the research conducted in national/regional LSA (e.g., China Education Panel Survey) tended to use the process-driven approach. Logistic regression was the most frequently utilized data analysis technique in the definition-driven approach, while structural equation modeling and mediation/moderation analyses accounted for the largest proportion of the process-driven methods. Our study sheds light on the methodological issues of academic resilience in LSA. Additionally, it highlights the aspiration of educational researchers to identify Asian-specific protective factors from the social-ecological perspective to propose appropriate interventions fostering academic resilience.
- Published
- 2024
179. A Bayesian Workflow for the Analysis and Reporting of International Large-Scale Assessments: A Case Study Using the OECD Teaching and Learning International Survey
- Author
-
David Kaplan and Kjorte Harra
- Abstract
This paper aims to showcase the value of implementing a Bayesian framework to analyze and report results from international large-scale assessments (ILSAs) and provide guidance to users who want to analyze ILSA data using this approach. The motivation for this paper stems from the recognition that Bayesian statistical inference is fast becoming a popular methodological framework for the analysis of educational data generally, and large-scale assessments more specifically. The paper argues that Bayesian statistical methods can provide a more nuanced analysis of results of policy relevance compared to standard frequentist approaches commonly found in large-scale assessment reports. The data utilized for this paper come from the Teaching and Learning International Survey (TALIS). The paper provides steps in implementing a Bayesian analysis and proposes a workflow that can be applied not only to TALIS but to large-scale assessments in general. The paper closes with a discussion of other Bayesian approaches to international large-scale assessment data, in particular for predictive modeling.
- Published
- 2024
180. Which Way Does Time Go? Differences in Expert and Novice Representations of Temporal Information at Extreme Scales Interferes with Novice Understanding of Graphs
- Author
-
Ilyse Resnick, Elizabeth Louise Chapman, and Thomas F. Shipley
- Abstract
Visual representations of data are widely used for communication and understanding, particularly in science, technology, engineering, and mathematics (STEM). However, despite their importance, many people have difficulty understanding data-based visualizations. This work presents a series of three studies that examine how understanding of time-based Earth-science data visualizations is influenced by scale and the different directions in which time can be represented (e.g., the Geologic Time Scale represents time moving from bottom-to-top, whereas many calendars represent time moving left-to-right). In Study 1, 316 visualizations from two top scholarly geoscience journals were analyzed for how time was represented. These expert-made graphs represented time in a range of ways, with smaller timescales more likely to be represented as moving left-to-right and larger scales more likely to be represented in other directions. In Study 2, 47 STEM novices were recruited from an undergraduate psychology experiment pool and asked to construct four separate graphs representing change over two scales of time (Earth's history or a single day) and two phenomena (temperature or sea level). Novices overwhelmingly represented time moving from left-to-right, regardless of scale. In Study 3, 40 STEM novices were shown expert-made graphs where the direction of time varied. Novices had difficulty interpreting the expert-made graphs when time was represented moving in directions other than left-to-right. The study highlights the importance of considering representations of time and scale in STEM education and offers insights into how experts and novices approach visualizations. The findings inform the development of educational resources and strategies to improve students' understanding of scientific concepts where time and space are intrinsically related.
- Published
- 2024
181. Evaluating Model Fit of Measurement Models in Confirmatory Factor Analysis
- Author
-
David Goretzko, Karik Siemund, and Philipp Sterner
- Abstract
Confirmatory factor analyses (CFA) are often used in psychological research when developing measurement models for psychological constructs. Evaluating CFA model fit can be quite challenging, as tests for exact model fit may focus on negligible deviances, while fit indices cannot be interpreted absolutely without specifying thresholds or cutoffs. In this study, we review how model fit in CFA is evaluated in psychological research using fit indices and compare the reported values with established cutoff rules. For this, we collected data on all CFA models in "Psychological Assessment" from the years 2015 to 2020 ($N_{\text{Studies}} = 221$). In addition, we reevaluate model fit with newly developed methods that derive fit index cutoffs that are tailored to the respective measurement model and the data characteristics at hand. The results of our review indicate that the model fit in many studies must be viewed critically, especially with regard to the usually imposed independent clusters constraints. In addition, many studies do not fully report all results that are necessary to reevaluate model fit. We discuss these findings against new developments in model fit evaluation and methods for specification search.
- Published
- 2024
182. Purposeful and Purposeless Aging: Structural Issues for Sense of Purpose and Their Implications for Predicting Life Outcomes
- Author
-
Gabrielle N. Pfund, Gabriel Olaru, Mathias Allemand, and Patrick L. Hill
- Abstract
Despite the value of sense of purpose during older adulthood, this construct often declines with age. With some older adults reconsidering the relevance of purpose later in life, the measurement of purpose may suffer from variance issues with age. The current study investigated whether sense of purpose functions similarly across ages and evaluated if the predictive power of purpose on mental, physical, cognitive, and financial outcomes changes when accounting for a less age-affected measurement structure. Utilizing data from two nationwide panel studies (Health and Retirement Study: n = 14,481; Midlife in the United States: n = 4,030), the current study conducted local structural equation modeling and found two factors for the positively and negatively valenced purpose items in the Purpose in Life subscale (Ryff, 1989), deemed the purposeful and purposeless factor. These factors become less associated with each other at higher ages. When reproducing past findings with this two-factor structure, the current study found that the purposeful and purposeless factors predicted these outcomes in the same direction as would be suggested by past research, but the magnitude of these effects differed for some outcomes. The discussion focuses on the implications of what this means for our understanding of sense of purpose across the lifespan.
- Published
- 2024
183. The Interoceptive Sensibility in Middle Childhood: The Italian Validation of the Self-Awareness Questionnaire
- Author
-
Simona Raimo, Teresa Iona, Antonella Di Vita, Maddalena Boccia, Valentina Torchia, Silvia Canino, Mariachiara Gaita, Maria Cropano, and Liana Palermo
- Abstract
Interoception refers to the processing of internal bodily states and plays a critical role in motivational processes and behaviours. Instruments for assessing its subjective components (i.e., interoceptive sensibility) during childhood are crucially needed. Thus, in this study, we adapted and evaluated the psychometric properties of the Self-Awareness Questionnaire (SAQ) in a sample of Italian children. The SAQ language was revised, and 239 children, ranging in age from 7 to 13 years, completed the adapted questionnaire (SAQ-C). A two-factor solution (visceral and somatosensorial domains) with 24 items showed the best fit of the data. The reliability was good. Convergent validity was fair, and the results indicated good discriminant validity. Moreover, correlational and regression analyses showed that interoceptive sensibility, particularly that associated with visceral sensations, increased with age. Current findings provide preliminary evidence that the SAQ-C may be a reliable and valid tool for assessing interoceptive sensibility in middle childhood.
- Published
- 2024
184. Elementary Students' Fraction Reasoning: A Measurement Approach to Fractions in a Dynamic Environment
- Author
-
Sheunghyun Yeo and Corey Webel
- Abstract
In this study, we examine students' mathematical reasoning within a technological environment designed to support understanding of relationships between quantities with adjustable measuring units. In particular, we provide a cross-sectional snapshot of how 30 elementary students (Grades 3-5) engaged in a series of fraction-as-measurement tasks using a "Dynamic Ruler" that could continuously dilate unit sizes. Screencast recordings were collected from a task-based clinical interview and analyzed to investigate children's mathematical actions and mathematical ideas. Students' reasoning patterns were characterized using four distinct types (low attending, holistic estimating, determining, and commeasuring) based on their solution strategies. Findings suggest that the Dynamic Ruler tool can elicit rich conceptions of fractions and even prompt novel approaches such as commeasurement. We conclude by drawing insights into how elementary students might use dynamic technology meaningfully.
- Published
- 2024
185. Finding the Right Grain-Size for Measurement in the Classroom
- Author
-
Mark Wilson
- Abstract
This article introduces a new framework for articulating how educational assessments can be related to teacher uses in the classroom. It articulates three levels of assessment: macro (use of standardized tests), meso (externally developed items), and micro (on-the-fly in the classroom). The first level is the usual context for educational measurement, but one of the contributions of this article is that it mainly focuses on the latter two levels. Co-ordination of the content across these two levels can be achieved using the concept of a construct map, which articulates the substantive target property at levels of detail that are appropriate for both teacher planning and within-classroom use. This article then describes a statistical model designed to span these two levels and discusses how best to relate this to the macrolevel. Results from a curriculum and instruction development project on the topic of measurement in the elementary school are demonstrated, showing how they are empirically related.
- Published
- 2024
186. Measurement Invariance of the Group Climate Inventory across Adults with and without Mild Intellectual Disability in Secure Residential Facilities
- Author
-
Turhan, Abdullah, Roest, Jesse J., Delforterie, Monique J., Van der Helm, G. H. Peer, Neimeijer, Elien G., and Didden, Robert
- Abstract
Background: The Group Climate Inventory (GCI) was tested for measurement invariance across 332 adults with and 225 adults without mild intellectual disabilities in Dutch forensic treatment, and for latent mean differences on its "Support", "Growth", "Repression", and "Atmosphere" subscales. Method: Multigroup confirmatory factor analysis was used to evaluate the configural, threshold, and loading and threshold invariance of the GCI across both groups, and to compare group latent means on each subscale. Results: Measurement invariance was found across groups. Latent mean group comparisons showed small but significant differences reflected in lower scores on Support and Atmosphere in the group with mild intellectual disabilities. Conclusion: The GCI allows meaningful comparisons between clients with and without mild intellectual disabilities in secure facilities. Results from the between-group comparisons suggest that consideration should be given as to whether, and why, the support and atmosphere perceptions of clients with mild intellectual disabilities might be less good.
- Published
- 2024
187. Estimation of Individuals' Collaborative Problem Solving Ability in Computer-Based Assessment
- Author
-
Meijuan Li, Hongyun Liu, Mengfei Cai, and Jianlin Yuan
- Abstract
In the human-to-human Collaborative Problem Solving (CPS) test, students' problem-solving process reflects the interdependency among partners. The high interdependency in CPS makes it very sensitive to group composition. For example, the group outcome might be driven by a highly competent group member, so it does not reflect all the individual performances, especially for a low-ability member. As a result, how to effectively assess individuals' performances has become a challenging issue in educational measurement. This research aims to construct a measurement model to estimate an individual's collaborative problem-solving ability and correct for the impact of partners' abilities. First, 175 dyads of eighth graders were divided into six cooperative groups with different combinations of problem-solving (PS) ability levels (i.e., high-high, high-medium, high-low, medium-medium, medium-low, and low-low). Then, they participated in a test of three CPS tasks, and the log data of the dyads were recorded. We applied Multidimensional Item Response Theory (MIRT) measurement models to estimate an individual's CPS ability and proposed a mean correction method to correct for the impact of group composition on individual ability. Results show that (1) the multidimensional IRT model fits the data better than the multidimensional IRT model with the testlet effect; (2) the mean correction method significantly reduced the impact of group composition on obtained individual ability. This study not only successfully increased the validity of individuals' CPS ability measurement but also provided useful guidelines in educational settings to enhance individuals' CPS ability and promote an individualized learning environment.
- Published
- 2024
- Full Text
- View/download PDF
188. Identification of Important Factors When Measuring School Climate: Latent Construct Validation and Exploration
- Author
-
Siebert, Carl F., Holloway, Stefanie D., DuBois, David L., Bavarian, Niloofar, Lewis, Kendra M., and Flay, Brian
- Abstract
Background: Researchers regularly must decide what information is necessary to understand school climate and how to include climate in a study. For example, which factors and/or scales should be used, is using just 1 scale for school climate sufficient, and to what extent does the selection of a single scale influence the research findings? Aims: Understanding what factors to consider and which available scales to review will assist those interested in measuring school climate. Methods: This study explores 8 validated scales related to school climate. Data used are from a previous study (Social and Character Development cooperative agreement funded by IES #R305L030072 and #R305A080253) that looked at Positive Action, a social emotional and character development program for elementary-, middle-, and high-school students. Results and Conclusion: Scale correlations and factor analyses show how these scales work together to measure overall middle school climate.
- Published
- 2024
- Full Text
- View/download PDF
189. Emergent Learning about Measurement and Uncertainty in an Inquiry Context: A Case from an Elementary Classroom
- Author
-
Tang, Xiaowei, Shu, Gang, Wei, Bing, and Levin, Daniel
- Abstract
Students often learn about measurement uncertainty as an isolated topic, with a focus on generalizable strategies to manage uncertainty in scientific investigation. In this study, we report and analyze a case of emergent learning about measurement and uncertainty, in which students in a Chinese elementary school science class explored and reconciled discrepancies in hypotheses by constructing and using measures and making inferences. Adopting a model-based view of measurement, we show that when allowed to take on emergent measurement uncertainty while inquiring into causes for phenomena, late elementary students with no prior experience can engage in sophisticated reasoning characterized by a variety of theoretical modeling practices. Developing and aligning models of the phenomenon, the measure and the measurement data supported students in constructing an intuitive solution to their discrepancies. Our analysis also identified (1) a pattern of thinking and some common assumptions students adopted in their modeling practice that were productive, and (2) contextual elements affording and constraining emergent learning on measurement and uncertainty. In our discussion, we reflect on the educational potential of adopting a model-based account of measurement and of treating measurement and uncertainty as integrated into investigative practice. We also discuss the necessary contexts for realizing the potential of the model-based account.
- Published
- 2024
- Full Text
- View/download PDF
190. Using Explicit Instruction and Virtual Manipulatives to Teach Measurement Concepts for Students with Autism Spectrum Disorder
- Author
-
Di Liu, Catharine Lory, Qingli Lei, Weiwei Cai, Yiwen Mao, and Xuan Yang
- Abstract
Measurement concepts are an essential foundation for more advanced mathematical concepts. To address the challenges that students with autism spectrum disorder (ASD) face in learning measurement concepts, this study investigated the effects of using a combination of explicit instruction and virtual manipulatives (VMs) to teach measurement concepts to students with ASD in China. Using a single-case multiple-probe across skills design, researchers examined whether the intervention could support the acquisition and maintenance of measurement concepts in students with ASD. Based on visual analysis, a functional relation was found between the independent variable (i.e., explicit instruction with VMs) and student performance on solving measurement concept problems. Implications for practice and research are discussed.
- Published
- 2024
- Full Text
- View/download PDF
191. Combining Machine Translation and Automated Scoring in International Large-Scale Assessments
- Author
-
Ji Yoon Jung, Lillian Tyack, and Matthias von Davier
- Abstract
Background: Artificial intelligence (AI) is rapidly changing communication and technology-driven content creation and is also being used more frequently in education. Despite these advancements, AI-powered automated scoring in international large-scale assessments (ILSAs) remains largely unexplored due to the scoring challenges associated with processing large amounts of multilingual responses. However, due to their low-stakes nature, ILSAs are an ideal ground for innovations and exploring new methodologies. Methods: This study proposes combining state-of-the-art machine translations (i.e., Google Translate & ChatGPT) and artificial neural networks (ANNs) to mitigate two key concerns of human scoring: inconsistency and high expense. We applied AI-based automated scoring to multilingual student responses from eight countries and six different languages, using six constructed response items from TIMSS 2019. Results: Automated scoring displayed comparable performance to human scoring, especially when the ANNs were trained and tested on ChatGPT-translated responses. Furthermore, psychometric characteristics derived from machine scores were generally similar to those obtained from human scores. These results can be considered supportive evidence for the validity of automated scoring for survey assessments. Conclusions: This study highlights that automated scoring integrated with recent machine translation holds great promise for consistent and resource-efficient scoring in ILSAs.
- Published
- 2024
- Full Text
- View/download PDF
192. From Global Domains to Physical Activity Environments: Development and Initial Validation of a Questionnaire-Based Physical Literacy Measure Designed for Large-Scale Population Surveys
- Author
-
Peter Elsborg, Paulina S. Melby, Mette Kurtzhals, Helene Kirkegaard, Johannes Carl, Steffen Rask, Peter Bentsen, and Glen Nielsen
- Abstract
This study aimed to develop and test MyPL, a questionnaire that measures self-reported physical literacy (PL) among children and adolescents. First, the item pool was developed and adapted, and face validity was tested with cognitive interviewing. Next, factor structures were identified through multidimensional scaling and exploratory factor analyses in a sample of 951 children (ages 7-13). The identified models were then tested using confirmatory factor analyses (CFA) in a sample of 2861 children (ages 7-12) and a sample of 1518 children (ages 13-15). Finally, measurement invariance and predictive validity were investigated. CFA showed that the identified physical activity (PA) environment-based models fitted the data better than the domain-derived model. Reliability analyses showed that the internal consistency of the total PL scale was good and that the reliability of the identified scales of the two models based on PA environments, except the cognitive scale, was satisfactory. MyPL also showed measurement invariance across gender. This study suggests that the type of PA and the environment in which PA occurs are important to consider when designing PL measurement tools, and indicates that MyPL can be used to measure children and young people's PL in large surveys.
- Published
- 2024
- Full Text
- View/download PDF
193. Growing Intercultural Speakers in Novice Italian: A Virtual Versus Face-to-Face Comparison
- Author
-
Aletha Stahl, Tatjana Babic Williams, Lan Jin, and Jane Koch
- Abstract
Intercultural competence (IC) has been identified as a crucial outcome of world language education. The purpose of the study is to compare possible differences in IC development between face-to-face and asynchronous virtual modes of delivery that were taken as emergency measures early in the COVID-19 pandemic for a beginning Italian course with 18- to 22-year-old students. The Association of American Colleges & Universities Intercultural Knowledge and Competence Valid Assessment of Learning in Undergraduate Education Rubric serves as a theoretical framework to determine learning outcomes and guide qualitative assessment. Applying a mixed-methods approach, the study collects quantitative data using the Intercultural Knowledge and Competence Short Scale and qualitative data from student reflection assignments for both face-to-face (2019) and asynchronous virtual (2020) courses. A comparison of IC development between the two cohorts shows similar achievement of intercultural learning in both modes. Implications for IC development in language classrooms are discussed.
- Published
- 2024
- Full Text
- View/download PDF
194. Pupil Dilation as an Index of Examinee's Cognitive Load in Answering a Mathematics Question: A Comparison Study of Different Approaches
- Author
-
Tzu-Hua Wang and Chien-Hui Kao
- Abstract
Studies have demonstrated that task-evoked pupillary responses (TEPRs) can be adopted to measure the examinee's cognitive load. This study compared three approaches to measuring TEPRs (mean pupil diameter, mean pupil dilation, and mean percentage of pupil dilation) to determine the best-fitting method. The valid participants of this study were eight sixth-grade elementary students. The experimental materials were two mathematics questions of differing difficulty. The generalized estimating equation (GEE) was employed to compare the goodness of fit of each approach. The results revealed that the measurement of TEPRs based on the mean percentage of pupil dilation measured every 4 s provided the best fit.
- Published
- 2024
- Full Text
- View/download PDF
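The three TEPR summaries named in the abstract of item 194 can be illustrated with a minimal Python sketch. It assumes the commonly used definitions (dilation as task-window mean minus a pre-task baseline mean, and percentage dilation as that difference relative to the baseline); the abstract does not state the authors' exact formulas, and the function name `tepr_measures` and the sample values are hypothetical.

```python
from statistics import mean


def tepr_measures(baseline_mm, task_mm):
    """Summarize one trial's pupil samples (diameters in millimetres).

    Assumed definitions, not taken from the study itself:
      mean diameter  = mean of the task-window samples
      mean dilation  = mean diameter minus the baseline mean
      pct dilation   = mean dilation as a percentage of the baseline mean
    """
    if not baseline_mm or not task_mm:
        raise ValueError("both windows need at least one sample")
    baseline = mean(baseline_mm)
    mean_diameter = mean(task_mm)
    mean_dilation = mean_diameter - baseline
    pct_dilation = 100.0 * mean_dilation / baseline
    return {
        "mean_diameter": mean_diameter,
        "mean_dilation": mean_dilation,
        "pct_dilation": pct_dilation,
    }


# Hypothetical samples: a 4.0 mm baseline and a task window averaging 5.0 mm.
result = tepr_measures([4.0, 4.0], [4.5, 5.0, 5.5])
print(result)
```

The percentage form divides out each examinee's baseline pupil size, which is one plausible reason it may compare better across participants than raw diameter.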
195. Confirmatory Factor Analysis in Kinesiology Journals with Explicit Measurement Focus
- Author
-
Christine E. Pacewicz, Christopher R. Hill, Haeyong Chun, and Nicholas D. Myers
- Abstract
Confirmatory factor analysis (CFA) is a commonly used statistical technique. Recommendations for evaluating CFA highlight that scholars should outline the expected model, conduct data screening, report model estimation and evaluation, and report key information about results to provide evidence for latent variables. The purpose of the current study was to review and evaluate the use of CFA in research published from 2013 to 2022 in the journal "Measurement in Physical Education and Exercise Science" (MPEES), the most measurement-focused journal in kinesiology, in our view. Results were cross-checked by examining research published in "Research Quarterly for Exercise and Sport" during the same time period. Strengths included providing information about the expected model, evaluating the final model, and reporting parameter estimates. Areas for improvement included conducting data screening and evaluating factor quality. Finally, recommendations for scholars to improve reporting of CFA in kinesiology are provided.
- Published
- 2024
- Full Text
- View/download PDF
196. Using Plausible Values When Fitting Multilevel Models with Large-Scale Assessment Data Using R
- Author
-
Francis L. Huang
- Abstract
The use of large-scale assessments (LSAs) in education has grown in the past decade, though analysis of LSA data with multilevel models (MLMs) in R has been limited. One reason may be the complexity of incorporating both plausible values and weighted analyses in the multilevel analyses of LSA data. We provide additional functions in R that extend the functionality of the WeMix (Bailey et al., 2023) package to allow for the automatic pooling of plausible values. In addition, functions for model comparisons using plausible values and the ability to export output to different formats (e.g., Word, html) are also provided.
- Published
- 2024
- Full Text
- View/download PDF
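Pooling across plausible values, as mentioned in the abstract of item 196, is commonly done with Rubin's combining rules. The Python sketch below shows those rules only; it is not the author's R functions or the WeMix API, and the function name `pool_plausible_values` and the input numbers are hypothetical. It assumes the same model has already been fitted once per plausible value, yielding one coefficient estimate and one standard error per fit.

```python
from math import sqrt
from statistics import mean, variance


def pool_plausible_values(estimates, std_errors):
    """Combine per-plausible-value results with Rubin's rules.

    estimates  : one coefficient estimate per plausible value
    std_errors : the matching standard errors
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching lists with at least two entries")
    pooled = mean(estimates)                      # average point estimate
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                 # variance across plausible values (m - 1 denominator)
    total = within + (1 + 1 / m) * between        # Rubin's total variance
    return pooled, sqrt(total)


# Hypothetical results from three fits, one per plausible value.
pooled_est, pooled_se = pool_plausible_values([0.5, 0.6, 0.7], [0.1, 0.1, 0.1])
print(pooled_est, pooled_se)
```

The pooled standard error is larger than any single-fit standard error here because the between-value variance captures measurement uncertainty in the proficiency scores that one fit alone would ignore. Survey weights, which the abstract also mentions, are outside the scope of this sketch.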
197. The Limits of Inference: Reassessing Causality in International Assessments
- Author
-
David Rutkowski, Leslie Rutkowski, Greg Thompson, and Yusuf Canbolat
- Abstract
This paper scrutinizes the increasing trend of using international large-scale assessment (ILSA) data for causal inferences in educational research, arguing that such inferences are often tenuous. We explore the complexities of causality within ILSAs, highlighting the methodological constraints that challenge the validity of causal claims derived from these datasets. The analysis begins with an overview of causality in relation to ILSAs, followed by an examination of randomized control trials and quasi-experimental designs. We juxtapose two quasi-experimental studies demonstrating potential against three studies using ILSA data, revealing significant limitations in causal inference. The discussion addresses the ethical and epistemological challenges in applying quasi-experimental designs to ILSAs, emphasizing the difficulty in achieving robust causal inference. The paper concludes by suggesting a framework for critically evaluating quasi-experimental designs using ILSAs, advocating for a cautious approach in employing these data for causal inferences. We call for a reevaluation of methodologies and conceptual frameworks in comparative education, underscoring the need for a multifaceted approach that combines statistical rigor with an understanding of educational contexts and theoretical foundations.
- Published
- 2024
- Full Text
- View/download PDF
198. Impact Evaluations of Teacher Preparation Practices: Challenges and Opportunities for More Rigorous Research
- Author
-
Zid Mancenido
- Abstract
Many teacher education researchers have expressed concerns about the lack of rigorous impact evaluations of teacher preparation practices. I summarize these various concerns as they relate to issues of internal validity, measurement, and external validity. I then assess the prevalence of these issues by reviewing 166 impact evaluations of teacher preparation practices published in peer-reviewed journals between 2002 and 2019. Although I find that very few studies address issues of internal validity, measurement, and external validity, I highlight some innovative approaches and present a checklist of considerations to assist future researchers in designing more rigorous impact evaluations.
- Published
- 2024
- Full Text
- View/download PDF
199. Capturing the Spark: PISA, Twenty-First Century Skills and the Reconstruction of Creativity
- Author
-
Sue Grey and Paul Morris
- Abstract
Creativity has fascinated scholars for generations, and its identification as one of the key 'twenty-first century skills' necessary for economic growth has led to renewed interest. This creates two challenges for the OECD. First, its flagship Programme for International Student Assessment (PISA) does not directly measure creativity. Second, the increased importance attached to creativity has highlighted claims that high performers on PISA are largely nations stereotyped as lacking creativity. This challenges PISA's self-proclaimed status as the premier global benchmark for evaluating and comparing the quality of school systems and weakens its capacity to deliver its core mission: to identify 'best practices' which ensure economic prosperity. We explore these challenges and examine both how the OECD has responded to them and how it is moving to include creativity in PISA 2022. We argue that, while a precise definition of creativity has defied scholars for centuries, the indications are that the OECD's metric will focus on a narrow, convergent and easily-measured conception associated with cognitive competencies and linked to enhancing human capital. In this way, the 'messiness' around the polysemic concept will be simultaneously both exploited and threatened, as new, measurable versions displace alternatives.
- Published
- 2024
- Full Text
- View/download PDF
200. MxML (Exploring the Relationship between Measurement and Machine Learning): Current State of the Field
- Author
-
Yi Zheng, Steven Nydick, Sijia Huang, and Susu Zhang
- Abstract
The recent surge of machine learning (ML) has impacted many disciplines, including educational and psychological measurement (hereafter shortened as "measurement"). The measurement literature has seen rapid growth in applications of ML to solve measurement problems. However, as we emphasize in this article, it is imperative to critically examine the potential risks associated with involving ML in measurement. The MxML project aims to explore the relationship between measurement and ML, so as to identify and address the risks and better harness the power of ML to serve measurement missions. This paper describes the first study of the MxML project, in which we summarize the state of the field of applications, extensions, and discussions about ML in measurement contexts with a systematic review of the literature of the past 10 years. We provide a snapshot of the literature in (1) areas of measurement where ML is discussed, (2) types of articles (e.g., applications, conceptual), (3) ML methods discussed, and (4) potential risks associated with involving ML in measurement, which result from the differences between what measurement tasks need and what ML techniques can provide.
- Published
- 2024
- Full Text
- View/download PDF