Descriptor: "TEST validity" / Publication Year Range: Last 10 years / Region: united states - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"TEST validity"' showing total 255 results

1. Assessing Anhedonia in Adolescents: The Psychometric Properties and Validity of the Dimensional Anhedonia Rating Scale

Author: Jackson M. A. Hewitt, Bita Zareian, and Joelle LeMoult
Abstract: Adolescent anhedonia is a multidimensional construct defined as the loss of enjoyment or pleasure across multiple domains of life. Anhedonia is concurrently associated with substantial impairment and distress, and it prospectively predicts the onset, severity, and treatment of depression. Despite its demonstrated importance, a limited number of anhedonia measures are validated for adolescents. The current study assessed the psychometric properties of the Dimensional Anhedonia Rating Scale (DARS) in 400 English-speaking, 12- to 19-year-old adolescents. Overall, the DARS demonstrated good convergent and discriminant validity, but sub-optimal concurrent validity. The strengths and limitations of the DARS and its utility as a measure of adolescent anhedonia are discussed. Furthermore, future directions for the construction of measures of adolescent anhedonia are outlined.
Published: 2024
Full Text: View/download PDF

2. Psychometric Evaluation of the Hope-Action Inventory in Individuals with Substance Use Issues

Author: Lauren N. Currie, Robinder P. Bedi, and Anita M. Hubley
Abstract: This study evaluated the psychometric properties of the Hope-Action Inventory (HAI) scores with a problematic substance use population (N = 783). The hierarchical seven-factor structure of the HAI fit the data well. Further, the HAI scores had satisfactory internal consistency reliability and good convergent evidence for validity.
Published: 2024
Full Text: View/download PDF

3. Psychometric Validity and Measurement Invariance of the Caring for Bliss Scale in the Philippines and the United States

Author: Jesus Alfonso D. Datu, Frank Fincham, and Jet U. Buenconsejo
Abstract: The Caring for Bliss Scale (CBS) is a new measure that assesses an individual's capacity to cultivate inner joy and happiness. Developed in the United States, its generalizability remains unknown in non-Western contexts. This research explored the scale's cross-national invariance among college students in the Philippines (n = 546) and the United States (n = 643). A multi-group confirmatory factor analysis using maximum likelihood estimation showed that the unidimensional model of caring for bliss exhibited configural, metric, scalar, and residual invariance across the Filipino and the U.S. samples. This scale also had good internal consistency estimates in both settings. In both contexts, caring for bliss was positively correlated with well-being and negatively correlated with different negative quality of life indicators (i.e., stress, anxiety, and depression). This study offered preliminary evidence regarding the cross-national applicability of the CBS in different cultural settings during the COVID-19 pandemic.
Published: 2024
Full Text: View/download PDF

4. Studies on Education, Science, and Technology 2022

Author: Noroozi, Omid, Sahin, Ismail, and International Society for Technology, Education and Science (ISTES) Organization
Abstract: Education, science, and technology disciplines are closely and extensively connected in all formats and levels. The outbreak of COVID-19 has further squeezed this interconnection where the delivery of education in different scientific fields of studies at all education levels is almost impossible without the presence of technology. Today, there is a need more than ever to explore the intersection of education, science, and technology at both administrative and classroom levels. Educational leaders and policymakers should be aware of the requirements (e.g., role of culture, educational governance) for effective teaching and learning in the post-COVID-19 era. Teachers, instructors, and researchers need to be proficient in the way to convey knowledge with effective and innovative adoption of technology (e.g., online peer feedback) to the young generation as they are called "digital natives". This book focuses on addressing and exploring these needs and recommends solutions from multiple perspectives. The book is divided into three sections related to studies on education, science, and technology. While each of the first two sections includes five chapters, the last section involves four chapters. The chapters' contributors are from the following countries: Albania, Australia, Azad Kashmir, Ghana, Indonesia, Iran, Kazakhstan, Morocco, Philippines, Singapore, the Netherlands, the USA, Tunisia, and Turkey. The diversity of the chapters from 14 different countries brings an international perspective to the book. [For the 2021 edition, see ED617831.]
Published: 2022

5. A Scoping Review on the Use of the Parents Evaluation of Developmental Status and PEDS: Developmental Milestones Screening Tools

Author: Abdoola, Shabnam, Swanepoel, De Wet, and Van Der Linde, Jeannie
Abstract: The Parents' Evaluation of Developmental Status (PEDS), PEDS: Developmental Milestones (PEDS:DM) and PEDS tools (i.e., the PEDS and PEDS:DM combined for use) are parent-reported screening tools frequently used to identify young children requiring early intervention. An ideal screening tool for all contexts would be brief and inexpensive, with appropriate test items and good psychometric properties. A scoping review was conducted to review studies that used the PEDS, PEDS:DM, and PEDS tools to screen for the need for further referrals and evaluation through parent report. Thirty articles, ranging from 2003 to 2020, conducted in high-income countries (HICs) and lower-middle income countries (LMICs), were included from the 1,468 records identified. Studies conducted in HICs (n = 19) included screening of special population groups and comparing validated tools. LMIC studies (n = 11) focused on translations, combination of the PEDS tools, validations of tools, and use of an app-based tool (mHealth). High referral rates were obtained with PEDS (23-41%) and PEDS:DM (12-54%) in LMICs where at-risk populations are more prevalent and cultural differences may affect tool validity. A global dearth of research on PEDS:DM and PEDS tools exists; the review highlights factors that influence the validity and impact widespread use of the screening measures, especially in diverse populations and LMICs.
Published: 2023
Full Text: View/download PDF

6. Cross-Cultural Validation of the Short Form of the Life Skills Scale for Adolescents and Adults in Adolescents in Four Countries

Author: Kase, Takayoshi and Endo, Shintaro
Abstract: This study aimed to translate the Japanese version of the Life Skills Scale for Adolescents and Adults (LSSAA) into Chinese, English, and Korean, simplify it, and assess its reliability and validity. Validation was performed using individual data of 9941 high-school students from China, Japan, Korea, and the United States collected by the 2021 "Survey on Experiences and Attitudes Related to the Corona Crisis" conducted by the National Institution For Youth Education. Confirmatory factor analysis showed that the four-factor model of the LSSAA fit the data for all four countries. Testing of the measurement invariance of the four-factor model among the four countries supported the adoption of a weak invariance model, and the LSSAA scores were comparable across all four countries. These results suggest that the LSSAA has good reliability and validity and applies to adolescents in English-speaking countries and some Asian countries.
Published: 2023
Full Text: View/download PDF

7. Developing an Innovative Elicited Imitation Task for Efficient English Proficiency Assessment. TOEFL® Research Report. RR-96. ETS RR-21-24

Author: Davis, Larry and Norris, John
Abstract: The elicited imitation task (EIT), in which language learners listen to a series of spoken sentences and repeat each one verbatim, is a commonly used measure of language proficiency in second language acquisition research. The "TOEFL® Essentials"™ test includes an EIT as a holistic measure of speaking proficiency, referred to as the "Listen and Repeat" task type. In this report, we describe the design considerations that informed the development of the EIT for TOEFL Essentials. We also report the results of a series of investigations conducted during the prototyping and pilot phases of test development, which were undertaken with the goal of confirming task design specifications, evaluating scoring performance, and obtaining initial validity evidence to support score interpretation and use of the EIT in the TOEFL Essentials test. We found that task design variables generally performed as expected. The length of input sentence was strongly associated with performance (Pearson r = 0.88), consistent with the construct measured by the EIT, while other task variables not directly related to the EIT construct did not impact performance (e.g., graphics, speaker accent, and response time). Scorers drawn from TOEFL iBT test raters were able to score responses consistently with over 98% exact or adjacent interrater agreement on a 6-point scale, and scores on the pilot version of the EIT were highly reliable (Cronbach's α = 0.93 on the 15-item pilot version). Correlations between EIT scores and other measures were generally as expected: Correlations with other speaking tasks were high (0.78-0.84) and slightly to somewhat lower for other language measures (0.73 for writing, 0.68 for listening, and 0.57 for reading). Correlation with an independent measure of holistic language proficiency (C-test) was moderately high (0.69), as expected. We discuss the study findings in terms of the TOEFL Essentials test validity argument and point out limitations to the current results along with future research needs. Overall, we believe that the findings provide initial support to warrant the use of the EIT as operationalized in the TOEFL Essentials test.
Published: 2021
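Note on the reliability statistic reported in this record: Cronbach's alpha compares the sum of the individual item variances with the variance of respondents' total scores. The sketch below is a minimal illustration of the standard formula only. The function name `cronbach_alpha` and the toy score matrix are hypothetical and are not taken from the report, which does not describe its computation.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 3 respondents, 2 items (not from any study listed here).
toy_scores = [[2, 3], [4, 4], [6, 8]]
alpha = cronbach_alpha(toy_scores)
```

Because the same variance estimator is used in the numerator and the denominator, using population variance instead of sample variance would give the same alpha.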

8. Adaptation and Validation of BullyHARM-China--A Chinese Version of the Bullying, Harassment, and Aggression Receipt Measure

Author: Yang, Jingyi, Ferraz, Raul, Shi, Dexin, Harrison, Sayward E., Ye, Zhi, Chen, Lihua, and Lin, Danhua
Abstract: Bullying is a growing concern in China, yet there are few validated scales designed to measure different types of bullying among Chinese children. In the present study, a bilingual team of researchers used a forward-backward translation process to adapt the Bullying, Harassment, and Aggression Receipt Measure (BullyHARM) for Chinese youth. BullyHARM has previously been shown to be a reliable scale for measuring six bullying domains (i.e., physical, verbal, social/relational, cyber, property, sexual) among children in the United States (US). After cultural and linguistic adaptation, we enrolled 397 middle school students from Beijing, China in a validation study to assess the psychometric properties of the new BullyHARM-China scale. Results of confirmatory factor analysis suggest the final 21-item scale displays strong internal consistency. Consistent with findings from the US, the first-order model of six factors (i.e., six bullying subscales) displays the best fit to the data. Our findings suggest that BullyHARM-China is a reliable tool for measuring bullying victimization among Chinese students.
Published: 2023
Full Text: View/download PDF

9. Remote First-Language Assessment: Feasibility Study with Vietnamese Bilingual Children and Their Caregivers

Author: Dam, Quynh Diem and Pham, Giang T.
Abstract: Purpose: There is a shortage of bilingual speech-language pathologists (SLPs) in the United States. For Vietnamese, fewer than 1% of SLPs speak the language compared with a Vietnamese American population of > 2.1 million. This study examines the feasibility and social validity of remote child language assessment with the help of a caregiver to address the need for first language assessments among Vietnamese-speaking children. Method: Twenty-one dyads of caregivers and typically developing children (aged 3-6 years) completed two assessment sessions in their first language, Vietnamese, using Zoom videoconferencing. Sessions were counterbalanced between two conditions in which either the clinician or the caregiver was the task administrator. Children's language samples were elicited using narrative tasks. Social validity was also assessed through caregiver and child questionnaires at the end of each session. Results: There were no significant differences between conditions on language sample measures or the measures of social validity. Both caregivers and their children felt positively about the sessions. The caregivers' feelings were related to their perception of children's feelings about the sessions. Children's feelings were related to their Vietnamese language proficiency, caregiver-reported language ability, and whether they were born outside of the United States. Conclusions: Findings build the evidence base for telepractice as an effective and socially valid service delivery model for bilingual children in the United States. This study supports the potential for caregivers as task administrators in a telepractice setting, making assessment in a child's first language more feasible and accessible. Future investigation is needed to extend results to bilingual populations with disorders.
Published: 2023
Full Text: View/download PDF

10. The Effect of Learning Environment on the Selection of Conventional Expressions on an Aural Multiple-Choice DCT

Author: Bardovi-Harlig, Kathleen and Su, Yunwen
Abstract: This exploratory study examines the role of foreign and second language contexts in the acquisition of conventional expressions. A group of 21 ESL learners was compared to 25 EFL learners randomly selected from a larger pool. Both groups completed an aural multiple-choice discourse completion task (MC-DCT), which was developed from a previously validated oral DCT. The aural MC-DCT consisted of 21 items with learner-generated options delivered aurally. A total of 91 native speakers also completed the task as a control group. The results showed an effect of learning environment on learners' selection of conventional expressions. The ESL group selected the conventional expressions in more items than the EFL group on the aural MC-DCT; the differences in the selections by the two groups were item-specific. The observed effect of learning context is discussed as related to individual items and type and modality of the task. The paper also discusses the special make-up of the ESL group due to the pandemic and expansion of the group for future research.
Published: 2021

11. Emotion Recognition from Realistic Dynamic Emotional Expressions Cohere with Established Emotion Recognition Tests: A Proof-of-Concept Validation of the Emotional Accuracy Test

Author: Israelashvili, Jacob, Pauw, Lisanne S., Sauter, Disa A., and Fischer, Agneta H.
Abstract: Individual differences in understanding other people's emotions have typically been studied with recognition tests using prototypical emotional expressions. These tests have been criticized for the use of posed, prototypical displays, raising the question of whether such tests tell us anything about the ability to understand spontaneous, non-prototypical emotional expressions. Here, we employ the Emotional Accuracy Test (EAT), which uses natural emotional expressions and defines the recognition as the match between the emotion ratings of a target and a perceiver. In two preregistered studies (N_total = 231), we compared the performance on the EAT with two well-established tests of emotion recognition ability: the Geneva Emotion Recognition Test (GERT) and the Reading the Mind in the Eyes Test (RMET). We found significant overlap (r > 0.20) between individuals' performance in recognizing spontaneous emotions in naturalistic settings (EAT) and posed (or enacted) non-verbal measures of emotion recognition (GERT, RMET), even when controlling for individual differences in verbal IQ. On average, however, participants reported enjoying the EAT more than the other tasks. Thus, the current research provides a proof-of-concept validation of the EAT as a useful measure for testing the understanding of others' emotions, a crucial feature of emotional intelligence. Further, our findings indicate that emotion recognition tests using prototypical expressions are valid proxies for measuring the understanding of others' emotions in more realistic everyday contexts.
Published: 2021

12. Development and Validation of a Questionnaire to Assess Situational Interest in a Science Period: A Study in Three Cultural/Linguistic Contexts

Author: Potvin, Patrice, Ayotte-Beaudet, Jean-Philippe, Hasni, Abdelkrim, Smith, Jonathan, Giamellaro, Michael, Lin, Tzung-Jin, and Tsai, Chin-Chung
Abstract: This article reports an international initiative to develop and validate a "situational interest questionnaire" in three cultural/linguistic contexts: Canada (French), USA (English), and Taiwan (Chinese). The 20-item solution (α = 0.90) presented four factors: "enjoyment," "value," "attention/sustained work," and "usefulness." These factors and the convergence of the three translations were partially verified through confirmatory factor analysis (CFA). A shorter, unidimensional 7-item solution is also presented (α = 0.76) and was confirmed with CFA for two out of the three languages (with all α > 0.75). This questionnaire, named ISiQ, thus presents "good" or "acceptable" psychometric properties while securing better coverage of the situational interest construct than most surveys do. Reflections about its possible uses and about the stability/volatility of situational interest items are presented.
Published: 2023
Full Text: View/download PDF

13. Response Format Changes the Reading the Mind in the Eyes Test Performance of Autistic and Non-Autistic Adults

Author: Lim, Alliyza, Brewer, Neil, Aistrope, Denise, and Young, Robyn L.
Abstract: The Reading the Mind in the Eyes Test (RMET) is a purported theory of mind measure and one that reliably differentiates autistic and non-autistic individuals. However, concerns have been raised about the validity of the measure, with some researchers suggesting that the multiple-choice format of the RMET makes it susceptible to the undue influence of compensatory strategies and verbal ability. We compared the performance of autistic (N = 70) and non-autistic (N = 71) adults on the 10-item multiple-choice RMET to that of a free-report version of the RMET. Both the autistic and non-autistic groups performed much better on the multiple-choice than the free-report RMET, suggesting that the multiple-choice format enables the use of additional strategies. Although verbal IQ was correlated with both multiple-choice and free-report RMET performance, controlling for verbal IQ did not undermine the ability of either version to discriminate autistic and non-autistic participants. Both RMET formats also demonstrated convergent validity with a well-validated adult measure of theory of mind. The multiple-choice RMET is, however, much simpler to administer and score.
Published: 2023
Full Text: View/download PDF

14. Reliability and Validity of Light-Intensity Physical Activity Scales in Adults: A Systematic Review

Author: Tanaka, Rumi, Yakushiji, Kanako, Tanaka, Satomi, Tsubaki, Michihiro, and Fujita, Kimie
Abstract: This study aimed to review the validity and/or reliability of light-intensity physical activity (LPA) questionnaires and identify the most suitable questionnaires for measurement of LPA in adults. Following the PRISMA-P 2020 guidelines, we searched MEDLINE, PsycINFO, CINAHL, Scopus, Embase, and MedNar. Only studies that targeted adults ≥18 years old and used LPA measured by accelerometer and/or heart rate monitor as an objective criterion were included. The search resulted in 2748 article hits, from which we extracted 16 studies with 14 questionnaires. The 7-Day Sedentary and LPA Log and LPA Questionnaire were specifically designed for LPA measurement, and the Community Health Activities Model Program for Seniors physical activity self-report questionnaire scale has been revised for LPA measurement. These questionnaires had comparatively high reliability and validity in this review. Most studies contained methodological limitations such as test-retest period. In the future, more accurate reliability/validity studies should be conducted for each questionnaire.
Published: 2023
Full Text: View/download PDF

15. Validating a TPACK Instrument for 7-12 Mathematics In-Service Middle and High School Teachers in the United States

Author: Smith, Pamela G. and Zelkowski, Jeremy
Abstract: This study fills a gap with TPACK instrumentation by validating a survey instrument for use specifically with secondary mathematics in-service teachers in the United States using an instrument originally developed in Australia (Handal et al., 2012). A national sample in the U.S., comparable to that of the original Australian instrumentation study, was surveyed. Findings revealed that the factor structure of the Australian TPACK instrument differed when used in the U.S., and we present a new validated instrument (TPACK-M-US) for use with secondary mathematics in-service teachers in the U.S. We provide three sources of validity evidence (e.g., Instrument or Test Content, Internal Structure, Response Processes). Appropriate uses and interpretations are discussed in addition to the importance of validation research for educational settings.
Published: 2023
Full Text: View/download PDF

16. The Development and Validation of an Intercultural Competencies Assessment Instrument for K-12 In-Service Educators

Author: Lynn, David Ellsworth
Abstract: As schools adapt curriculum and learning environments to better prepare students for entry into an increasingly globalized society, cultivating intercultural competencies in K-12 in-service educators is of heightened importance. The purpose of this study was to develop and validate a new instrument designed to assess these competencies called the Intercultural Competency Measure for Educators (ICME). Byram (1997) defines intercultural competencies as the ability to effectively communicate, understand, and work with people from diverse cultural backgrounds. Deardorff (2006) adds to this a call for action, which lends itself to the critical cosmopolitanism framework that guides this study. A pilot study was used to develop a four-factor theoretical intercultural competencies framework through a process defined in this study. Reliability and validity were examined using data collected from K-12 in-service educators at schools in the United States and Canada. An Exploratory Factor Analysis suggested a revision of the constructs to include five factors: Curriculum, Diverse Student Inclusion, Cross-Cultural Openness, Collaboration and Adaptation, and Systematic Awareness. Construct validity was tested using Confirmatory Factor Analysis and supported by examining demographic data using parametric tests. The emergence of a factor related to systematic awareness highlights teachers' increased role in addressing the root causes of inequity in schools. The five-factor model provides a framework for schools wishing to further develop and assess intercultural competencies growth in teachers. [The dissertation citations contained here are published with the permission of ProQuest LLC. Further reproduction is prohibited without permission. Copies of dissertations may be obtained by telephone: 1-800-521-0600. Web page: http://www.proquest.com/en-US/products/dissertations/individuals.shtml.]
Published: 2023

17. Family Reminiscence Scale: A Measure of Early Communicative Context

Author: Öner, Sezin, Ece, Berivan, and Gülgöz, Sami
Abstract: We developed and validated the Family Reminiscence Scale (FARS) in which adults rate their frequency of reminiscing with their parents about childhood experiences. In three studies, we characterized how FARS was related to adults' recollections of their earliest memories in different cultural contexts. First, we examined the factorial structure of FARS and obtained two factors of reminiscing: first-time events and general-recurrent events. In the second study, confirmatory factor analyses were conducted, in which we established measurement invariance across gender and age groups. In Study 3, we tested the factorial structure of FARS in an American sample to ensure cross-cultural invariance. We also showed that the two factors were differentially related to the phenomenology of earliest memories in samples from Turkey and the United States (Study 2 & Study 3). Overall, FARS was found to be a reliable and valid measure for assessing, in adult samples, the quality of the linguistic input during childhood. Predictive value of FARS has been shown across different gender, age, and culture groups, underlining the organizational role of the early communicative context in the phenomenology and linguistic style of adults' early memories.
Published: 2020

18. Examining the Factor Structure and Measurement Invariance of the Difficulties in Emotion Regulation Scale across Taiwanese and American University Students

Author: Yeh, Yun-Jy, Chen, Jyun-Hong, Tsai, William, and Kimel, Sasha
Abstract: The Difficulties in Emotion Regulation Scale (DERS) is a widely used measure of emotion dysregulation. However, limited research has examined its factor structure and measurement invariance in cross-national samples. The present study tested competing measurement models and the measurement invariance of the DERS in university student samples from the United States (n = 324) and Taiwan (n = 399). Results indicated that the bifactor model with the Awareness subscale items removed demonstrated the best fit. The results of model-based indices provided evidence for the general emotion dysregulation factor of the DERS. Cross-national measurement invariance testing found partial strong invariance. These findings indicate that DERS would best be used as a measure of general emotion dysregulation among college students in the United States and Taiwan. These findings emphasize that future work is needed to examine cross-national differences in the construct and assessment of emotion dysregulation.
Published: 2022
Full Text: View/download PDF

19. The General Academic Self-Efficacy Scale: Psychometric Properties, Longitudinal Invariance, and Criterion Validity

Author: van Zyl, Llewellyn E., Klibert, Jeff, Shankland, Rebecca, See-To, Eric W. K., and Rothmann, Sebastiaan
Abstract: Academic self-efficacy (ASE) refers to a student's global belief in his/her ability to master the various academic challenges at university and is an essential antecedent of wellbeing and performance. The five-item General Academic Self-Efficacy Scale (GASE) showed promise as a short and concise measure for overall ASE. However, evidence of its validity and reliability outside of Scandinavia is limited. Therefore, this paper aimed to investigate the psychometric properties, longitudinal invariance, and criterion validity of the GASE within a sample of university students (Time 1: n = 1056 & Time 2: n = 592) in the USA and Western Europe. The results showed that a unidimensional factorial model of overall ASE fitted the data well and was reliable and invariant across time. Further, criterion validity was established by finding a positive relationship with task performance at different time stamps. Therefore, the GASE can be used as a valid and reliable measure for general ASE.
Published: 2022
Full Text: View/download PDF

20. Online Computerized Adaptive Tests of Children's Vocabulary Development in English and Mexican Spanish

Author: Kachergis, George, Marchman, Virginia A., Dale, Philip S., Mankewitz, Jessica, and Frank, Michael C.
Abstract: Purpose: Measuring the growth of young children's vocabulary is important for researchers seeking to understand language learning as well as for clinicians aiming to identify early deficits. The MacArthur-Bates Communicative Development Inventories (CDIs) are parent report instruments that offer a reliable and valid method for measuring early productive and receptive vocabulary across a number of languages. CDI forms typically include hundreds of words, however, and so the burden of completion is significant. We address this limitation by building on previous work using item response theory (IRT) models to create computer adaptive test (CAT) versions of the CDIs. We created CDI-CATs for both comprehension and production vocabulary, for both American English and Mexican Spanish. Method: Using a data set of 7,633 English-speaking children ages 12-36 months and 1,692 Spanish-speaking children ages 12-30 months, across three CDI forms (Words & Gestures, Words & Sentences, and CDI-III), we found that a 2-parameter logistic IRT model fits well for a majority of the 680 pooled vocabulary items. We conducted CAT simulations on this data set, assessing simulated tests of varying length (25-400 items). Results: Even very short CATs recovered participant abilities very well with little bias across ages. An empirical validation study with N = 204 children ages 15-36 months showed a correlation of r = 0.92 between language ability estimated from full CDI versus CDI-CAT forms. Conclusion: We provide our item bank along with fitted parameters and other details, offer recommendations for how to construct CDI-CATs in new languages, and suggest when this type of assessment may or may not be appropriate.
Published: 2022
Full Text: View/download PDF
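Note on the model named in this record: under a 2-parameter logistic (2PL) IRT model, the probability that a child with ability theta produces or understands a word depends on the item's discrimination a and difficulty b. The sketch below illustrates the 2PL response function together with a maximum-information item selection rule, which is one common way to drive a computerized adaptive test. The abstract does not state which selection rule the authors used, so that rule is an assumption, and the function names, the item bank, and its parameter values are all hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """2PL probability of a positive response: 1 / (1 + exp(-a * (theta - b)))."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


def next_item(theta, bank, administered):
    """Pick the unadministered item with the most information at theta.

    bank maps an item name to its (a, b) parameters.
    Maximum-information selection is an assumed rule, not the authors' stated method.
    """
    candidates = [name for name in bank if name not in administered]
    return max(candidates, key=lambda name: item_information(theta, *bank[name]))


# Hypothetical item bank with made-up (discrimination, difficulty) values.
toy_bank = {"dog": (1.0, -2.0), "table": (1.5, 0.0), "because": (1.2, 2.0)}
first_item = next_item(0.0, toy_bank, administered=set())
```

An item is most informative when its difficulty is close to the current ability estimate, which is why a short adaptive test can recover ability well: each administered word is chosen to be neither too easy nor too hard for the child being assessed.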

21. The Me and My School Questionnaire: Examining the Cross-Cultural Validity of a Children's Self-Report Mental Health Measure

Author: Moffa, Kathryn, Wagle, Rhea, Dowdy, Erin, Palikara, Olympia, Castro-Kemp, Susana, Dougherty, Danielle, and Furlong, Michael J.
Abstract: The Me and My School Questionnaire (M&MS) is a brief self-report measure of elementary school students' (ages 8-12) social, emotional, and behavioral challenges. As there is a need for brief self-report screening measures for students in elementary, or primary, school, this study examined the factor structure and measurement invariance of the M&MS for elementary school students in the United States (U.S.; N = 784) and the U.K. (N = 538). Results replicated its two-factor structure (emotional difficulties and behavioral difficulties) with both samples. Convergent and discriminant validity, test-retest reliability, and measurement invariance for girls and boys were examined in the U.S. sample. Partial measurement invariance was established when comparing factor structures of the U.S. and U.K. samples. Implications for mental health monitoring, and for comparative international research are discussed. [This is the online version of an article published in "International Journal of School and Educational Psychology."]
Published: 2019
Full Text: View/download PDF

22. Preliminary Investigation of the Psychological Sense of School Membership Scale with Primary School Students in a Cross-Cultural Context

Author: Wagle, Rhea, Dowdy, Erin, Yang, Chunyan, Palikara, Olympia, Castro, Susana, Nylund-Gibson, Karen, and Furlong, Michael J.
Abstract: The "Psychological Sense of School Membership" (PSSM) scale has been used for more than 20 years to measure students' sense of school belonging, yet its psychometric properties have had limited examination with pre-adolescent children. This study investigated the utility and psychometrics of the PSSM in three primary school samples from the United States, China, and the United Kingdom. Exploratory factor analysis revealed good fit for a unidimensional factor structure in the U.S. sample, which was subsequently confirmed in all three samples. Partial invariance across all three samples and full invariance across pairwise samples (United States and United Kingdom; United Kingdom and China) was found. Path analyses revealed significant positive relations of the PSSM total belonging score with gratitude and prosocial behavior, and significant negative relations with symptoms of distress. Future directions and implications are discussed. [This paper will be published in "School Psychology International."]
Published: 2018

23. Pedagogical Support for the Test of Gross Motor Development -- 3 for Children with Neurotypical Development and with Autism Spectrum Disorder: Validity for an Animated Mobile Application

Author: Copetti, Fernando, Valentini, Nadia C., Deslandes, Andréa C., and Webster, E. Kipling
Abstract: Background: Motor skill assessment is time-consuming and some difficulties are inherent in the administration of motor tests, especially in children with neurodevelopmental disorders. Purpose: This study aimed to develop and investigate the face, content, and criterion validity of a Motor Skills Sequential Pictures (MSSP) for the Test of Gross Motor Development -- 3 (TGMD-3) to be animated and used in a mobile application (App). Methods: The MSSP was created representing each of the 13 TGMD-3 skills, performance criteria and accuracy was assessed by 23 experts, 52 undergraduate students, and 66 children (range 3-10 years; n = 48 with neurotypical development, n = 18 with Autism Spectrum Disorder, ASD). We conducted two rounds of MSSP expert evaluations to improve the MSSP accuracy. Content validity was conducted with the experts' results using percentage of agreement, content validity coefficient (CVC), kappa, and Chi² tests for the first and final version of the MSSP. University students participated in the face validity evaluation of the MSSP final version using percentage of agreement. Further content validity was conducted with experts' and university students' scores using Chi². Children participated in the last phase of the study and were requested to identify and perform the skills, and if unsuccessful, they received verbal support based on the motor performance criteria. Results: For content validity results associated with the experts' agreement, scores were high and increased from the first to the second round (CVC from 87.0% to 96.1%; Kappa coefficient >0.60, p > 0.0001). High agreement was obtained for the face validity of all skills (range 94.1-100%). Further, significant associations were found for experts' and university students' scores for the MSSP final version (p ≤ 0.002), providing further evidence for the MSSP content validity. The results for children with neurotypical development showed that children aged 3-4 had more difficulties in identifying the skills compared to older children. Developmental criterion validity was found for several skills (hop, jump, slide, one-hand strike, two-hand strike; p values from <0.0001 to .050); the MSSP was a more robust support as children age. In the ASD group, identifying skills was difficult for all ages, but mainly in locomotor skills. Furthermore, an inverse trend was found for the developmental validity criteria for children with ASD for several skills (skip, jump, slide, catch, kick; p values from .016 to .050); younger children relied more on the MSSP support to identify the skills. Conclusion: The MSSP, mainly ball skills, proved to be valid to illustrate the TGMD-3 motor performance criteria and may be useful as a visual pedagogical support for children to facilitate skill understanding. Future directions will be to evaluate whether the MSSP animation, in an app-based program, will improve children's motor skill performance.
Published: 2022
Full Text: View/download PDF

24. Test Efficacy: Refocusing Validation from College Exams to Candidates

Author: Arce, Alvaro J. and Young, Michael J.
Abstract: The paper argues that contemporary test validity theory places the consequences of testing on the lives of all college applicants at the back of the test validation argument. It introduces the notion of test efficacy as a process to gather evidence on claims on consequences of testing on all college applicants that can be traced back to validity. The paper proposes a test efficacy framework to evaluate test efficacy claims on the impact of admission examinations on all college applicants (not just those attaining the admission standard).
Published: 2022
Full Text: View/download PDF

25. Core Academic Language Skills: Validating a Construct in Linguistically Dissimilar Settings

Author: MacFarlane, Marco, Barr, Christopher, and Uccelli, Paola
Abstract: Recently a novel instrument -- the Core Academic Language Instrument (CALS-I) -- aimed at testing a constellation of school-relevant English language skills was developed and validated for use in the United States (Uccelli, Barr, Dobbs, Galloway, Meneses and Sánchez, 2015. Core Academic Language Skills: An expanded operational construct and a novel instrument to chart school-relevant language proficiency in preadolescent and adolescent learners. "Applied Psycholinguistics" 36: 1077-1109). The unitary construct tested by the CALS-I was dubbed Core Academic Language Skills (CALS) and it aimed to identify and describe a set of skills that comprise academic language proficiency of high utility across curricular content areas. This study piloted a version of the CALS-I that was slightly modified for use in South Africa targeting specifically middle-school learners. The results of this study reveal that the CALS construct functions almost identically to the United States sample when tested in South Africa, and this provides strong evidence for the fundamental and cross-cutting nature of these pedagogically relevant skills.
Published: 2022
Full Text: View/download PDF

26. Honor, Face, and Dignity Norm Endorsement among Diverse North American Adolescents: Development of a Social Norms Survey

Author: Frey, Karin S., Onyewuenyi, Adaurennaya C., Hymel, Shelley, Gill, Randip, and Pearson, Cynthia R.
Abstract: This article examined the psychometric properties and validity of a new self-report instrument for assessing the social norms that coordinate social relations and define self-worth within three normative systems. A survey that assesses endorsement of honor, face, and dignity norms was evaluated in ethnically diverse adolescent samples in the U.S. (Study 1a) and Canada (Study 2). The internal structure of the survey was consistent with the conceptual framework, but only the honor and face scales were reliable. Honor endorsement was linked to self-reported retaliation, less conciliatory behavior, and high perceived threat. Face endorsement was related to anger suppression, more conciliatory behavior, and, in the U.S., low perceived threat. Study 1b examined identity-relevant emotions and appraisals experienced after retaliation and after calming a victimized peer. Honor norm endorsement predicted pride following revenge, while face endorsement predicted high shame. Adolescents who endorsed honor norms thought that only avenging their peer had been helpful and consistent with the role of good friend, while those who endorsed face norms thought only calming a victimized peer was helpful and indicative of a good friend. Implications for adolescent welfare are discussed.
Published: 2021
Full Text: View/download PDF

27. The Me and My School Questionnaire: Examining the Cross-Cultural Validity of a Children's Self-Report Mental Health Measure

Author: Moffa, Kathryn, Wagle, Rhea, Dowdy, Erin, Palikara, Olympia, Castro, Susana, Dougherty, Danielle, and Furlong, Michael J.
Abstract: The Me and My School Questionnaire (M&MS) is a brief self-report measure of elementary school students' (ages 8-12) social, emotional, and behavioral challenges. As there is a need for brief self-report screening measures for students in elementary, or primary, school, this study examined the factor structure and measurement invariance of the M&MS for elementary school students in the United States (U.S.; N = 784) and the U.K. (N = 538). Results replicated its two-factor structure (emotional difficulties and behavioral difficulties) with both samples. Convergent and discriminant validity, test-retest reliability, and measurement invariance for girls and boys were examined in the U.S. sample. Partial measurement invariance was established when comparing factor structures of the U.S. and U.K. samples. Implications for mental health monitoring, and for comparative international research are discussed. [For the corresponding grantee submission, see ED603450.]
Published: 2021
Full Text: View/download PDF

28. The SchoolWeavers Tool: Supporting School Leaders to Weave Learning Ecosystems

Author: Díaz-Gibson, Jordi, Daly, Alan, Miller-Balslev, Gitte, and Zaragoza, Mireia Civís
Abstract: Social capital has recently emerged as an effective approach to rethink schools as wider learning ecosystems where students, teachers and families have greater access to learning resources through social interaction. Literature has not provided research-based assessment tools that document school leaders' abilities to weave social relationships between actors within the school and across the community. This paper presents an international experts' validation of the SchoolWeavers Tool, an online resource that supports school leaders to assess the health and potential of their school ecosystem and provides meaningful feedback to weave social and professional capital and lift learning opportunities and educational goals. Theoretical validation was conducted in the first round by 15 experts from 8 countries with prior experience in network leadership in education, and in the second round, with 54 school actors from the same 8 countries. The final model provides an internationally validated tool that supports school leaders' capacities to improve collective effectiveness, internal and external collaboration, innovation and equity. Furthermore, the Tool creates research opportunities by allowing school leaders and researchers to collaborate and support systemic impact and sustainable improvement.
Published: 2021
Full Text: View/download PDF

29. ME-Q: Multicultural Education Questionnaire: Assessing the Experience of Multicultural Education in Academic Institutions

Author: Finkelstein, Idit, Hartman, Tova, and Freier-Dror, Yossi
Abstract: Although there is a great deal of writing about the various aspects of multiculturalism in higher education, especially about how to start implementing it on the campus, there is still a need to develop significant and credible assessment tools to examine the extent to which the goals of multiculturalism are being achieved. The questionnaire used in this study was created with this goal of assessment in mind; it provides schools with a tool that enables them to understand and improve upon the multicultural aspects of their institutions. The questionnaire was built based on the theoretical conceptions of multicultural educational research as well as on focus groups in which students and faculty discussed their experiences. The following six domains were measured: respect, acceptance of others, loss and change of identity, privilege, cultural accommodation, and Culturally Responsive Teaching/Culturally Relevant Education. The sample consisted of 314 students on five campuses of Ono Academic College in Israel and the United States. The assessment tool was found to be accurate and valid, and it thus may be reliably used to evaluate cultural diversity in an academic environment. At the micro level, using this questionnaire will give schools a keen insight into student experience and help to identify where intervention is needed in order to promote cultural diversity. At the macro level, the questionnaire formulated in this study allows institutions to gain a good understanding of the overall multicultural climate of the institution and to formulate clear objectives toward improving multicultural education.
Published: 2021
Full Text: View/download PDF

30. Age and Gender Invariance in the Taiwan Wechsler Intelligence Scale for Children, Fifth Edition: Higher Order Five-Factor Model

Author: Chen, Hsinyi, Zhu, Jianjun, Liao, Yung-Kun, and Keith, Timothy Z.
Abstract: This study investigated the factorial invariance of the Taiwan Wechsler Intelligence Scale for Children, Fifth Edition (WISC-V) across age and gender. A higher order five-factor model was tested on a nationally representative sample of 1,034 children aged 6-16 years. The results demonstrated full factorial invariance for Taiwan children of different ages and gender. The WISC-V subtests demonstrated the same underlying theoretical latent constructs, strength of relations among factors and subtests, validity of each first-order factor, and communalities, regardless of age and gender, which supported the same interpretive approach of the WISC-V. These results accord with findings in the United States, indicating a full factorial invariance of the WISC-V five-factor structure across ages and gender.
Published: 2020
Full Text: View/download PDF

31. The Financial Identity Scale (FIS): A Multinational Validation and Measurement Invariance Study among Emerging Adults

Author: Sorgente, Angela, Vosylis, Rimantas, Lanz, Margherita, Serido, Joyce, and Shim, Soeyon
Abstract: The transition from financial dependence on one's parents to financial self-sufficiency is one of the most relevant transitions during emerging adulthood. It is important to have an instrument able to assess emerging adults' financial capabilities and to detect their change over time. The current article aims to collect international evidence of the Financial Identity Scale (FIS) validity and reliability. Cross-sectional data collected from 2,501 emerging adults aged 18-25 and belonging to three different countries -- U.S. (n = 1,535), Italy (n = 485), and Lithuania (n = 481) -- were adopted to test score structure validity, generalizability, sensitivity to difference, criterion-related validity, and internal consistency. In addition, four-wave longitudinal data, available for the American sample only (n = 1,900), were adopted to test FIS structural stability and sensitivity to change. As recommended by the contemporary view of validity, different structural equation models were performed. Findings suggest that FIS scores are valid and reliable. The implications for researchers and practitioners are discussed.
Published: 2020
Full Text: View/download PDF

32. The Redesigned SAT® Pilot Predictive Validity Study: A First Look. Research Report 2016-1

Author: College Board, Shaw, Emily J., Marini, Jessica P., Beard, Jonathan, Shmueli, Doron, Young, Linda, and Ng, Helen
Abstract: In February of 2013, the College Board announced it would undertake a redesign of the SAT® in order to develop an assessment that better reflects the work that students will do in college, focusing on the core knowledge and skills that evidence has shown to be critical in preparation for college and career. The redesigned test will be introduced in March 2016 and will include a number of important changes. As with the redesign of all assessments, it is important to examine and understand how the changes to the content and format of the test impact the inferences made from the test's scores for their intended uses. One primary use of the SAT is for admission and placement decisions and, therefore, it was important to examine the relationship between the scores from the redesigned test and college outcomes such as first-year grade point average (FYGPA) and college course grades. In order to conduct such an analysis, a pilot study was initiated because the test is not yet operational. Fifteen four-year institutions were recruited to administer a pilot form of the redesigned SAT to between 75 and 250 first-year, first-time students very early in the fall semester of 2014. Measures were taken to ensure that the redesigned SAT was administered to students under standardized conditions and that students were motivated to perform well on the test. In June 2015, participating institutions provided the College Board with first-year performance data for those students participating in the fall 2014 administration of the redesigned SAT so that relationships between SAT scores and college performance could be analyzed. Results of study analyses show that the redesigned SAT is as predictive of college success as the current SAT, that redesigned SAT scores improve the ability to predict college performance above high school GPA alone, and that there is a strong, positive relationship between redesigned SAT scores and grades in matching college course domains, suggesting that the redesigned SAT is sensitive to instruction in English language arts, math, science, and history/social studies.
Published: 2016

33. International Test Score Comparisons and Educational Policy: A Review of the Critiques

Author: University of Colorado at Boulder, National Education Policy Center and Carnoy, Martin
Abstract: Stanford education professor Martin Carnoy examines four main critiques of how international test results are used in policymaking. Of particular interest are critiques of the policy analyses published by the Program for International Student Assessment (PISA). Using average PISA scores as a comparative measure of student achievement is misleading for a number of reasons, Carnoy maintains: (1) Students in different countries have different levels of family academic resources; (2) The larger gains reported on the Trends in International Mathematics and Science Study (TIMSS), which is adjusted for different levels of family academic resources, raise questions about the validity of the PISA results when used for international comparisons; (3) PISA test score error terms are "considerably larger" than the testing agencies acknowledge, making the country rankings unstable; and (4) The Shanghai educational system is held up as a model for the rest of the world on the basis of non-representative data. Of further concern is the conflict of interest arising from the Organization for Economic Cooperation and Development (which administers the PISA) and its member governments acting as a testing agency while simultaneously serving as data analyst and interpreter of results for policy purposes. Carnoy considers the critiques within a discussion of the underlying social meaning and education policy value of international comparisons in general. He describes why using average national math scores as predictors of future economic growth is problematic, and points out that using scoring data in this manner has limited use for establishing education policy because causal inferences cannot be meaningfully drawn. Finally, Carnoy explores the relevance of nation-level test score comparisons among countries such as the United States with diverse and complex education systems. The differences between states in the U.S. are, for example, so large that employing U.S. state-level test results over time to examine the impact of education policies would be more useful and interesting than using combined U.S. data. Despite valid critiques of international test result comparisons, Carnoy argues that the comparisons will neither go away nor stop being inappropriately used to shape educational policy. He concludes with five policy recommendations to reduce the misuse of testing data. (A list of notes and references is included.)
Published: 2015

34. Spanish and Italian Translations for the 'Merlino-Perkins Father-Daughter Relationship Inventory' (MP-FDI); Construction, Reliability, Validity, and Implications for Counseling and Research

Author: Merlino-Perkins, Rose, Martinez, Jose, Barbagallo-Gregory, Carmen, and Barbagallo, Antonio
Abstract: "Merlino-Perkins Father-Daughter Relationship Inventory" (Perkins, 2008), written in a woman's voice, provides counselors with a vehicle that helps women awaken subtle dynamics unique to their childhood father relationships. Accordingly, with counselor guidance, women have the opportunity to grieve their past, and celebrate what may be possible for them in the present. However, international events have caused many women (not fluent in English) to seek life in a new land, often a country with a new language. Consequently, this research responds to the overall need for published assessment instruments to be accurately translated (employing strong psychometric properties) for women who prefer to read, or can only read, in their first language. Thus, this work presents the findings as well as their counseling implications.
Published: 2020

35. Assessing Hope in Student Veterans

Author: Umucu, Emre, Moser, Erin, and Bezyak, Jill
Abstract: Hope is a defining characteristic of well-being, and research points to the positive contribution of hope to life adjustment (Snyder, Lehman, Kluck, & Monssan, 2006). Harris developed an initial version of an individual-differences measure of hope, the Trait Hope Scale (THS). Snyder, Harris, and colleagues (1991) further developed the THS by shortening the original version while retaining both the agency and pathways subcomponents. To further explore the role of hope in adjustment to college life among student veteran populations, it is necessary to validate the THS among this population. This investigation of psychometric properties of the THS among student veterans yielded results supporting the two-factor structure of the THS, and Cronbach's alpha estimates that indicate internal consistency for both subscales (pathways thinking and agency thinking). In addition, the external relationships significantly correlated with pathways thinking and agency thinking in the expected directions. Additional investigation in this area is warranted to expand understanding of these relationships along with the existence of other relationships.
Published: 2020
Full Text: View/download PDF

36. The Adaptation and Psychometric Examination of a Social-Emotional Developmental Screening Tool in Taiwan

Author: Chen, Chieh-Yu, Squires, Jane, Chen, Ching-I, Wu, Rachel, and Xie, Huichao
Abstract: "Research Findings": The Ages & Stages Questionnaires: Social-Emotional, Second Edition (ASQ:SE-2) was translated and adapted into Traditional Chinese in Taiwan. A sample of 1,455 children, ranging from 42 months 0 days to 53 months 30 days old, reflecting the population sizes of different regions in Taiwan, completed the 48-month ASQ:SE-2. Data were analyzed by item response theory modeling. A multidimensional Rasch Partial Credit Model was chosen for data analysis. Differential item functioning (DIF) was used to explore the difference between the Traditional Chinese ASQ:SE-2 and the English ASQ:SE-2 (N = 3,005) administered in the U.S. Results indicated that (a) item fit statistics were between 0.88 and 1.26 (M = 1.00, SD = 0.10), (b) difficulty was between -0.79 and 3.19 (M = 2.06, SD = 0.84), (c) reliability was 0.79 for all items and 0.75/0.74 for the Emotion/Sociality dimensions, and (d) six out of 35 items (17.1%) showed moderate to large DIF. "Practice or Policy": This research provided psychometric evidence for using the ASQ:SE-2-TC with a Taiwanese population. The promising psychometric findings encourage further investment in validating the ASQ:SE-2-TC. The cultural explanations can help test developers become aware of the potential influence of social values, parenting style, or childrearing practices.
Published: 2020
Full Text: View/download PDF

37. Development of the Altruism Scale for Children: An Assessment of Caring Behaviors among Children

Author: Swank, Jacqueline M., Limberg, Dodie, and Liu, Ren
Abstract: This article focuses on the development of the Altruism Scale for Children (ASC). Analyses revealed a one-factor model with internal consistency of 0.89 and test-retest reliability of 0.94. The authors also discuss the implications for using the instrument for assessing the need for interventions and measuring program outcomes.
Published: 2020
Full Text: View/download PDF

38. The Child Focused Injury Risk Screening Tool (ChildFIRST) for 8-12-Year-Old Children: A Validation Study Using a Modified Delphi Method

Author: Jimenez-Garcia, John Alexander, Hong, Chang Ki, Miller, Matthew B., and DeMont, Richard
Abstract: The purpose of this Delphi study was to establish the face and content validity of 10 movement skills, each with four evaluation criteria, to create the Child Focused Injury Risk Screening Tool (ChildFIRST) for 8-12-year-old children. We asked an international expert panel (n = 22) to validate a series of movement skills and evaluation criteria. This Delphi process consisted of three rounds. In the first two rounds, the experts scored the movement skills and evaluation criteria using 5-point Likert scales. Consensus on validating an item was achieved when 75% or more of the experts scored "Agree" or "Strongly Agree." In the third round, the experts ranked and established the final list with the validated movement skills and evaluation criteria. This study provided preliminary validity evidence for 10 movement skills, each with four evaluation criteria, to create the ChildFIRST. The ChildFIRST is designed to be used to evaluate movement competence and risk of musculoskeletal injury.
Published: 2020
Full Text: View/download PDF
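
Note on entry 38: the consensus rule described in the abstract (an item is validated when 75% or more of the experts score "Agree" or "Strongly Agree" on a 5-point Likert scale) can be illustrated with a minimal sketch. The numeric coding of the scale (4 = "Agree", 5 = "Strongly Agree"), the function name, and the example ratings are assumptions made for illustration; they are not taken from the study.

```python
from typing import Sequence

# Assumed coding of the 5-point Likert scale (not stated in the abstract):
# 4 = "Agree", 5 = "Strongly Agree".
AGREE_LEVELS = {4, 5}


def reaches_consensus(ratings: Sequence[int], threshold: float = 0.75) -> bool:
    """Return True when the share of agreeing experts meets the threshold."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    agreeing = sum(1 for rating in ratings if rating in AGREE_LEVELS)
    return agreeing / len(ratings) >= threshold


# Hypothetical ratings from a 22-member panel (illustrative only).
item_a = [5] * 17 + [3] * 5   # 17 of 22 experts agree
item_b = [4] * 16 + [2] * 6   # 16 of 22 experts agree
print(reaches_consensus(item_a), reaches_consensus(item_b))
```

With a panel of 22, this rule means at least 17 experts must agree, since 16 of 22 falls just short of 75%.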

39. Do You Understand What I Mean? How Cognitive Interviewing Can Strengthen Valid, Reliable Study Instruments and Dissemination Products

Author: Hofmeyer, Anne, Sheingold, Brenda Helen, and Taylor, Ruth
Abstract: It is now well accepted that working in research teams that span universities, jurisdictions and countries can be rewarding and economically prudent. To this end, investigators collaborate in the pursuit of knowledge to address human and societal problems and translate results into local and global contexts. This implies that investigators need to develop study instruments that are fit for purpose and strategically manage issues arising from geographical, linguistic and cultural diversity. A proven method is cognitive interviewing to pre-test the study materials to ensure clarity and relevance in the study population. This paper describes the steps taken to increase the methodological reliability of study instruments through the use of cognitive interviewing and argues this technique should be a standard step in instrument development.
Published: 2015

40. Examining the 'WorkFORCE'™ Assessment for Job Fit and Core Capabilities of 'FACETS'™. Research Report. ETS RR-14-32

Author: Naemi, Bobby, Seybert, Jacob, Robbins, Steven, and Kyllonen, Patrick
Abstract: This report introduces the "WorkFORCE"™ Assessment for Job Fit, a personality assessment utilizing the "FACETS"™ core capability, which is based on innovations in forced-choice assessment and computer adaptive testing. The instrument is derived from the five-factor model (FFM) of personality and encompasses a broad spectrum of personality assessment. This document provides an overview of the assessment, beginning with detailing evidence-based practices for personality measurement and modeling its relationship to workplace outcomes. We address the validity and fairness of this assessment, the creation of composite scores, and the generalizability of the assessment across languages and job types. We conclude with recommendations on the use of this capability for workforce applications and guidelines for future research.
Published: 2014

41. Assessing Students' IT Professional Values in a Global Project Setting

Author: Frezza, S., Daniels, M., and Wilkin, A.
Abstract: This research aimed at evaluating the development and use of low-cost affective domain assessment instruments, culminating with personal and group characterization of representative global information technology (IT) professional values. Values and valuing are a compelling component of Bloom's affective domain of learning for engineering education. In helping students develop professional engineering competencies, it is essential that they develop not just cognitive knowledge of something but also values related to that knowledge and the ability to express these values in professional action. However, even if some professional values are identified, understood, and expressed, assessing students' values and valuing is difficult, and assessment instruments are often difficult to develop, particularly for assessing student learning in the context of a particular course. This exploratory study aimed at examining assessment of dispositional knowledge in the context of global software engineering (GSE). It focused on the development and use of a set of instruments for assessing affective domain student learning of global IT/software engineering (SE) professional values. The project included making explicit the IT professional values of interest among the participating faculty in the form of actionable value statements. Following a process derived from Thurstone scale development, the project included validation of these statements with an expert panel as question roots, followed by the use of these questions to investigate student and alumni receiving, responding, and valuing of these professional values. The effort needed to generate questionnaires suitable for course use was relatively low; these questionnaires were deployed to students and alumni from an open-ended global software engineering project course. Students responding reported significant agreement when receiving these global values, but sent more mixed responses in responding to and valuing them. The effort helped identify several actionable IT professional values worth reinforcing in future course offerings.
Published: 2019
Full Text: View/download PDF

42. Development of the International Ocean Literacy Survey: Measuring Knowledge across the World

Author: Fauville, Géraldine, Strang, Craig, Cannady, Matthew A., and Chen, Ying-Fang
Abstract: The Ocean Literacy movement began in the U.S. in the early 2000s, and has recently become an international effort. The focus on marine environmental issues and marine education is increasing, and yet it has been difficult to show progress of the ocean literacy movement, in part, because no widely adopted measurement tool exists. The International Ocean Literacy Survey (IOLS) aims to serve as a community-based measurement tool that allows the comparison of levels of ocean knowledge across time and location. The IOLS has already been subjected to two rounds of field testing. The results from the second testing, presented in this paper, provide evidence that the IOLS is psychometrically valid and reliable, and has a single factor structure across 17 languages and 24 countries. The analyses have also guided the construction of a third improved version that will be further tested in 2018.
Published: 2019
Full Text: View/download PDF

43. Linking and Comparing Short and Full-Length Concept Inventories of Electricity and Magnetism Using Item Response Theory

Author: Xiao, Yang, Fritchman, Joseph C., Bao, Jacqueline Y., Nie, Ying, Han, Jing, Xiong, Jianwen, Xiao, Hua, and Bao, Lei
Abstract: In physics education research (PER), concept inventories (CIs) have become standard instruments for assessing students' learning throughout instruction. To promote widespread use of concept inventories, previous studies have developed an approach to split a full length CI into short versions of CIs. This research extends the existing method to fully utilize the item response theory framework in equating and linking between the short CIs and the full length CIs. Three quantitative studies have been conducted: First, the extended algorithm is applied to divide the Brief Electricity and Magnetism Assessment (BEMA) into two half-length BEMAs (HBEMAs). Through a series of test-equating and validation analyses, the HBEMAs are confirmed to measure the same latent constructs of student understanding of electricity and magnetism as the original BEMA at a similar level of reliability. The second study establishes equivalent score conversions among the three versions of BEMA using the Stocking-Lord method, which has the best performance on equating error reduction among several methods explored. It is also confirmed that the equivalent statistical characteristics of the three versions of BEMA are equity and population invariant. In the third study, the extended algorithm is applied to link and compare the BEMA and the Conceptual Survey of Electricity and Magnetism (CSEM). After linking the BEMA and CSEM assessment scales, it becomes possible to directly convert and compare students' performances on the two CIs. It is found that the scales of BEMA and CSEM are almost identical after scale transformation. Based on these studies, it can be suggested that all short and long versions of BEMA and CSEM can be used interchangeably after scale transformation.
Published: 2019
Full Text: View/download PDF

44. Measuring Relationship Quality in an International Study: Exploratory and Confirmatory Factor Validity

Author: Chonody, Jill M., Gabb, Jacqui, Killian, Mike, and Dunk-West, Priscilla
Abstract: Objective: This study reports on the operationalization and testing of the newly developed Relationship Quality (RQ) scale, designed to assess an individual's perception of his or her RQ in their current partnership. Methods: Data were generated through extended sampling from an original U.K.-based research project, "Enduring Love? Couple relationships in the 21st century." This mixed methods study was designed to investigate how couples experience, understand, and sustain their long-term relationships. This article utilizes the cross-sectional, community sample (N = 8,132) from this combined data set, drawn primarily from the United Kingdom, United States, and Australia. A two-part approach to scale development was employed. An initial 15-item pool was subjected to exploratory factor analysis leading into confirmatory factor analysis using structural equation modeling. Results: The final 9-item scale evidenced convergent construct validity and known-groups validity along with strong reliability. Conclusion: Implications for future research and professional practice are discussed.
Published: 2018
Full Text: View/download PDF

45. Examining Undergraduate Students' Attitudes toward Business Statistics in the United States and China

Author: Wang, Ping, Palocsay, Susan W., Shi, Jinyan, and White, Marion M.
Abstract: The rapid growth of analytics is bringing more attention to quantitative core curriculum requirements in undergraduate business programs. Statistical knowledge and skills are unequivocally recognized as an essential cornerstone of business analytics. Furthermore, educational research has shown that academic performance in statistics classes is related to the attitudes that students bring to the course. This article assesses the reliability and validity of the "Survey of Attitudes toward Statistics" ("SATS") in measuring noncognitive dimensions of attitudes among undergraduate business students. Sample data from U.S. and Chinese introductory business statistics classes were collected and analyzed to learn more about this aspect of student engagement across business schools located in countries with substantially different levels of success in international mathematics achievement testing, as well as differing cultural and educational practices. Results show that the six-factor model structure of the SATS provides a good fit in both populations, with students entering business statistics holding only slightly positive attitudes toward the subject. Significant distinctions between four of the six attitude components were identified. Implications of measuring and improving these attitudes are discussed. Business statistics instructors are encouraged to use the survey as a standardized instrument to measure effects of interventions and make evidence-based pedagogical decisions.
Published: 2018
Full Text: View/download PDF

46. Comparative Study of Middle School Students' Attitudes towards Science: Rasch Analysis of Entire TIMSS 2011 Attitudinal Data for England, Singapore and the U.S.A. as Well as Psychometric Properties of Attitudes Scale

Author: Oon, Pey Tee and Subramaniam, R.
Abstract: We report here on a comparative study of middle school students' attitudes towards science involving three countries: England, Singapore and the U.S.A. Complete attitudinal data sets from TIMSS (Trends in International Mathematics and Science Study) 2011 were used, thus giving a very large sample size (N = 20,246), compared to other studies in the journal literature. The Rasch model was used to analyse the data, and the findings have shed some useful light on not only how the Western and Asian students responded on a comparative basis in the various scales related to attitudes but also on the validity, reliability, and unidimensionality of the attitudes instrument used in TIMSS 2011. There may be a need for TIMSS test developers to consider doing away with negatively phrased items in the attitudes instrument and phrasing these positively as the Rasch framework shows that response bias is associated with these statements.
Published: 2018
Full Text: View/download PDF

47. Game-Like Tablet Assessment of Approaches to Learning: Assessing Mastery Motivation and Executive Functions

Author: Józsa, Krisztián, Barrett, Karen Caplovitz, and Morgan, George A.
Abstract: Introduction: School readiness predicts both school and life success, so measuring it effectively is extremely important. Current school readiness tests focus on pre-academic skills; however, mastery motivation (MM: persistent focus on trying to do a task) and executive functions (EF: planful self-control) are also crucial. Method: The purpose of the paper is to give an overview of a new, computer-based assessment of MM and EF. Results: We have developed a game-like, computer-based assessment, for 3- to 8-year-old children, of MM, EF, and recognition of numbers and letters. The new measures are appropriate for both Hungarian and American cultures. They were engaging for children of this age, and preliminary evidence suggests that they are reliable and valid. Conclusion: The new tasks can be part of assessments of school readiness, and would be useful for school practice as well as researchers. The tasks ascertain the extent to which observed deficits in pre-academic domains are due to MM or EF difficulties. The results will contribute to the development of individualized intervention.
Published: 2017
Full Text: View/download PDF

48. Examination of a Social-Networking Site Activities Scale (SNSAS) Using Rasch Analysis

Author: Alhaythami, Hassan, Karpinski, Aryn, Kirschner, Paul, and Bolden, Edward
Abstract: This study examined the psychometric properties of a social-networking site (SNS) activities scale (SNSAS) using Rasch Analysis. Items were also examined with Rasch Principal Components Analysis (PCA) and Differential Item Functioning (DIF) across groups of university students (i.e., males and females from the United States [US] and Europe; N = 840). Results from this study found psychometric support for the SNSAS to measure endorsement of activities in which university students engage. European males and females were more likely to endorse actively using SNSs compared to their more passive US counterparts. The SNSAS is a reliable and valid tool in a growing research area where few exist. University administrators worldwide can benefit from understanding the patterns of SNS behavior that may impact students' socialization, relationship formation, and academic outcomes.
Published: 2017

49. Organizational Strategic Learning Capability: Exploring the Dimensions

Author: Moon, Hanna, Sejong, Wendy, and Valentine, Tom
Abstract: Purpose: How to build and enhance the strategic learning capability (SLC) of an organization has become crucial to both research and practice. This study was designed to conceptualize SLC by translating and interpreting the related literature to develop empirical dimensions that could be tested and used in a survey instrument. Design/methodology/approach: An instrument was developed to identify empirical dimensions of SLC. The reliability and validity of the instrument were tested. Findings: The resulting survey instrument included 59 items, and 49 remained after empirical testing. Based on responses on a five-point performance scale, SLC items were identified and prioritized, and seven dimensions were discovered: external focus, strategic dialogue, strategic engagement, customer-centric strategy, disciplined imagination, experiential learning and reflective responsiveness. Originality/value: The findings of this study extend the knowledge base of multi-disciplines, including strategy management, organizational learning and strategic human resource development (HRD). This study highlights the conceptualization of SLC and importance of the SLC framework in the field of HRD.
Published: 2017
Full Text: View/download PDF

50. Introducing the Postsecondary Instructional Practices Survey (PIPS): A Concise, Interdisciplinary, and Easy-to-Score Survey

Author: Walter, Emily M., Henderson, Charles R., Beach, Andrea L., and Williams, Cody T.
Abstract: Researchers, administrators, and policy makers need valid and reliable information about teaching practices. The Postsecondary Instructional Practices Survey (PIPS) is designed to measure the instructional practices of postsecondary instructors from any discipline. The PIPS has 24 instructional practice statements and nine demographic questions. Users calculate PIPS scores by an intuitive proportion-based scoring convention. Factor analyses from 72 departments at four institutions (N = 891) support a 2- or 5-factor solution for the PIPS; both models include all 24 instructional practice items and have good model fit statistics. Factors in the 2-factor model include (a) instructor-centered practices, nine items; and (b) student-centered practices, 13 items. Factors in the 5-factor model include (a) student-student interactions, six items; (b) content delivery, four items; (c) formative assessment, five items; (d) student-content engagement, five items; and (e) summative assessment, four items. In this article, we describe our development and validation processes, provide scoring conventions and outputs for results, and describe wider applications of the instrument.
Published: 2016
Full Text: View/download PDF
