Assessing Scientific Practices Using Machine-Learning Methods: How Closely Do They Match Clinical Interview Performance?

Authors :: William J. Boone
Elizabeth P. Beggrow
Ross H. Nehm
Minsu Ha
Dennis K. Pearl
Source :: Journal of Science Education and Technology. 23:160-182
Publication Year :: 2013
Publisher :: Springer Science and Business Media LLC, 2013.
Abstract: The landscape of science education is being transformed by the new Framework for Science Education (National Research Council, A framework for K-12 science education: practices, crosscutting concepts, and core ideas. The National Academies Press, Washington, DC, 2012), which emphasizes the centrality of scientific practices, such as explanation, argumentation, and communication, in science teaching, learning, and assessment. A major challenge facing the field of science education is developing assessment tools that are capable of validly and efficiently evaluating these practices. Our study examined the efficacy of a free, open-source machine-learning tool for evaluating the quality of students' written explanations of the causes of evolutionary change relative to three other approaches: (1) human-scored written explanations, (2) a multiple-choice test, and (3) clinical oral interviews. A large sample of undergraduates (n = 104) exposed to varying amounts of evolution content completed all three assessments: a clinical oral interview, a written open-response assessment, and a multiple-choice test. Rasch analysis was used to compute linear person measures and linear item measures on a single logit scale. We found that the multiple-choice test displayed poor person and item fit (mean square outfit >1.3), while both oral interview measures and computer-generated written response measures exhibited acceptable fit (average mean square outfit for interview: person 0.97, item 0.97; computer: person 1.03, item 1.06). Multiple-choice test measures were more weakly associated with interview measures (r = 0.35) than the computer-scored explanation measures (r = 0.63).
Overall, Rasch analysis indicated that computer-scored written explanation measures (1) have the strongest correspondence to oral interview measures; (2) are capable of capturing students' normative scientific and naive ideas as accurately as human-scored explanations; and (3) more validly detect understanding than the multiple-choice assessment. These findings demonstrate the great potential of machine-learning tools for assessing key scientific practices highlighted in the new Framework for Science Education.
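
Note on the fit statistic: the abstract reports mean square outfit values, which are expected to be near 1.0 when responses behave as the Rasch model predicts. The following is a minimal sketch of how such a value can be computed for one item, assuming the simplest dichotomous Rasch model. This record does not state which Rasch variant or software the study used, so the function names and the sample data below are hypothetical and for illustration only.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct (1) response under the dichotomous Rasch model.

    theta is the person measure and b the item measure, both in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def outfit_mean_square(responses, thetas, b):
    """Unweighted mean of squared standardized residuals for one item."""
    total = 0.0
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)
        # Squared residual divided by the model variance p * (1 - p).
        total += (x - p) ** 2 / (p * (1.0 - p))
    return total / len(responses)


# Hypothetical person measures (logits) and scored responses to one item.
thetas = [-1.0, 0.0, 1.0, 2.0]
responses = [0, 0, 1, 1]
print(outfit_mean_square(responses, thetas, b=0.5))
```

Because each squared residual is divided by its model variance, unexpected responses from persons far from the item's difficulty inflate the statistic, which is why outfit is described as sensitive to outliers.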

Subjects :: Rasch model
General Engineering
Educational technology
Science education
Education
Test (assessment)
Goodness of fit
Scale (social sciences)
Educational assessment
Item response theory
Mathematics education
Psychology

Details

ISSN :: 1573-1839 and 1059-0145
Volume :: 23
Database :: OpenAIRE
Journal :: Journal of Science Education and Technology
Accession number :: edsair.doi...........845dffc01777f7f6251b89ad687a4e2d
