An Overview of Three Approaches to Scoring Written Essays by Computer. ERIC Digest.
- Author
- Rudner, Lawrence; Gagne, Phill (ERIC Clearinghouse on Assessment and Evaluation, College Park, MD)
- Abstract
This digest describes the three most prominent approaches to essay scoring by computer: (1) Project Essay Grade (PEG), introduced by E. Page in 1966; (2) Intelligent Essay Assessor (IEA), introduced for essay grading in 1997 by T. Landauer and P. Foltz; and (3) e-rater, used by the Educational Testing Service and developed by J. Burstein. PEG grades essays primarily on the basis of writing quality. The underlying theory is that a person's writing style has intrinsic qualities, called "trins," that need to be measured, analogous to true scores in measurement theory. PEG uses approximations of these variables, called "proxes," to measure these traits. More than 30 years of research shows consistently high correlations between PEG and human raters. IEA identifies which of several calibration documents is most similar to a new document. For essays, the average grade on the most similar calibration documents is assigned as the computer-generated score: a similarity score is calculated between the essay's column vector and each column of the calibration (rubric) matrix. As with PEG, high correlations with human raters have been found. E-rater is a "Hybrid Feature Technology" that uses syntactic variety, discourse structure (like PEG), and content analysis (like IEA). Several studies have reported favorably on these approaches. Compared to IEA and e-rater, PEG has the advantage of being conceptually simpler and less taxing on computer resources. All of these systems are proprietary, and details of the exact processes are not generally available. One should not expect perfect accuracy from any automated scoring approach, but these technologies might serve as validation tools, with each essay scored by one human rater and the computer. (SLD)
- Published
- 2001
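
The IEA scoring idea summarized in the abstract can be sketched in a few lines. The systems described are proprietary, and IEA in particular relies on latent semantic analysis over a reduced term-document matrix; the sketch below substitutes plain bag-of-words term counts for LSA vectors and is purely illustrative. All function names (`vectorize`, `cosine`, `score_essay`) and the sample calibration data are hypothetical, not from the original systems.

```python
from collections import Counter
from math import sqrt

def vectorize(text):
    # Illustrative bag-of-words term counts; the real IEA uses
    # LSA-reduced document vectors, which are not reproduced here.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def score_essay(essay, calibration, k=1):
    # calibration: list of (text, human_grade) pairs.
    # Rank calibration documents by similarity to the new essay
    # and return the average human grade of the top-k matches.
    vec = vectorize(essay)
    ranked = sorted(calibration,
                    key=lambda c: cosine(vec, vectorize(c[0])),
                    reverse=True)
    top = ranked[:k]
    return sum(grade for _, grade in top) / len(top)
```

With a toy calibration set, an ungraded essay inherits the grade of its nearest neighbor, e.g. `score_essay("a cat on a mat", [("the cat sat on the mat", 2.0), ("photosynthesis converts light energy", 5.0)])` returns `2.0`. The choice of `k` trades off noise in a single nearest match against blurring across dissimilar calibration essays.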