Common structural design features of rubrics that may represent a threat to validity

Humphry, S. M., & Heldsinger, S. A. (2014). Common structural design features of rubrics that may represent a threat to validity. Educational Researcher, 43(5), 253-263.


Here is a news flash for all my fellow rubric-makers out there: we may be doing them wrong. Humphry and Heldsinger challenge the typical matrix structure of the rubrics most of us make and use to assess performance outcomes, and they present convincing evidence that rubrics structured as matrices (that is, a set of several criteria, all rated on the same number of scoring levels, usually three or four) have serious construct validity issues.

Most of the rubrics we see used today look like the familiar 3 x 3, 4 x 4, or perhaps 4 x 3 matrix; the dimensions vary, but all criteria share the same number of levels. Humphry and Heldsinger maintain that such structures lead scorers to rate performances in patterns that cluster learners into only a few categories, and that these patterns indicate a “halo effect” in which a rater’s overall impression dominates the whole scoring process. When you get the halo effect, you get patterns that cluster learners together and provide what these authors call “coarse-grained information” (p. 260), which cannot capture a complex construct very well and limits the usefulness of the assessment results.

To test their ideas, the authors revised a writing rubric used with Australian children. First they removed some of the redundancy across the rating criteria, but that did little to weaken the strong rating patterns. Only when they deliberately used different numbers of scoring levels across criteria did the patterns stop occurring and more useful, fine-grained analysis become possible. The new rubric no longer looked like a matrix, and all of the intuitive, boxy symmetry that we rubric-makers have known and loved was gone, but so were the frustrating scoring patterns that many of us have scratched our heads over, wondering why our supposedly rich, qualitative rubrics really weren’t telling us very much. We have struggled with this phenomenon in my own school of education: we have several rubrics that we use for key formative and summative assessments, but they don’t tell us as much about learner outcomes as all the effort we put into making and using them would seem to promise.

I for one can let go of the symmetry of the matrix if it means I will be better able to capture learner performance; after all, that is what assessment is all about. We use rubrics to capture complex learning processes, so it follows that the instrument may need to be a bit more complex than we had thought in order to truly capture that complexity. I’ve used rubrics for a long time, and I’ve taught others to do so. Most of the time we have made them as matrices, but I must admit that I’ve always harbored doubts about them. Matrix-style rubrics imply that all of the criteria are of equal importance and can be scored the same way, yet I’ve looked at some of my own rubrics and wondered whether that was really the case. I’ve tried adding differential weighting to the different criteria, which helped a bit, but my rubrics still weren’t capturing everything they could, and they weren’t always differentiating well between real levels of learning.
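The weighting approach I tried can be made concrete with a quick sketch. The criterion names and weights below are invented for illustration; they are not from the article or from my actual rubrics:

```python
# Hypothetical matrix-style rubric: every criterion is rated on the
# same four levels (0-3), but each criterion carries a different weight.
weights = {"ideas": 2, "organization": 1, "conventions": 1}

def weighted_score(ratings, weights):
    """Sum each criterion's level rating multiplied by its weight."""
    return sum(weights[c] * ratings[c] for c in weights)

# A learner rated 3 on ideas, 2 on organization, 1 on conventions:
ratings = {"ideas": 3, "organization": 2, "conventions": 1}
print(weighted_score(ratings, weights))  # prints 9 (2*3 + 1*2 + 1*1)
```

Notice that weighting only changes how much each criterion counts toward the total; every criterion still has the same number of levels, which is exactly the structural feature the authors question.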

This article gives me permission to let go of the matrix and build independent scoring levels for each criterion that fit and make sense. If one criterion has six or seven levels and another has two or three, then so be it. The reason for assessment is to capture learning and gain useful information, not to create something that looks orderly and elegant but does not do what assessment needs to do. Let’s bring on the more “messy” rubrics!

I know I have colleagues who will resist this notion. The order and convenience of the little boxes in their symmetrical rows and columns remain seductive, and seem more intuitive than what these authors propose. I’m ready to try it, though. One thing that would have helped is an example of a rubric with a different number of levels for each criterion. If I could see one, I’d have an easier time wrapping my own mind around it, and showing something concrete would help convince colleagues and students. I’ll definitely be looking for examples, and I plan to try this new kind of rubric soon myself and see what happens.
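In the meantime, here is a rough sketch of what such a non-matrix rubric might look like if written down as a data structure. The criteria and level descriptors are entirely my own inventions for illustration; they are not Humphry and Heldsinger's revised instrument:

```python
# Hypothetical non-matrix rubric: each criterion defines its own ordered
# list of level descriptors, so the number of levels can differ freely.
rubric = {
    "vocabulary": [                      # 3 levels
        "simple words only",
        "some varied word choice",
        "precise, varied vocabulary",
    ],
    "sentence structure": [              # 5 levels
        "fragments",
        "simple sentences",
        "compound sentences",
        "complex sentences",
        "varied, controlled structures",
    ],
    "punctuation": [                     # 2 levels
        "absent or random",
        "mostly correct",
    ],
}

def describe(ratings, rubric):
    """Check each rating against its own criterion's range and
    return the matching level descriptor for each criterion."""
    out = {}
    for criterion, level in ratings.items():
        levels = rubric[criterion]
        if not 0 <= level < len(levels):
            raise ValueError(f"{criterion}: level {level} is out of range")
        out[criterion] = levels[level]
    return out

result = describe(
    {"vocabulary": 2, "sentence structure": 4, "punctuation": 1}, rubric
)
```

The point is simply that nothing in the scoring logic requires every criterion to share the same level count; the symmetry of the matrix is a presentation choice, not a necessity.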
