Friday, July 31, 2015

Thinking about One Point Rubrics, Standards, and Dimensions

Yesterday (July 30), the UWP's first-year writing teachers got together to discuss the curriculum for our new stretch program (TWRT 120 and 121) and the changes it brings. Autumn quarter will be the first quarter of the new stretch and stretch plus options for first-year students, which you can learn more about on our Web site. We had an engaging set of discussions that ranged over a number of topics, from our priorities around our program goals to the activities we intend to scaffold our classes around.

But our most interesting discussions, for me, were around grading. The new TWRT 120 course is a pass/fail course that uses a grading contract to determine course grades (CR or NC). I won't discuss contracts here, but save that for another day. During our conversation, one of our teachers, Caitlin Carle, offered us a nice blog post she found on what the post author, Jennifer Gonzalez, calls a "single point rubric." The article is on Brilliant or Insane: Education on the Edge, and Gonzalez is an education blogger and former teacher who runs her own web site and podcast for teachers (mostly secondary teachers), called Cult of Pedagogy.

I've used this rubric practice since at least 2003-04, and I published an article on it in Assessing Writing in 2005 ("Community-Based Assessment Practice"), which has drawn many positive comments over the years. The article shows how to build such a rubric with students in a cycle of assessment and reflection that is meant to involve students in every aspect of the assessment of their writing in the college classroom. I've changed my practice a bit since the article, but here I'd like to focus on the technology of rubrics, more specifically, what Gonzalez calls a single point rubric, and how it works with particular kinds of assessment ecologies (see my new book, Antiracist Writing Assessment Ecologies, to learn about assessment ecologies). I'd like to close on a change or shift in how my version of the single point rubric works in the ecology.

Typical Rubrics
Most teachers and students think of rubrics for writing assignments as a list or table of expectations that have a score, ranking, or judgment attached to each dimension. So there might be five dimensions that are often represented by each row in a table, while each column offers a description of writing along each dimension that fits each different judgment or ranking possible. In this way, a teacher might use the rubric to provide individual scores on separate dimensions of the writing submitted, or add scores together to create an overall score or ranking. Below is a typical rubric (from Pittsburgh State University) in table form that offers four judgments, which could easily be ratings or scores (1-4). Each row describes a dimension of the writing that the reader (teacher) will make a judgment on. Such a rubric can be given to students to help explain grades on writing in a general way.

Note that this rubric, if used to explain grades, only explains grades, not how a student is doing (or has done) in a piece of writing. And it offers grade explanations in generalized terms that do not connect them (or the teacher's reading) to evidence in the actual writing. The rubric expects the writer to understand those connections, or demands that the teacher include additional information and feedback. As the sole form of feedback, it simply shows the judgment, but can do very little to justify or argue that judgment to a student (how did the teacher come to it?). Where in the essay is one's support for claims "inadequate"? What is the nature of inadequacy in this paper's use of evidence? What is inadequate evidence exactly? Where is the line between "adequate" and "entirely appropriate"? The rubric cannot answer these important learning questions that students need answered.

If you're using a rubric like the above, and then adding further feedback so students can improve their writing, then you may be doing more than you need to, if your goal is to help students write better (as well as working against your own purposes by assigning a grade/judgment with feedback). Studies have shown how grades negatively affect students' abilities to use feedback on their writing, and harm their motivations. Alfie Kohn has shown these connections quite persuasively, and has many of his articles available on his web site, but for my money, his best argument comes in Punished by Rewards. If your goal is just to measure, then providing feedback with a rubric also may be doing too much. No need for the feedback.

Single Point Rubric
One solution, perhaps, is the single point rubric. Gonzalez describes the single point rubric as follows: 
Instead of detailing all the different ways an assignment deviates from the target, the single-point rubric simply describes the target, using a single column of traits. It’s what you’d find at level 3 on a 4-point scale, the “proficient” column, except now it’s all by itself. On either side of that column, there’s space for the teacher to write feedback about the specific things this student did that either fell short of the target (the left side) or surpassed it (the right).
So a single point rubric simply defines or describes the second column from the left in the above Pittsburgh State rubric. This gets rid of the other distinctions (judgments), and focuses the teacher's and the student's attention on the teacher's feedback in each dimension. It can also be easier for students to figure out and use because it is simpler. Her example below makes it clear how a teacher might use such a rubric with students. 

Gonzalez' Single Point Rubric

Only having to make a single, binary judgment on writing saves time because a teacher doesn't have to agonize over overly fine distinctions that become more dubious or arbitrary as the distinctions multiply. This problem is clearest when one must explain the difference between an essay that "earns" an 88 as opposed to an 87 or an 89, or even a 90. The single point rubric gets rid of most of those judgments, which are summative, and focuses on formative judgments, the comments and feedback. In some significant ways, this rubric offers teachers who still use grades a way to simplify the assessment process and help students focus on feedback. And I think most, if not all, writing teachers would agree: we want our students to focus on our feedback, not grades.

From Standards to Dimensions
A single point rubric is much easier to create with your students because the question behind any rubric-generating activity is: what does proficient mean for us? This is like asking students to define or talk about what makes "good writing," which is always a good discussion to have with them. But of course, being able to identify and articulate what good writing is and practicing such good writing are two different things. In part, they are different because it's not the writer who primarily determines what is good writing, it is the reader, which in most classrooms is the teacher. So even with a good use of a single point rubric, the assessment ecology of the classroom will still bend toward the teacher as standard maker and standard bearer. Students continue to play the "how do I please the teacher" game, or the "give her what she wants" game. To avoid this, I have used writing dimensions, instead of standards, in my single point rubrics. So they are rubrics that do not identify "meets expectations" as much as they are ones that identify the dimensions of writing we are exploring and trying to understand. And the only way to reach that understanding is to get observations (feedback) from multiple readers, colleagues and the teacher.

So I shift the rubric from standards to dimensions. Rubric building activities I engage in with students ask this question: What dimensions of writing do we want to work on or improve? Notice, I'm changing the focus of the above rubrics (both) from describing what is proficient or meeting expectations to a dimension of writing that we can argue and disagree about, one that we should and inevitably will disagree about, if we are human. Let's take that "Development" row in the Pittsburgh State rubric above. Gonzalez' single point rubric would use the "meets expectations" column as the defining point in one row of her rubric, which would then help her as the teacher write about her "concerns" and "evidence of exceeding standards" for that dimension. Again, notice that her rubric focuses on pleasing the teacher, and only acknowledges her judgments.

In my single point rubric, the point identified is not a standard but a dimension of writing, a question about the writing, in a sense. So here's how their standard and my dimension might look next to each other:

  • Their Standard: Evidence and reasoning are adequate to support claims. The assignment is complete. 
  • My Dimension: How do evidence and reasoning support claims adequately? How complete is the draft?

Notice that the dimension encourages readers (judges) to explain their observations and demands that multiple readers read and provide observations. It also does not assume that there is a standard by which we can judge or rank any dimension of writing. Sally's essay and Jose's are simply different instances of discourse, and so should be responded to on their own terms. That's what we focus on.

What's the advantage of mine over other rubrics? I think the biggest is that it doesn't penalize subaltern discourses, multilingual students, students of color, or working class students who come to our classrooms with discourses that do not match well with the academic ones that tend to be a part of the standards on all rubrics (except mine). In fact, it uses them to create discussion, disagreement, and productive dissonance in the reading of student writing. So does this mean that my dimension-based rubric is not teaching some standardized version of English? No, we cannot avoid that to some degree, but we can be more critical and conscious of that standardizing in our judgments and rubrics as only one perspective, one reading, which I believe a dimension-based single point rubric does. Ultimately, focusing on dimensions in our rubrics and not standards moves our classroom assessment ecologies toward antiracist ends.

Saturday, July 11, 2015

Can Specifications Grading Be Used for Writing?

Recently, I read Linda B. Nilson's Specifications Grading: Restoring Rigor, Motivating Students, and Saving Faculty Time (2015). It is pitched to a broad set of university courses and faculty, and not directly about assessing or grading writing or writing courses; however, she does use several examples of assessing writing, and she includes writing courses in the larger purposes she has for the book. I don't intend this post to be a full review of the book, but because the book does speak directly to faculty who wish to improve the way they teach and evaluate writing in their courses, I wish to offer a limited response to the book, and an encouragement to go get the book and read it, despite a few reservations I have about some things in it.

Some Things Worth Thinking About

There are several valuable things any writing teacher (or teacher who assigns and must assess writing in her courses) can take from this book. The first is her convincing argument in chapter 1 against the conventional grading system (the point-based, letter-based, or percentage-based systems used by most today). She shows how grading is only a recent historical phenomenon in universities, and provides some of the research that reveals the deep problems with grades and learning. While I don't agree completely with all of her criteria for evaluating grading systems at the end of this first chapter, many are good ones. Second, her argument for linking grades, either individual ones or course grades, to outcomes that define courses and programs is important. Chapter 2 offers some clear, simple ways to create outcomes, while chapter 3 provides ways to link outcomes to the grades we give students in courses. Lots of examples make clear various ways to implement her ideas. I think these chapters would be most instructive to a wide range of teachers looking to revise their assessment practices around writing. While she doesn't say it directly, it's clear that Nilson sees good teaching centering on good assessment practices, which I have advocated for on numerous occasions over the last 10 years.

Nilson also provides a nice rundown of the research on motivation, including three of the most common theories studied in the research: goal orientation theory, self-determination theory, and goal-setting theory. She uses these theories in chapter 8 to explain how specs grading motivates students, or can motivate them to do good work.

The Heart: Specs Grading

The heart of her book is specifications grading (or specs grading), which is closely wedded to the outcomes of a course. At the risk of over-simplifying the system, Nilson argues that specs grading can reduce the kinds and number of judgments teachers make on assignments so that grades are fairer, students strive after learning, and teachers spend less time grading. Using a track and field metaphor, she offers two kinds of judgments that teachers might use to produce grades on assignments, which help define specs grading:
In operational terms, students receive grades based on the number of work requirements and/or the specific work requirements they complete at a satisfactory level by given due dates. In other words, students earn higher grades by jumping more hurdles that show evidence of more learning (i.e. mastery of a greater amount or breadth of knowledge or a greater number of skills on the same level) and/or jumping higher hurdles that show evidence of more advanced learning, or both. While the idea of more hurdles is a relatively simple matter of quantity of work, the notion of higher hurdles reflects the more complex matter of the greater cognitive sophistication or challenge of the work . . . (p. 25)
The key in specs grading is those binary judgments on assignments. In specs grading systems, the practice of grading writing, then, is binary: a student has done an assignment to a satisfactory level or not, so she gets credit or not. Course grades are calculated by adding up how many assignments are completed, or which ones are completed (if there are some outcomes/assignments that are harder to accomplish but necessary).
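To make that arithmetic concrete, here is a minimal sketch in Python of how binary judgments might add up to a course grade. It is only an illustration of the "more hurdles" and "higher hurdles" idea, not Nilson's own scheme: the Assignment class, the advanced flag, and every threshold are hypothetical and invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    name: str
    satisfactory: bool      # the single binary judgment: met the specs or not
    advanced: bool = False  # hypothetical marker for a "higher hurdle" assignment


def course_grade(assignments):
    """Turn credit/no-credit judgments into a course grade.

    All thresholds here are made up for illustration; a real course
    would set its own bundles of required work.
    """
    completed = [a for a in assignments if a.satisfactory]
    hurdles = len(completed)                                 # how many were cleared
    higher_hurdles = sum(1 for a in completed if a.advanced)  # which ones were cleared

    if hurdles >= 8 and higher_hurdles >= 2:
        return "A"
    if hurdles >= 8:
        return "B"
    if hurdles >= 6:
        return "C"
    return "NC"


# Six ordinary essays plus two harder projects, all judged satisfactory.
work = [Assignment(f"essay {i}", satisfactory=True) for i in range(1, 7)]
work += [
    Assignment("research project", satisfactory=True, advanced=True),
    Assignment("portfolio", satisfactory=True, advanced=True),
]
print(course_grade(work))
```

Notice that nothing in this sketch ranks one satisfactory essay against another. The only judgment per assignment is yes or no, and the grade comes from counting.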

There are two kinds of judgments in any grading system: those about quantity and those about quality.
Despite the fact that she too casually disregards contract grading, and does not really look closely at how various versions can work very similarly to, if not exactly like, her own system, Nilson's specs grading has some strengths that any teacher should pay attention to. I've already mentioned one, linking one's grading of any writing assignment in a course to outcomes, since most schools use outcomes to maintain accreditation and assess their programs' effectiveness, but this can be a problem (as I explain below). Nilson also raises important questions about how our grading systems should do more than measure and rank students. The binary judgments resist ranking, helping students focus on more important aspects of the assessment, like the feedback a teacher gives that actually says things about the student's learning or writing. She also claims that specs grading can help students take control of their educations and gain agency and even intellectual curiosity (mostly by avoiding a focus on grades). These are hard things to do when your grading practice mostly measures and ranks through numbers and letters.

My Response: Questions of Fairness

While I find much to value in her system, I have one concern that the book never addresses. At every turn, Nilson ignores the very real concern of what it means to use outcomes to develop binary rubrics that then are used to measure the quality of the writing assignments of racially, linguistically, and culturally diverse students. Even if these judgments are binary (acceptable or not acceptable), they still hold A STANDARD, which is a problem when that standard is so closely linked to a white, middle class social and discursive formation. Assessing writing is particularly riddled with this problem. Nilson never addresses how her assessment system will help students of color or multilingual students, how her system, by only valuing and maintaining conventional outcomes, can be unfair to many students of color, multilingual students, and working class students. She never even mentions students by these designations, never uses, to my knowledge, the word "race," for instance. I'm sure she has good intentions -- it's clear throughout the book that she cares about students and teachers -- but I'm more concerned about the effects of her system on students of color and their mostly white teachers (at least in writing courses, but I doubt the demographics are much different in other departments).

What she's arguing is that every student is measured by the same standard, but in writing classrooms, or in the assessment of writing in any class, this means everyone is measured by a single standard, which usually is a white, middle class standard. She never questions the idea that outcomes funnel, not broaden, what teachers expect to get and can end up valuing in writing (to be fair, writing is not the only kind of course she's thinking about). The bias in the specs grading system might be stated in this way: when we evaluate student writing, we must expect something particular (the outcome), then give credit only for that kind of response. There is little tolerance for the unexpected. When one uses outcomes to grade or evaluate student writing, teachers have to imagine the ideal paper first, then articulate it (if they offer that to students), then expect and look for the outcome in actual student work. We read for that outcome, ignoring everything else. And often, everything else is read as deficient, lacking. The bottom line is: outcomes have a hard time anticipating diverse student writers, and in fact, usually punish them for out-of-the-box thinking and writing -- outcomes too easily punish students of color for NOT being white students with white discourses.

Students of Color Start Farther Back on the Race Track

This means many students of color, multilingual, and working class students with different discourses than the standardized ones we expect in writing assignments start farther back on the race track (to use her track and field metaphor). Specs graded students are not running the same race. Some must run 800 meters in the same time as those who run a 100 meter dash. Some must jump hurdles while others run a clear track. This calls into question the fairness of specs grading because specs grading doesn't account for who students are when they enter our courses, or what they already know and how what they bring might be just as good as what we initially expected. The system assumes that all students have equal access to the practices that lead to mastery of the outcomes in question, or simply disregards how many students do not come to us with the same writing competencies and practices.

Nilson does offer one section in chapter 6, "Fostering the High Achievement of At-Risk Students" (two pages), that seems to cover helping students of color. However, I cannot help but hear in her discussion a Kipling-esque, white-(wo)man's-burden, maternalistic stance toward students of color -- a kind of "I know what's best for these poor, unfortunate and misguided students" stance that she doesn't hold for "prepared" students, who I can only assume are white, middle class students. But there's more to the potential unfairness. Her maternalistic stance comes through in how she references students of color in the section, which reveals to me specs grading's inability to address racially and linguistically diverse students in writing classes. In the opening of the section, she says:

At the same time, we should encourage our students to strive for an A -- in particular, underprepared and first-generation college students and students from underrepresented groups. These individuals are likely to underestimate their abilities and may lack the confidence to aim for high achievement. Furthermore, some of them may find college a strange and disorienting place, one that presents decision points they do not fully understand. If they opt for a lower grade and we do nothing, we are unwittingly reinforcing their misconceived fears that success is beyond their reach and that they really do not belong in college. Telling them personally that you believe in them and their capabilities may buoy their ambitions enough to make a huge difference in their academic success (Gabriel, 2008). (p. 75)
Judging all student writing blindly (by the same standard) is not inherently fair when one accounts for who students are and where they come from.
In one sense, she has it right. Many students of color are "underprepared" for college, because college is a white institution, so why would an African American male student from the South Side of Chicago be prepared for a culturally, racially, and linguistically white place like a university? But the terms she uses to reference students of color or working class students elide their actual identities, elide the kinds of assessment and judgment issues teachers face when assessing their writing. It is not simply that they are not individually prepared. This would assume some, if not a lot, of personal responsibility for one's lack of preparedness. A student prepares for his studies, but realistically that South Side, African American student IS PREPARED BY a segregated school system and by his parents and church and friends and neighborhood, by our society at large, by the images he has experienced on TV, the internet, and elsewhere. His discourse and stance toward society and the Standardized discourses of school are shaped in different ways than what I think any conventional outcome in a college writing classroom (or other classroom) may expect of him. Preparedness for college is not a simple issue, and it's not confined to the psychological, or even the pedagogical.

And so, fairness is not solved by individualized assessment practices, which Nilson seems to assume. It is also about the social, the contextual, and the structural. How do we address the fact that our outcomes for writing usually benefit -- privilege -- our white, middle class students most, giving them a shorter track to run than our students of color or working class students or multilingual students? As I see it, this question of fairness and social justice is about the nature of outcomes as white, something Nilson never questions.

Furthermore, Nilson primarily defines students of color (or the "underprepared") as lacking, as individuals, separated from the larger social structures of schools and society that might help make sense of their positions in school, and their academic successes or failures. Students of color are defined as fearful and misguided. Surely, there are students who fit this profile, and I'm not saying we should not think about the individual psychological health of our students and how it relates to their learning. I'm saying these things are small aspects of the unfairness in grading systems in college writing courses. Simply patting a Latina student on the back and encouraging her is not going to make one's writing assessment practices fairer, nor will it help that student come to terms with the way assessment works on her in college and elsewhere. In fact, it tacitly blames her for her failures, for her "underpreparedness."

The strategy also doesn't question the system itself, the system that says only these outcomes matter, no matter who the students are. Specs grading, in this way, is flawed in that it has no way to be critical of assessment itself, or the values and outcomes promoted in any given assessment of student writing.

Final Words
Let me reiterate that I'm being a little unfair to Nilson, as her purpose is to speak to more than simply writing courses and assignments, but she does use a lot of writing assignment examples. In my estimation, it would be a mistake for any teacher to use specs grading to grade writing in any class without first thinking very carefully about how to make such a system fair for diverse students and critical toward the very outcomes we promote in our courses. Don't misunderstand me. The book is worth reading, and many teachers will get a lot from the book. It's smart in many ways and raises important questions about how we grade (and why) that all teachers from all disciplines should continually ask themselves.


Asao B. Inoue

Director of University Writing
University of Washington Tacoma