We made some changes this year. We massively reduced the frequency of summative grading and increased the amount of action-oriented feedback. To achieve this, we moved away from traditional % grading to a standards-based model.
What is the theory behind the changes?
Our major change was to stop writing grades on students' work and to use a 1-4 mastery scale instead. We took away the "live" grade book and spent less time grading and more time teaching. In the US system (for those reading in the UK), % grades are given every week, with a certain % for HW, a certain % for class participation and so on. The grades are often updated every week and, of course, the top kids are always competing for that 100%. Sometimes they can even go beyond 100%. What's the problem?
To clarify what I mean by "grade": I mean something that "counts" - something with "stakes". Of course, if a kid got 7 answers right you can tell him he got them right. To not do so would be madness. The point is, how does he get those remaining 3 right next time? The problem with "grades" is that quality feedback is an inexpensive and highly effective way of improving learning, but grades are rarely good feedback. If you improve the quality of feedback, you improve learning. Dylan Wiliam, the guru of assessment for learning, makes the case for changing our perspective on grading. I would summarise this as the power of using assessment as a method for understanding what the kid knows and then teaching "responsively". This gets to the heart of assessment for learning. Added to this, I am going to mention two pieces of very widely known and commented-upon research that informed our decision to focus on learning for the sake of learning.
1) The importance of self-definition and self-efficacy - what I believe about my own ability matters
2) The potentially negative impact of summative grading on learning
Students' predictions about how well they are going to do are often accurate. If I identify as a 75% type kid, then I will most likely waver around the 75% mark. Here we have a classic chicken-and-egg situation. I have looked at grades in a number of schools that use a LOT of summative information, and it turns out that grades in 2nd or 3rd grade are generally pretty good indicators of grades in 10th and 11th grade. Unsurprising, perhaps, but the important thing is that "grades" "predict" as well as "reflect" how good a kid is at school. The causality goes both ways: the kid causes their grade, but the grade also causes future grades. If you are told you are a low achiever, that's hard to shake off. It's hard to become something different. For high achievers - someone tells me I am good at school, that motivates me, I put more effort in, I do better. But the idea that grades reflect *just* hard work and effort is false. Being told (because of low fluid intelligence or WM capacity) that I am a bad student will lead to less effort and lower achievement. This is one major reason for dropping the near-permanent reminders of somebody's "position" in relation to somebody else - the % and the live grade book - and replacing them with something more useful and meaningful: qualitative, action-oriented feedback (often delivered to the whole class).
The next reason for the change is that summative grades aren't a good form of feedback. To paraphrase Dylan Wiliam again, a % grade contains no information about how to improve. It might include an unsubtle message of "you did well or badly", but it doesn't in itself give me the information I need to perform better on a similar task in the future. It will, by its nature, involve the ego and my sense of who I am as a person, and will either reinforce (or not) my self-efficacy as a student. We took the opportunity to reduce the quantity of these messages.
Furthermore, when receiving a % grade that counts as part of feedback, the focus of the student is "how well they did". Considering that feedback to students is one thing almost all research agrees is incredibly important, why would we risk the power of this feedback by grading it? In the moment of returning work to students, what they are thinking about is the most crucial thing. If they are fixated on grades, less learning will occur. Another one of Wiliam's principles - emphasised by Tom Sherrington - is that feedback should be more work for the receiver than the "giver". In other words, good feedback is all about actions which produce more learning. If I am thinking about how well I did, those actions will be less fruitful and produce less learning.
We should also consider motivation. If we award grades based on a range of assessment evidence, not one performance, what will motivate students to put the effort into each piece of work? If I know that this won't "count" and I have another opportunity, why would I bother giving it my best shot in the first place? Herein lies the challenge. If the purpose of assessment is to see how much a kid knows and teach accordingly, then I want this assessment to be accurate. One way assessment loses accuracy is if the kid crams right before the test and then forgets the next day. So they might have put in a load of effort, the % might reflect this and they "performed", but I cannot be sure that this knowledge is actually built into an interconnected schema in long-term memory (LTM).
Under a mastery model, assessment is all about understanding where everyone is at in order to teach them the things they don't know. In some ways, then, assessments in class - quizzes and the like - should be unannounced. This gives a fairer understanding of effort in class and the learning that has taken place. What matters is what they know and are able to do as a result of your teaching, not how hard they studied at home. My strong assertion is that the amount of effort isn't actually that important. If the kid knows the answer, he'll write it; if he doesn't, he'll try to get it right and give up when he realises he cannot do it. I don't think that knowing there is a grade attached to work would significantly increase effort. In fact, I could even make an argument that the anxiety caused by having to perform for a grade would, in fact, reduce the quality of the response.
There is also a philosophical problem with GPA. GPA is calculated throughout the year in an aggregated way. It should reflect objectively how much knowledge has been acquired as the year goes on. Here's the thing: say a kid has 75% from the first quarter, and we see he has a couple of 0s for missed homework. How can we know if the knowledge and skill are still "there" at the end of the year? A solid curriculum should interleave; we should return to fundamental concepts, ideas and skills as the year progresses. Fractions are not "quarter 1 and forget about them". So if I got 90% in Quarter 1 on fractions and then 98% on fractions at the end of the year, why is it fair or accurate to "average" those two grades? The purpose of mastery is that we take the temperature of learning at a certain point, not that we consider it "done" because the kids studied for an assessment and performed.
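The arithmetic objection above can be made concrete. This is a purely illustrative sketch with made-up numbers (the 90/98 fractions example from the paragraph): a GPA-style average understates what the student can actually do by the end of the year, whereas a mastery reading takes the latest evidence.

```python
# Illustrative only: hypothetical scores for one student on one standard.
q1_fractions = 90    # Quarter 1 score on fractions
eoy_fractions = 98   # end-of-year score on the same standard

# A GPA-style aggregate averages the two performances...
averaged = (q1_fractions + eoy_fractions) / 2   # 94.0

# ...while a mastery reading takes the most recent evidence.
latest = eoy_fractions                          # 98

print(f"Averaged GPA-style grade: {averaged}")
print(f"Latest-evidence mastery grade: {latest}")
```

The averaged figure permanently penalises the student for where they started, which is exactly the "quarter 1 and forget about them" problem.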
What changes did we actually make to overcome these problems?
Here's what we did. Firstly, we moved to a 1-4 scale (developing to mastery). We then sat down and prioritised all the things we wanted students to learn over the course of the year and expressed those as objectives. These became our "power standards": the crucial, foundational knowledge and skills we wanted ALL students to have completely mastered before the year was out. These standards were articulated in each subject and became the foundation of our report cards.
I was very explicit and told teachers not to write numbers 1 - 4 on pieces of work at any point. This is because assessments are evidence of progress against the standard. You cannot tell if someone has mastered any standard by looking at any one piece of work. The 1 - 4 should be judged using a rubric and reported at the end of each quarter based on a range of different assessment evidence. The student's performance - their grade - should be inferred from several pieces of evidence and the information should be given to each kid before the report card goes home. Naturally, as students reach the end of a quarter they will have "bigger" assessments which contain more of the information used to derive the summative judgement. However, all of this information should also be used as information to guide further teaching. All assessments, including the bigger ones, are formative. There is no expected "consistent" format for either the number or type of assessments given by the teacher. Students should be assessed throughout the quarter and then teaching should happen responsively based on the evidence. The idea that you can "fail" is almost completely eliminated. This makes sense as we know that holding kids back for a year almost always damages learning.
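The post doesn't prescribe an algorithm for turning several pieces of evidence into a single 1-4 judgement; that remains a rubric-guided teacher decision. Still, one hypothetical way to sketch the idea of "inferred from several pieces of evidence, with the bigger end-of-quarter assessments carrying more of the information" is a weighted summary like the following. The weighting scheme here is entirely my own assumption, not the school's procedure.

```python
# Hypothetical sketch: infer a 1-4 quarter grade from several pieces of
# rubric-scored evidence. The double weight for "major" end-of-quarter
# assessments is an assumption for illustration, not the school's rule.
def infer_quarter_grade(evidence):
    """evidence: list of (rubric_score, is_major_assessment) tuples,
    where rubric_score is on the 1-4 mastery scale."""
    total = weight_sum = 0
    for score, is_major in evidence:
        w = 2 if is_major else 1   # bigger assessments carry more information
        total += score * w
        weight_sum += w
    return round(total / weight_sum)

# Three in-quarter quizzes scored 2, 3, 3 and a major assessment scored 3:
print(infer_quarter_grade([(2, False), (3, False), (3, False), (3, True)]))
```

The point of the sketch is simply that no single piece of work decides the grade; any one quiz can be an off day, and the judgement comes from the pattern across the quarter.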
We ran into a problem: if we are prioritising a standard and interleaving the knowledge required to master it throughout the year (think of the standard as a description of one important part of a domain), how can we say a student has "mastered" it when we are only a quarter of the way through the year? To have mastered something - to be able to do it independently, without help - we need to return to the knowledge and skills throughout the year. So we invented an "expectation" number. It ends up looking like the table below; the right-hand column is the teacher's expected level of mastery at each stage.
I can feel some of you shudder, thinking WHAT... that's so many numbers... it is. However, the end-of-quarter assessments can include 3 or 4 of these standards simultaneously. There are rubrics for each standard, and they express what a student must know and be able to do for mastery. The 4 - mastery - is the "expected" level at the end of the year (or before, if the teacher has finished teaching that standard). Rubrics are used to define the grade reported in the grade books but never the feedback given to the kid. Daisy Christodoulou gives a thorough dismantling of rubrics for formative purposes in "Making Good Progress".
The rubric we use contains details of what kids need to know for Mastery, Proficiency, Developing and Emerging. So there are only four "data drops" spaced throughout the year. That's four "intensive" periods of grading a year. US schools normally grade almost weekly.
So how did we avoid the problem of kids getting "lower" grades at the beginning of the year and then doing better later on? In the US, students are expected to have a high GPA throughout in order to get access to the top schools. We solve this by calculating the grade with respect to the teacher's expectation. So if the teacher expects a 3 and the kid gets a 4, that equals 100%; if the teacher expects a 3 and the kid gets a 3, that is 90%. There are two pieces of information contained in each quarterly grade: one is the overall level of mastery shown, and the other is how that relates to the expectation of the teacher at that particular time.
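The expectation-relative rule above can be sketched in a few lines. Note the post only states two of the mappings (exceeding the expected level = 100%, meeting it = 90%); the values for falling below expectation are my own placeholder assumptions, included only so the function is total.

```python
# Sketch of the reported-percentage rule. Only two mappings are stated in
# the post: exceeding the teacher's expectation = 100%, meeting it = 90%.
# The step-down for falling below expectation is a hypothetical assumption.
def quarter_percentage(expected, actual):
    """expected and actual are on the 1-4 mastery scale."""
    diff = actual - expected
    if diff >= 1:
        return 100            # exceeded the teacher's expectation
    if diff == 0:
        return 90             # met the expectation
    # Below expectation: assumed 10-point drop per level, not from the post.
    return max(0, 90 + diff * 10)

print(quarter_percentage(3, 4))  # 100
print(quarter_percentage(3, 3))  # 90
```

Calculated this way, a 3 against an expectation of 3 in Quarter 1 reads the same as a 4 against an expectation of 4 in Quarter 4: both students are exactly where the teacher expects them to be.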
As well as removing the live grade book and only reporting in this way four times a year, we had two parent-teacher conferences. These occurred after the first and after the third quarter. In these conferences, the advisory teacher collected information about each child from the other teachers and then sat down with the parent and student, grade book in hand, to talk about strengths and weaknesses and make plans for the rest of the year. This was so important that students didn't attend classes on those days.
Accompanying this we have spoken a lot about the importance of knowing, about learning for learning's sake and about why the acquisition of knowledge and skill through education is such a valuable, noble, challenging and important endeavour.
What was the immediate reaction?
The first reaction was one of shock, horror and disbelief. One kid - a high achiever, amazing kid - said it was like constantly being given candy (the % grade) and then being told one day the candy has run out. The % grade enabled the higher-achieving students to compete against one another for the very top grades. They would seek out that perfect 100% score, asking for extra credit and doing everything - within their honourable means, of course - to achieve it. The parents were lost without the live grade book. Because the results of the in-quarter assessments didn't "count" towards a grade, some parents and kids believed less effort was being put in.
To deal with less "accountability" we have (re)introduced centralised academic detentions. If a kid doesn't complete important class or homework on two or more occasions, they are held after school to complete the work. I don't expect teachers to supervise this, just send me names. In short, without the accountability of the % grade and the 0, we use another form of accountability. I should point out that I think it would be disingenuous to suggest that under a % grade system everyone does their homework and completes all classwork. There will be some who - for whatever reason - don't "care" about their grade under a percentage system, and there will be some who sometimes don't work when there is no grade involved.
There was a worry that this was less academically rigorous, that because students were less "concerned" by the grade they were relaxing. In high school, they will be constantly evaluated with a percentage grade; they will have to act on the feedback autonomously without so much direction and guidance. They will be expected to do a lot of studying at home for those tests. Are we preparing them?
I took a huge risk doing this. I was worried. What if the kids really are less motivated? What if they are learning less? My reaction was to do the following. I met with some reps of the parents and some other parents, as we normally do each month, and I explained the major point. The percentage grade isn't a good form of feedback. Good quality feedback that involves action on the part of the student means they will learn more. I believed that four times a year was enough to get a good idea of how each kid was doing and act when we needed to act.
Essentially I made two assertions. The first was that using our system, the grades the school generated would increase and the gap between the highest achievers and the lowest achievers would narrow. The bit about the lowest and the highest would obviously happen: when we are using only a 1-4 scale and saying that for the first half of the year the maximum a kid is going to get is around a 2 or a 3, then of course the gap between the highest and the lowest will decrease. The question is then, okay, but does this improve achievement overall, and how will we know?
Firstly, the average % grade over the last few years has always been around 90% in all three grade levels of middle school. I suggested that if we use the scale spoken about earlier, where exceeding expectation = 100% and meeting = 90%, we would see higher achievement. Again, the answer is okay, but how do I know you aren't just making it easier to do well? How can we hold ourselves to the same high, rigorous and challenging academic standards as we have done previously? The answer we have is our MAP results. I know that no standardised test is perfect, but the MAP is a good indicator in Maths and ELA. If the results were as good as in previous years, the gap between the highest and the lowest went down and the percentage grade went up, then we could say it was a success. If, however, the MAP scores fell significantly beyond the normal variation over the years, we could agree that the arguments about trying less hard due to the lack of a grade were accurate. Fingernail-biting, squeaky-bum time.
What happened in the end?
Well - I just got the MAP results back. The reason I want to share them is to guide other schools thinking about moving away from the traditional percentage grade to another system like SBG, and to reassure them that it can boost student achievement, reduce stress, maintain accountability, reduce teacher workload (no constant grading) and actually close the gap between the higher and lower achievers. This will happen only if the quality of the feedback improves. Another reminder: rubrics are an awful way to give feedback to kids. Information on how to improve should be qualitative and involve actions the kid can take, not descriptions of their performance.
Relative to the US norm, in 7th and 8th grade we got the best MAP results the school has ever had, since doing MAP, in everything. Remember, this is a school where the academic bar is already high. We are used to scoring significantly above the US average. MAP also gives us an indicator of how much growth took place - they test kids twice, once at the beginning of the year and once at the end - and we grew, on average, more than the school had ever grown before. Seeing this, I felt able to answer the question conclusively: even though they have not had a percentage grade, moving to a focus on learning, feedback, intrinsic motivation and mastery has better prepared students for HS.
Interestingly, in 6th grade, the results were more mixed. When I did the analysis I noticed that when kids move into the middle school - either in 5th grade or 6th - the results tend to drop a bit. They remain way above the US norms and the difference isn't significant but no generation over the last 7 years has ever had the first year of middle school as their "best" year.
Growth:
Here is the breakdown of the raw scores overall by grade level. Red indicates the year in which each generation scored the highest above the US Norm.
Objectively, this was an outstanding year of student achievement. Of course, correlation doesn't equal causation, as my research-oriented colleagues would say. That's true. However, it would be quite unusual if the major thing that changed - focusing on quality feedback and not grades - lowered achievement while some other factor caused this year to be a good year. We also have to consider sixth grade: why didn't we achieve the same success with that generation as we did with the others? I do think that the "change of school" factor comes into it when I check the 6th-grade scores from previous years, but only time will tell. The jury is definitely still out, but I am pleased to note that we are on the right track. There are few things that really are important, and helping kids master Maths and ELA are two of them.
When we look at the actual picture, what it seems we have been able to do is narrow the bell curve (reduce the gap between the highest and the lowest achievers) and move everybody upwards.
I would be very interested to hear reactions and feedback! One final point: although I cannot verify this, I suspect that students were less stressed. One argument for this is that parents' complaints were that the kids were unchallenged, unmotivated and unworried about their schoolwork without the traditional % grade. This leads me to suspect that the kids were less stressed (unless they were stressed about not being stressed enough, which is a possibility). Despite this, they did very well. Could it be that the lack of stress is itself a good way of learning more? It's a reasonably well-supported idea that we think and learn better when we're not anxious. Some stress is good; it drives you onwards and stimulates growth. Beyond that, though, placidity, calm and serenity can only help you learn more.
Oh, and another huge bonus of this concerns children who receive student support. Under a percentage grade system, we were used to focusing on giving "adjustments" such as extra time or even extra help. If there is no grade, this isn't so important, and we don't have to worry about making the final grade "accurate" vis-à-vis the purported need. Yes, of course, we could give extra time to finish - hey, we could give everyone extra time if we really want to see what they know - and we could suggest that they only have to get up to a certain point. However, the important thing is that we are interested in a genuine, unadulterated idea of what they know and can do, and are not interested in the kinds of manipulations which would be needed to increase performance. To put it another way, if someone gets 4/18, then because the grade doesn't count there is no need to say, oh, they actually got 4/10. Reasonable adjustments become a lot easier. Not to mention the fact that certain kids aren't constantly reminded of their position in the pile. It should also be noted, however, that the distribution of the MAP scores remained the same as in previous years. The kids at the top did as well as ever, and so did the kids at the bottom. The gap between them was not reduced on the standardised test, but it was reduced in the school-given assessment results.
Comments welcome!