Watching the Watchmen – an update

March 31, 2014

On the Monday before last we published our report on Ofsted reform, Watching the Watchmen. I was really pleased with the number of people who have taken the time to read it, comment on it, and challenge some of the proposals within it. The publication was followed by HMCI’s announcement at an ASCL conference that Ofsted are indeed moving to a very similar two-stage inspection process – though sadly, still including lesson observations in routine inspections. Below, I sketch out our response to some of the questions raised about our report, and offer a brief reflection on where Ofsted stands now, following HMCI’s speech.

Responses to our report

1. Why have you still recommended having lesson observations in longer inspections?

A lot of people were surprised that the report recommended keeping some form of lesson observation. The recommendation was variously described as a “fudge” or “contentious”. Let me explain the thinking in more detail.

It’s worth noting that there are three big changes between observations now (and in our new Short Inspections) and the context in which lesson observations should take place in the future.

Firstly, the lesson observations would only be taking place in schools where the Short Inspection – without observations – had either raised a cause for concern or had proven inconclusive; around 20% of schools. This doesn’t necessarily mean that every school undergoing a Tailored Inspection has a potential issue with Quality of Teaching (for example, a Tailored Inspection may be called because of concerns over leadership, or poor financial performance, or bad behaviour) – and in these cases, observation may play a much smaller role, at the inspectorate’s discretion. But where a Short Inspection has raised an issue of teaching, and where Ofsted have played their role of validating or otherwise the school’s judgement on its teaching, and found reasons to investigate further, it seems reasonable to require some form of further independent scrutiny of teaching.

Secondly, and importantly, the inspectors undertaking the observations need to be much more highly trained. One of our supporting recommendations was that Ofsted should commission a UK academic or group of academics to design a programme to get us to the levels of validity and reliability at the top end of what MET found could be achieved. If we could get inspectors to that level, it still wouldn’t be perfect – and we wouldn’t want individual lessons graded or even an Evidence Form making judgements on the basis of an individual lesson – but it would be a big step forward.

And thirdly, the lesson observations would be taking place in the context of a longer inspection, with more specialised inspectors by phase or by subject. So if the Short Inspection and data suggest a problem with maths in a primary school, for example, then the Tailored Inspection would always include a maths specialist, as well as inspectors from the primary phase, who could observe more frequently than now, and for a longer duration. All of these factors would raise the reliability and validity of a new observation protocol.

I wholeheartedly endorse the principle that for the vast majority of schools, Ofsted’s role should be to understand whether a Headteacher and their leadership team understand their school and their teaching, and to validate that. For routine inspections, or what we called Short Inspections, that is why lesson observations should be entirely abolished (and it’s worth noting that this step takes us away from others in this debate). But for schools where an issue has been identified, I think there remains a need for some, reformed, observation.

2. Why have you proposed keeping a separate grade for Quality of Teaching rather than merging it into Achievement?

The first thing to say is that for the vast majority of schools, this is exactly what we suggest should happen. Our proposed report card following the Short Inspection would have two grades – one for overall performance, and a new one for ‘school capability’ which explicitly merges Achievement, Quality of Teaching, Behaviour and Safety, and Leadership and Management. This is because it is reasonable to take all of these in the round when Ofsted’s role is simply to validate a school’s judgement. Ofsted make two judgements – is the school Good or better, and is the school’s capability to keep it going also Good or better?

For the Tailored Inspection, I think it is reasonable to keep the grades separate. This is, again, on the basis of a longer inspection, in which it is reasonable to assume that inspectors can make more nuanced judgements across all four subdomains. More importantly, perhaps, keeping the grades separate allows for the possibility of good teaching being recognised. Consider the case of a school where the data shows that the pupils are underperforming, but a new Headteacher or a new teaching staff is turning the school around. A combined Achievement / Teaching grade would almost have to simply repeat the judgement that Achievement currently Requires Improvement. Holding the grades separate allows for the possibility of explicitly recognising and welcoming Good teaching that will (in time) hopefully make the school Good.

3. Why did you recommend keeping the Outstanding grade?

The last question raised was why we kept the Outstanding grade, or the 4 level grading system more generally, as opposed to simply giving schools a ‘pass / fail’ or ‘proficient / not proficient’ grade.

Again, it is worth noting that the 2 stage process will in some ways give us that. If a school is rated Good or better they only get a Short Inspection; if there is a risk they are RI or worse then they get a longer inspection from a team of specialists. In practice, this may come to be seen as a ‘proficient / not proficient’ split. (Personally, I think a stronger critique of our system is precisely this move towards a de facto pass / fail system where the very fact of having a Tailored Inspection will be seen as an admission of failure. We discuss ways of ameliorating that, including making explicit that schools undergoing a Tailored Inspection can still receive an Outstanding or a Good judgement – a Tailored Inspection is not a limiting judgement in any way.)

But the reason we kept the 4 level grading system is to give us nuance in the system. If 8 out of 10 schools are Good or better, then we are talking about 15,000-16,000 schools. It is reasonable to want to be able to distinguish within that very large cohort – to know which are really the truly, shall we say, outstanding schools. A grading system which doesn’t give parents, teachers, pupils and policymakers that sense of granularity is, I think, not acceptable.

Wilshaw speech

I was pleased to be in the audience at an ASCL national conference last weekend when Michael Wilshaw set out Ofsted’s thinking, and was naturally pleased with the direction of travel they are taking – as indeed they have been discussing for a while, including before our publication. If you haven’t done so, I would recommend reading the whole speech. I look forward to working with Ofsted as part of the consultation process. There were, however, two issues on which I wanted to flag some concern.

Firstly, Michael Wilshaw said:

Can I turn now to the perennial complaint of inconsistent inspection? I won’t deny that any system based partly on human subjectivity is fallible. But I have to say to you that there is little evidence to suggest that the number of misjudgements has increased – on the contrary.

The latest figures show that 91% of schools are satisfied with the outcome of their inspection – a proportion that hasn’t significantly altered in years. In a survey of 850 schools that have been inspected in the last four months, almost 85% of them believed that the inspection process had helped them to improve. Indeed, the number of complaints about inspection outcomes actually went down last year.

It is worth noting, of course, that schools’ reported satisfaction and complaint levels do not negate a charge of inconsistency in the system. Schools may feel satisfied with an inspection despite it having made an inaccurate or unreliable judgement. And our Headteachers’ roundtable spoke about the common received wisdom that it was pointless complaining to Ofsted because of the backlog of complaints and the fact that schools never ‘win’. I have no idea if this is true or not, but it might explain some of these figures.

The data I’d really like to see on inspection quality and reliability – and that is necessary to really understand the quality of the process – is the data called for in some of the proposals in our paper. How many times, of the nearly 8,000 inspections last year, did Ofsted centrally have to amend a draft report from an inspector because their judgements were un-evidenced or poorly written? How many times did senior HMIs have to go out to schools to conduct a second judgement? Does this have a particular pattern (e.g. are primary schools more likely to need to be double checked than secondary schools? Are Outstanding school judgements more or less reliable than RI judgements?). And if there are errors, how many? Are there particular inspectors or particular contractors who are more error prone than the others? And if so, what’s happening as a result of that? We need far, far more open data and transparency in the system to ensure schools’ confidence in Ofsted judgements.

Secondly, HMCI also said in the Q+A (and this is my transcript rather than his official remarks, taken from the video here):

I know assessment will be on all your minds…and I assume that until 2015 most of you will be using levels still, which will make our lives much, much easier. What I don’t want are HMI spending most of the first day figuring out what assessment system a school has. That’s not what we’re about. Whatever happens, you’ve got to make sure you’re very clear to inspectors how you track progress and what criteria you’re using

This response troubled me greatly, and indeed, I raised a question in the Q+A session on it. I slightly misheard Wilshaw originally – I thought he was talking about post 2015 and effectively ad infinitum – but my broad point still holds. Because this type of response, that effectively says “inspectors don’t want to spend time looking at your systems, we need to see something we get so we can get on and inspect you” is precisely – precisely – the problem with the power relationship between schools and Ofsted now. Schools should design their own systems which work for their pupils and their parents, and they should use them. It is for Ofsted to understand how a school uses a system, and work within that. If they haven’t seen it before, and it takes a bit of understanding, then frankly that’s just tough. Schools must lead this process. And what worries me more is that there will be so many schools that read or hear these remarks and think “ah, Ofsted wants us to keep using levels”. And so they will keep using some form of them, especially if they worry that a new system might be misinterpreted and Ofsted will penalise them. This Chinese whisper approach around assessment over the next couple of years really has the potential to be another “preferred teaching style” kerfuffle, and Ofsted must realise the influence they have when they make such pronouncements, deliberately or accidentally – and avoid any perception that they have a preferred assessment style (or any other preferred style).
