# Leaving Cert Standardisation



## Duke of Marmalade (8 Sep 2020)

I attach the report on Leaving Cert standardisation.  It is not bedtime reading.  But it does give some insight into the algorithm (they don't call it that).  Appendix G gives the technical details.  It is really wonkish, I suspect deliberately so; it would be a brave soul who would challenge the math.

There follows my broad understanding of the approach, but I would welcome clarification/correction.

There are basically two inputs:
(1) The Teachers' Assessments
(2) Predictions from Junior cycle for that school and subject based on a regressed fitting of past Leaving Cert results to past Junior cycle performance. (Note: regression is based on results from all schools and subjects, and includes other predictors but the Junior Cert results correlation with Leaving Cert results dominates).

These provide two probability distributions for each school and subject combination (a "cell").
The distribution used for standardisation is a mixture of the two. The proportion of the mixture given to the Junior Cycle Prediction depends on how many students are in the cell. The formula for deriving the mixed distribution is where it gets really wonkish, but the report gives the example that if there were only 6 in the cell, no credibility would be given to the Junior Cycle Prediction (i.e. the Teachers' Assessments would be accepted without adjustment); conversely, the larger the cell, the greater the credibility given to the Junior Cycle Prediction. Unfortunately we are not told the limit of this credibility, but I doubt it exceeds 50%.

Having combined the distributions in this way, the students are fitted into the standardised distribution for that cell based on the marks given by the Teachers' Assessment. So someone in the middle of the class would be given the mid point of the standardised distribution. *Note that the individual's own Junior Cycle performance has no role whatever in his/her final mark.* The Junior Cycle Prediction is applied at the school level.
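As a rough illustration of the kind of credibility blend described above, here is a sketch. The weight formula `n / (n + k)` and the value of `k` are my own placeholders (a standard actuarial credibility device), not the actual Appendix G formula, and the grade distributions are invented:

```python
# Hypothetical sketch of a credibility-weighted blend of two grade
# distributions for one school/subject cell. The weight z = n / (n + k)
# is a placeholder (Buhlmann-style credibility), NOT the report's formula.

def credibility_weight(n_students, k=30.0):
    """Weight given to the Junior Cycle Prediction; grows with cell size."""
    return n_students / (n_students + k)

def blend(teacher_dist, jcp_dist, n_students):
    """Mix the two grade distributions, grade by grade."""
    z = credibility_weight(n_students)
    grades = set(teacher_dist) | set(jcp_dist)
    return {g: (1 - z) * teacher_dist.get(g, 0.0) + z * jcp_dist.get(g, 0.0)
            for g in grades}

# A tiny cell (6 students) stays close to the Teachers' Assessment...
small = blend({"H1": 0.40, "H2": 0.60}, {"H1": 0.10, "H2": 0.90}, 6)
# ...while a large cell (300 students) is pulled hard towards the JCP.
large = blend({"H1": 0.40, "H2": 0.60}, {"H1": 0.10, "H2": 0.90}, 300)
```

With these made-up numbers the small cell keeps an H1 share of 35% against the teachers' 40%, while the large cell is dragged down to about 13%.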

Now it is recognised that Teachers' Assessments were over-optimistic, and I assume the Junior Cycle Prediction was quite accurate in reproducing the historical averages. So the less credibility given to the Junior Cycle Prediction, the better the chance of enjoying the optimism of the Teachers' Assessment.
These are some examples:
Maths
21,552 sitting
Historical Average H1s 5.8%
Teachers' Assessment 11.6%
Standardised 8.4%
So that sort of stacks up with an average 50% credibility being given to the maths Junior Cycle Prediction.

Arabic
155 sitting
Historical Average H1s 17.1%
Teachers' Assessment 34.8%
Standardised 34.8%
Suggesting no credibility given to Junior Cycle Prediction

Latin
48 sitting
Historical Average H1s 19.2%
Teachers' Assessment 43.8%
Standardised 41.7%
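Taking the historical average as a stand-in for the Junior Cycle Prediction (my assumption; the actual JCP for each cell is not published), the implied JCP weight can be backed out from each example above:

```python
def implied_jcp_weight(historical, teacher, standardised):
    """Solve standardised = w*historical + (1 - w)*teacher for w.
    Uses the historical H1 rate as a proxy for the Junior Cycle Prediction."""
    return (teacher - standardised) / (teacher - historical)

maths  = implied_jcp_weight(5.8, 11.6, 8.4)    # ~0.55, i.e. roughly 50:50
arabic = implied_jcp_weight(17.1, 34.8, 34.8)  # 0.0, no credibility at all
latin  = implied_jcp_weight(19.2, 43.8, 41.7)  # ~0.09, very little credibility
```

The maths figure matches the roughly 50% credibility suggested above, and Arabic's zero weight fits the "no credibility" reading.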

Overall, I think any ambulance chasers will find it difficult to pick holes in this, though there may be a role for math expert witnesses.


----------



## Colm Fagan (8 Sep 2020)

Hi Duke

I haven't looked at the documentation yet, but it might be worth checking if the standardisation process could have caused what appears to be a freak result with an Irish-German school (got from the RTE website):

> A Dublin school which places a focus on the teaching of German has expressed shock at lower than expected marks awarded to its students in this year's Leaving Cert Calculated Grades process.
> 
> 14% of Leaving Certificate students at St Kilian's Deutsche Schule received H1 grades in German this year, compared to 41% last year.
> 
> The school said it expected around half of its students to receive a H1.


----------



## Duke of Marmalade (8 Sep 2020)

Colm Fagan said:


> Hi Duke
> 
> I haven't looked at the documentation yet, but it might be worth checking if the standardisation process could have caused what appears to be a freak result with an Irish-German school (got from the RTE website):
> 
> ...


That is strange.  The national position for German was as follows:
6,772 sitting
5.7% get H1s historically
Teachers' Assessment 13.9%
Standardised result  9.0%

If St Kilian's expected 50% H1s, I interpret that as the Teachers' Assessment.
So they must have got a very significant adjustment from the Junior Cycle Prediction.
But that would suggest that the Junior Cycle Prediction was way below the historic result of 41%, in fact way below the final result of 14%.
Definitely something not hanging together in this picture.


----------



## Duke of Marmalade (8 Sep 2020)

Open letter from principal of German school said:
			
		

> To whom it may concern,
> 
> I wish, on behalf of the LC Students of 2020, to signal our deep concern at the very flawed process applied to the calculation of grades in German at Higher level for this school. The results we have received bear no relationship to the ability, past performance or calculated mark awarded to the students from this school. I wish to have answered to the following questions.
> 
> ...



She's got a point.  Nasty swipes at the Gaelscoileanna and our bending to the UK example.


----------



## Duke of Marmalade (8 Sep 2020)

I see this thread has been promoted from the depths.  I understand that it is more important than the depths but unfortunately it will probably get less attention for that promotion. Any chance of a demotion?


----------



## SPC100 (8 Sep 2020)

My guess: their national model, which predicts performance in Leaving Cert German given Junior Cert outcomes (in all subjects), doesn't have a way to recognise the advantage this cohort of students has in German. So they are strongly pulled towards the national distribution.

Edit: note that their achieved percentage of H1s in German is very close to the national Teachers' Assessment of H1s (14%).


----------



## SPC100 (8 Sep 2020)

I wonder how exam-focused schools like the Institute will have fared. Based on my skim of this, they will be (unfairly?) pulled back towards the national distribution.

(I'm assuming that they did outperform in the past, as their marketing/reputation implies)


----------



## SPC100 (8 Sep 2020)

Duke I think a key thing underspecified in your overview of 2) is that the regression is based on national past performance, and not pupils who have attended that school.


----------



## Itchy (8 Sep 2020)

A quick look at the guide DoM posted. Outliers are set aside in assessing the teacher estimates. There were no outliers in this class, as the teachers had a large cluster of high-grade students. So when they were put in rank order, the lowest was pulled right down. The outliers are then brought back into the mix and the gaps between the highest/lowest outlier and the non-outliers were further adjusted. So depending on the performance of the highest and lowest outliers, they would have compressed the grades of the non-outliers group. In reality, these students were the outliers in the national sense.

Isn't there a French lycée also? I wonder did they have the same experience. Far more data in Irish than in German, so probably not best to compare to the Gaelscoileanna.


----------



## SPC100 (8 Sep 2020)

When I skimmed it, it was unclear if the school had to flag students as outliers or if the department identified them as such based on the data.


----------



## Duke of Marmalade (9 Sep 2020)

SPC100 said:


> Duke I think a key thing underspecified in your overview of 2) is that the regression is based on national past performance, and not pupils who have attended that school.


Yes, good spot; I have edited my post.  It is clear that the Junior Cycle Prediction has gone seriously awry for St Kilian's, probably because it is so far ahead of the average. This could yet expose the hubris in the model.


----------



## Duke of Marmalade (9 Sep 2020)

Itchy said:


> A quick look at the guide DoM posted. Outliers are set aside in assessing the Teacher estimates. There were no outliers in this class as the teachers had a large cluster of high grade students. So when they were put in rank order, the lowest was pulled right down. The outliers are then brought back in to the mix and the gaps between the highest/lowest outlier and the non-outliers were further adjusted. So depending on the performance of the highest and lowest outliers, they would have compressed the grades of the non-outliers group. In reality, these student were the outliers in the national sense.
> 
> Isnt there a French Lycee also. I wonder did they have the same experience. Far more data in Irish than in German so probably not best to compare to the Gaelscoileanna.


Not sure that outliers explain the St Kilian's syndrome.  The fact is that the proportion of H1s will be an interpolation of the Teachers' Assessment and the modelled Junior Cycle Prediction.  The Teachers' Assessment was 50%, so the contribution from the JCP must have been hugely less than that to bring the answer down to 14%.
A possible interpolation would be:
JCP result 6% (historical national average), JCP/TA weighting 4:1.
That just can't be right.
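Checking that interpolation numerically (both the 6% JCP figure and the 4:1 ratio are the guesses above, not published numbers):

```python
# Both inputs are guesses from the post above, not published figures.
jcp = 6.0   # assumed Junior Cycle Prediction: historical national H1 rate
ta = 50.0   # St Kilian's Teachers' Assessment H1 rate

# A 4:1 weighting in favour of the JCP:
blended = (4 * jcp + 1 * ta) / 5   # 14.8, close to the 14% actually awarded
```

So roughly 80% weight on the JCP would be needed to reproduce the result, which for a single school's German class seems implausibly high.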


----------



## SPC100 (9 Sep 2020)

This looks like a high-achieving school.

Comparing past results on their blog to national points, it looks like in the past they have had six times more students who get 600 points or greater (10% vs 1.4%) and three times more who get 500 or greater (39% vs 13%) compared to the national distribution.

The German unfairness is very clear to see, as we can identify a trait that predicts higher results, and see that the algorithm was blind to it.

But there is growing noise that this may be more widespread, and that it is the high-performing schools that are losing out.

It might be that we over-corrected based on the UK experience.


----------



## Duke of Marmalade (9 Sep 2020)

You would have thought that the results from this overly complicated model would have been thoroughly desk-checked.
Surely "Historical 41%, Teachers' Assessment 50%, Standardised 14%" would jump right off the page.


----------



## Sadim (9 Sep 2020)

Duke of Marmalade said:


> That is strange.  The national position for German was as follows:
> 6,772 sitting
> 5.7% get H1s historically
> Teachers' Assessment 13.9%
> ...


Saw them interviewed on TV yesterday. Most of them are German natives or first-generation Germans whose first language at home was German!


----------



## Sadim (9 Sep 2020)

SPC100 said:


> For this school it looks like it is a high achieving school.
> 
> Comparing past results on their blog to National points. It looks like in the past they have 6x more students who get 600 and greater (10p.c vs 1.4) and 3x (39p.c. vs 13) more students who get 500 or greater compared to National distribution.
> 
> ...


Certainly my daughter was at a high achieving school and it worked out badly for her. Principal said to me informally before Christmas they had a target for her of 550-560, she got 520. By the way, it was not a fee paying school.


----------



## SPC100 (9 Sep 2020)

Duke of Marmalade said:


> You would have thought that the results from this overly complicated model would have been thoroughly desk checked.
> Surely "Historical 41%, Teachers Assessment 50%, Standardised 14%" would jump right off the page.


I agree.

Like all good ideas, when we see them written down we say "Of course, they should have done that. It's so obvious."

But that doesn't mean it was figured out in advance.


----------



## Duke of Marmalade (9 Sep 2020)

I think I now understand how St Kilian's were cheated, and they were cheated.
The Junior Cert prediction was a composite of Irish, English, Maths and the next best two subjects per student.
Thus the fact that St Kilian's are superb at German was greatly diluted, so their prediction was only slightly above the average, and this was applied to all subjects including German.  They got very minimal recognition for the fact that they are especially good at German.
In most cases this would not matter; a school gets good or bad results across the board.
This would not have happened if the correct procedure of standardising according to school historical results had been applied, but we were so keen to avoid the UK debacle that we abandoned the correct approach and St Kilian's are big losers as a result.


----------



## SPC100 (9 Sep 2020)

Duke of Marmalade said:


> You would have thought that the results from this overly complicated model would have been thoroughly desk checked.
> Surely "Historical 41%, Teachers Assessment 50%, Standardised 14%" would jump right off the page.


There are something like 900 schools and maybe 20 subjects, so maybe 20k tuples to look at. But a quick sort by max delta would surface the problematic ones.

Logical possible conclusions:
1. Ineptitude: they never examined the data like this (seems unlikely given this is a simple test and the backstory, and all schools will be looking at how much their grades changed).
2. Assuming they did do this review, there were too many cases to manually investigate, or there were many discrepancies bigger than this one, so they never got this far.
3. They decided these discrepancies were better than the ones which included schools' historic marks.
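The desk check in question is essentially a one-liner. The St Kilian's and subject-level national figures below come from this thread; the rest of the table is a filler:

```python
# Sort (school, subject) cells by the gap between the Teachers' Assessment
# H1 rate and the standardised H1 rate. Only the St Kilian's row and the
# national subject figures come from this thread; the layout is illustrative.
cells = [
    ("St Kilian's", "German", 50.0, 14.0),
    ("National",    "Maths",  11.6,  8.4),
    ("National",    "Latin",  43.8, 41.7),
]
flagged = sorted(cells, key=lambda c: abs(c[2] - c[3]), reverse=True)
# St Kilian's German, with a 36-point swing, goes straight to the top.
```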


----------



## SPC100 (9 Sep 2020)

The Irish, English, Maths and two-best-others input would appear to limit recognition of over-performance and reduce dynamic range: e.g. a ten-A1 JC student would have the same uplifting effect on their cohort as a five-A1 student. But clearly a cohort of students with all As in the JC would be expected to outperform one with half as many As.

Also, reliance on Irish might pull back schools like this one, where presumably many students don't do Irish, or those that do might have less exposure to it due to foreign backgrounds.

Also, apparently some "high-point-chasing" students strategically choose ordinary-level Irish, knowing they will focus more on their top 6 and Irish will have no effect on their points. If this happens in some cohorts, it would likely result in the model predicting a lower-than-expected outcome.


----------



## Duke of Marmalade (10 Sep 2020)

@SPC100 those are interesting speculations.
Using school historics became a political no-no after the UK debacle.  We came up with a clever way round that: we would use the school historics but based on the Junior Cert achievement of the current year, thus dealing with the criticism that this year's students were simply getting what past years' students in the same school got.
But the chosen algorithm has not worked in the case of St Kilian's, for reasons which I am beginning to understand, and maybe it is a unique exception. The idea of using Junior Cert results was good, but it should have been used to determine how this year's cohort compares with the previous three years' cohorts, i.e. to validate the use of school historics as originally proposed.


----------



## Duke of Marmalade (10 Sep 2020)

Irish Times report said:
			
		

> However, Government sources say an analysis of the calculated grades issued overall shows no bias in the extent of the increase in marks by type of school.
> 
> On the issue of St Kilian’s, one well-placed source said schools should look at their achievement across all subjects and not a selected few.
> 
> ...


So it is exactly as we have worked it out.
To try and explain what happened I will use plausible numbers.
Let's say Kilian's were 100% of normal in Junior Cert English, Irish, Maths and, say, French, but they were 200% in German.  Their composite score would then be 120% ((4 x 100 + 200)/5).  So they are assumed to perform at 120% of the average in *all* subjects at Leaving Cert.  Clearly that gives an unjustified advantage in all subjects except German, but for German it gives a massively unjustified disadvantage.  Now, given the high level of gearing in translating marks into grades, it is almost certain that in terms of grades/points overall Kilian's were badly cheated.
Maybe overall it did balance out for Kilian's, but I very much doubt it.
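Writing out that arithmetic (the equal five-subject weighting is the post's simplification, not necessarily the report's exact composite):

```python
# The plausible numbers from above: 100% of the national JC norm in four
# subjects, 200% in German, with one composite applied to every LC subject.
jc_relative = {"Irish": 100, "English": 100, "Maths": 100,
               "French": 100, "German": 200}
composite = sum(jc_relative.values()) / len(jc_relative)   # 120

# The 2x edge in German collapses to 1.2x across the board:
german_shortfall = jc_relative["German"] - composite       # 80 points under-credited
per_subject_windfall = composite - jc_relative["Irish"]    # 20 points over-credited
```

Whether the four small windfalls offset the one large shortfall in points terms depends on the grade-boundary gearing mentioned above.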


----------



## Duke of Marmalade (10 Sep 2020)

And as for the grind schools, they probably have a point as well.
If Junior Cert predictions understate the historic achievement of grind schools, then they are being unfairly standardised.
After all, we must assume that they are worth at least some of their fees and that they do get their students to over-achieve compared to their Junior Cert results.  Indeed it may even have been poor Junior Cert results that led the parents to do something about it.
It is becoming clearer and clearer to me that school historic results were the correct standardisation basis.  Junior Cert prediction might have had a role in deciding whether this year's cohort was inherently better or worse than historic cohorts, and that could have been used to adjust the historic performance; but adjusted or otherwise, it is historic performance that should have been used.


----------



## SPC100 (10 Sep 2020)

If it is a true statement that the school a student attends affects their results then the lack of modeling for a school effect will of course reduce accuracy.

The current model is based on averaging in all national past achievements between jc and lc.


----------



## SPC100 (10 Sep 2020)

Duke, using your working example, wouldn't we expect Irish schools to have an unfair advantage?

Edit: assuming their English and Maths is like everyone else's.


----------



## SPC100 (10 Sep 2020)

Re the 200% in German / 1.2 multiplier in your example: are you sure the model's input is subject-aware? IIRC the tech docs said that every additional subject made modelling too hard (especially as many subjects were only rarely taken).

I had the impression the input is the Junior Cert score in Irish, score in English, score in Maths, score in best other subject, and score in next best other subject.

And that the model was built on those inputs, along with the LC per-subject scores. This is how the model can predict a score for an LC subject for a given JC performance.

This is what I mean by the lack of distinction between a ten-A1 JC cohort and a five-A1 JC cohort leading to a reduction in dynamic range, or a lack of recognition of really high-performing cohorts.
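The dynamic-range point can be made concrete. The input set (Irish, English, Maths plus best two others) is the recollection of the tech docs above, and the grading numbers are invented:

```python
# Sketch of the five-score input described above: extra As beyond the
# fifth counted subject are invisible to the model. All marks invented.
CORE = ("Irish", "English", "Maths")

def model_input(jc):
    """The five JC scores the model is said to see: the three core
    subjects plus the best two others."""
    others = sorted((v for s, v in jc.items() if s not in CORE),
                    reverse=True)
    return [jc[s] for s in CORE] + others[:2]

# A student with top marks in exactly five subjects...
five_as = {"Irish": 100, "English": 100, "Maths": 100,
           "German": 100, "Art": 100, "History": 55, "Science": 55}
# ...and one with top marks across the board look identical to the model.
all_as = {s: 100 for s in five_as}

print(model_input(five_as) == model_input(all_as))  # True
```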


----------



## SPC100 (10 Sep 2020)

Statement from the Institute: 96% of students marked down; 40% of grades marked down (vs a national figure of 17%).

RTE headline: "96% at Dublin grinds school have LC grades reduced". Almost every pupil at a Dublin grinds school has had at least one grade reduced in the Leaving Cert, with 96% affected. (www.rte.ie)

----------



## Duke of Marmalade (10 Sep 2020)

SPC100 said:


> If it is a true statement that the school a student attends affects their results then the lack of modeling for a school effect will of course reduce accuracy.
> 
> The current model is based on averaging in all national past achievements between jc and lc.


It's not so much that the school affects the students' results; it's the students who select the schools that affect the school's results.


----------



## Duke of Marmalade (10 Sep 2020)

SPC100 said:


> Duke Using your working example, wouldn't we expect Irish schools to have an unfair advantage?
> 
> Edit:Assuming their english and maths is like everyone else.


Yes,  I expect an unfair advantage in all other subjects and an unfair disadvantage in Irish.  Maybe the unfair advantage is outweighing the unfair disadvantage and that is limiting the complaints.


----------



## Duke of Marmalade (10 Sep 2020)

SPC100 said:


> Re 200pc in german/1.2 multiplier in your example - are you sure the models input is subject aware. Iirc the tech docs said that every additional subject made modeling too hard (especially as many subjects were only rarely taken).
> 
> I had the impression the input is junior cert score in irish, score in e, score in m, score in best other subject, score in best other subject.
> 
> ...


Yes, that's my read of it.  The two clear anomalies that it throws up are:
(1) Schools with a particular aptitude for one subject get that aptitude averaged across all subjects.  On some very theoretical distribution that could mean overall CAO points are relatively unaffected.  But I suspect that the loss of grades in the strong subject is not compensated by the diluted gain spread over the other subjects.  I presume that is the case with Kilian's, or they shouldn't have complained.
(2) If a school is particularly placed to improve on the ability implied by the Junior Cert, then standardisation by JC is unfair to it.  There are grounds for believing grind schools are in this category.


----------



## Duke of Marmalade (10 Sep 2020)

SPC100 said:


> Statement from the institute 96p.c. of students marked down. 40p.c. of grades marked down (vs national of 17p.c.).
> 
> 
> 
> ...


Wow!  Now I suspect there is a mixture of two effects.  As a highly commercialised entity, the Institute probably produced Teachers' Assessments showing even more grade inflation than usual, and they deserve to be punished for this.  But more relevant is that JC performance is a poor indicator of the Institute's historic performance at LC.  This is not going to end happily.


----------



## SPC100 (10 Sep 2020)

I think you missed my point.

Let's assume that in a high-performing school most students will have had two As in the Junior Cert outside of Irish, English and Maths.

Then assume that school is, e.g., Kilian's, and they instead had three JC As per student.

The school's actual multiplier is no better, as the model only used the top two, and doesn't know what subjects they were scored in.


----------



## Duke of Marmalade (10 Sep 2020)

SPC100 said:


> I think you missed my point.
> 
> Let's assume In a high performing school most of the students will have had 2 As in junior cert outside of Irish english maths.
> 
> ...


I accept your point and that's why I said "yes".  I suppose I shouldn't have repeated my two central arguments which are separate from this particular point.


----------



## SPC100 (10 Sep 2020)

Thanks for confirming. Yes, that and the speculations comment previously threw me.


----------



## SPC100 (10 Sep 2020)

Duke of Marmalade said:


> It's not so much the school affects the students results it is the students who select the schools which affect the school's results.



I think it's likely both. Obviously the student likely has the larger effect, but it's easy to see how the school can influence the outcome (e.g. a poor teacher, more homework, exam focus).


----------



## SPC100 (10 Sep 2020)

Article about gender bias https://www.irishexaminer.com/opinion/commentanalysis/arid-40045100.html


----------



## SPC100 (10 Sep 2020)

Provisional Results of Calculated Grades for Leaving Certificate, Leaving Certificate Vocational Programme and Leaving Certificate Applied 2020 (www.gov.ie)

The appeals process implies class rank should have been maintained, and that the same school-assessed mark should result in the same final mark.

IIRC the school letter implied this was not the case for some students.

"Data checks will include a check to ensure that the rank order of the class group for the subject and level taken has been preserved in the standardisation process and that students placed on the same school-estimated mark in the same subject and at the same level taken by the school are conferred with the same calculated mark conferred by the department."


----------



## SPC100 (10 Sep 2020)

Quote from the letter: "How come some students awarded the same calculated mark can be given 2 entirely different grades where there is a deviation of 2 grade levels?"

So it seems something else unanticipated happened here. The modelling came up with very different answers for students with the same teacher-estimated mark. But according to the documentation, the model should have had the same effect on both.


----------



## SPC100 (10 Sep 2020)

https://instituteofeducation.ie/wp-content/uploads/2020/09/An-Taoiseach-Michael-Martin-TD-2.00.pdf
		


Some data in the appendix


----------



## Duke of Marmalade (11 Sep 2020)

SPC100 said:


> Provisional Results of Calculated Grades for Leaving Certificate, Leaving Certificate Vocational Programme and Leaving Certificate Applied 2020
> 
> 
> 
> ...


I can't understand how this happened. Maybe clerical error.
BTW the national grade inflation is stated as 4.4%.  But how is this calculated?
Update:
I see on the DoE website that the 4.4% is not really grade inflation at all; it is the increase in *marks*.  Even that is a tad ambiguous: it either means, say, that average marks went up from 60% to 64.4% (percentage points) or from 60% to 62.64% (a relative increase).
I don't know how to calculate grade inflation, but I can calculate a concept of CAO points inflation.  For higher level maths this was 6.5%.


----------



## Duke of Marmalade (11 Sep 2020)

SPC100 said:


> If it is a true statement that the school a student attends affects their results then the lack of modeling for a school effect will of course reduce accuracy.
> 
> The current model is based on averaging in all national past achievements between jc and lc.


That's the bottom line.  The standardisation gave no credit for the fact that the school you went to may have been a factor in your progress from JC to LC; that schools actually matter.  Of course there are those who would welcome this, in whose view it is basically unfair to be able to "buy" a better chance at the LC.  These same people would have welcomed the temporary suspension of the advantage of paying for private health insurance.  At least that latter group got their money back!


----------



## joe sod (11 Sep 2020)

I think the shenanigans over predictive grading have shown one thing: we should never ditch the Leaving Cert and a big exam at the end of school; it is the only fair way to assess students. We risk completely downgrading our education system and demotivating good students when the system downgrades them based on statistical models rather than on their actual ability and knowledge.
The government took the path of least resistance; they should have continued with the exam in June as planned rather than the silly initial idea to have it at the end of July.


----------



## llgon (14 Sep 2020)

Duke of Marmalade said:


> There are basically two inputs:
> (1) The Teachers' Assessments
> (2) Predictions from Junior cycle f



Great analysis, Duke. I just have one question for you: did the marks the teachers gave their pupils have any effect at all on the results?

From my reading of it, the pupil's subject ranking within the school appears to be the most significant factor before standardisation. I'm not clear on the significance, if any, of the mark given.


----------



## Duke of Marmalade (14 Sep 2020)

llgon said:


> Great analysis Duke. I just have one question for you; did the marks the teachers gave their pupils have any effect at all on the results?
> 
> From my reading of it the pupil's subject ranking within the school appears to the most significant factor before standardisation. I'm not clear of the significance, if any, of the mark given.


You’re right that the rank decided everything.  But I think the rank followed the marks.


----------



## llgon (14 Sep 2020)

So the teachers/schools could have given a rank without a mark and the outcome would be the same.  Most of the discussion in the media is about how the teachers' marks differed from the results, whereas the reality seems to be that the teachers' marks are pretty much irrelevant.

It is somewhat ironic that the students were given their marks today but not their ranking.


----------



## SPC100 (15 Sep 2020)

I don't think that's true.

I guess the model accepted anything within a certain window of performance (based on your class's JC results), and only adjusted marks if they fell outside that window.

Most grades were not changed.


----------



## Duke of Marmalade (15 Sep 2020)

llgon said:


> So the teachers/schools could have given a rank without a mark and the outcome would be the same.


No.  Let me try to explain why ranking was needed to allocate the final calculated scores.
Let's say it is thought appropriate to give a school 50% recognition for its Teachers' Assessments (TA) and 50% recognition for its Junior Cert Prediction (JCP).
The obvious first attempt would be to give each student the average of their own TA and their own JCP.  Two problems with that.  Firstly, not everybody has a JCP.  But much more seriously, each student's LC outcome would be 50% tied to how they did in the JC.  This was rejected as wholly unfair and inappropriate, and I fully agree with that.
But this would not be unfair in aggregate at the school level if the school was sufficiently big.  So what they did was assume that in aggregate the school would perform 50% in line with the TA and 50% in line with the JCP; not just in average outcome, but in the whole shape of how individual scores would be spread about that average.  This is what statisticians call a "distribution".
Having thus determined the school's aggregate calculated distribution, it remains to allocate calculated marks to individual students.  This is done by slotting them into the distribution based on how they ranked in their TA.
So note that the inflation inherent in the TA remains in this model, at least to the extent of 50%.  That inflation comes from marks, not from ranking.
Of course one might have contemplated using 100% JCP, in which case you would be correct and the teacher ranking would be all that was needed.  It would certainly have eliminated grade inflation.  Perhaps one for COVID 20.
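A toy version of the allocation just described, with invented marks, invented names and a flat 50:50 blend (the real process blends whole distributions, not five-point samples):

```python
# Blend the school's TA mark distribution 50:50 with its JCP distribution,
# order statistic by order statistic, then hand the blended marks back to
# students in TA rank order. All marks and names here are invented.
ta_marks  = {"Aoife": 88, "Brian": 81, "Ciara": 77, "Dara": 70, "Emer": 62}
jcp_marks = [74, 70, 66, 61, 55]   # modelled marks for this JC profile

ta_sorted  = sorted(ta_marks.values(), reverse=True)
jcp_sorted = sorted(jcp_marks, reverse=True)
blended    = [(t + j) / 2 for t, j in zip(ta_sorted, jcp_sorted)]

# Rank students by TA, then slot each into the blended distribution.
ranked = sorted(ta_marks, key=ta_marks.get, reverse=True)
calculated = dict(zip(ranked, blended))
# The top-ranked student gets the top blended mark, and the cohort average
# lands exactly halfway between the TA and JCP averages.
```

Note the students' own JC marks never appear; only their TA rank and the school-level blend do, which is exactly the point made earlier in the thread.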


----------



## NiallSparky (15 Sep 2020)

joe sod said:


> We risk completely downgrading our education system and demotivating good students when the system downgrades them based on statistical models than on their actual ability and knowledge



This essentially happens every leaving cert year anyway. Results are made to fit a normal distribution, so students get moved around based on statistical methods.


----------



## Duke of Marmalade (15 Sep 2020)

NiallSparky said:


> This essentially happens every leaving cert year anyway. Results are made to fit a normal distribution, so students get moved around based on statistical methods.


That is not correct.  There is an element of standardisation of the initial marks so as to keep a reasonably level standard year on year - students are not allowed to benefit en masse from an easy maths paper, say.  But this standardisation is done on a uniform basis with no attempt, intended or otherwise, at social engineering.


----------

