Test-Retest (Un)reliability

After the relatively moderate amount of drama that occurred during the first sitting of the Big Test with my two elementary classes, I sat down to mark the children’s responses to my relatively fair test. I knew that some children excel to a greater extent than others – those that paid attention in class would, I thought, do better than those who didn’t. I was not expecting some of the results that came out, nor the response of the school to these results.

As expected, the marks were linked to how much the students paid attention in class. The student who paid the most attention in SPK1 (my first elementary class each day) achieved the highest mark. From there, the results did not match my expectations. I had expected my most distracted student to do the least well, but he did not. Instead, a student who I thought would do better managed to fail miserably, while my most distracted student just passed. The marks were slightly lower than I had expected, and I was disappointed. In my Review class (the one with just Jason and Chris), the marks were, overall, slightly better. They didn’t excel, but they didn’t fail either. With a slightly puzzled feeling, I went to my Korean supervisor, Liz, to inform her of the results and seek her advice as to what to do. Apart from the marks, one student had yet to write the test at all. I awaited her wise response. I was not entirely prepared for it.

She too was disappointed with the results from SPK1, and gave me one simple task – make everyone from SPK1 (except for the student who achieved the highest mark) write the test again. This would give the student that hadn’t written the test the chance to do so, and would give the two students who had done poorly an opportunity to improve their scores, apparently. I was bemused at this. This had happened before with my Review class when they did poorly on a small in-class test, so I was not entirely caught off-guard. I had thought that the Big Test would be different though. I thought it was named accurately – that it was to be the most accurate assessment of the children’s abilities, and that the results would be near-sacred and untouchable. I guess not.

I did not even have time to go through the test results or the test itself with the children. The next lesson I saw them, they simply had to repeat the Big Test. The exact same one. No changes, no new questions to assess their skills. Simply rewind, retake, re-do. Or, in the case of the student who hadn’t taken the test before that point, do for the first time while the other students were griping about having to re-do it.

So, when they arrived for class yesterday after their (hopefully) enjoyable long weekend, they were met by Liz and me. Liz gave them what sounded like a stern talking-to in Korean. She then instructed me to hand out the tests once more, and I did so. The children were in varying states of despair. The student who had yet to write the test and the one who didn’t have to write it were relatively chipper and upbeat. The two students who did poorly looked like they were sizing up who would take the first turn in trying to squeeze themselves out of the tiny window in my classroom and fall eight floors into the abyss. Nevertheless, they all completed the test, with varying degrees of effort and concern put into their answers. The student who had previously done the worst seemed beyond caring. He asked for my assistance with a few questions, but did not seem overly interested in my help. The student who only barely passed the last test put in slightly more effort, asking me to read numerous questions to make sure he understood what they were asking. And the student who had yet to write the test was clearly the most studious, working independently unless they were stumped and did not know what the question was asking.

And, once again, the marks reflected this. The student who didn’t put the effort into the test bombed again, only achieving one half-mark more than the previous attempt. The student who just passed the first time round managed to improve their mark slightly. And the student who had yet to write the test achieved the highest mark in the entire class.

It makes little sense to me why the students had to re-take the test. I would understand if they were all in emotional states entirely unconducive to testing, but none of them were. There was the normal childhood unwillingness to be tested, but nothing equal to the table-flipping antics of Jason. So the students achieved what I thought they would – there may have been small improvements because they got a stern talking-to from Liz, which caused them to focus slightly more for the first part of the test. There would have been far more improvement if I could have gone through the test and the mistakes made in it, and then given the children a new, similar test. This was not the way of doing things, though. So, we tried to make the best of an imperfect approach.

There were small improvements, but nothing to be overjoyed about. There were lessons to be learned, and I for one will try to learn those that I need to. As for the kids? Here’s hoping that they don’t have to write every single test they are given multiple times. When they are in our school for so little time as it is, spending two full days on tests takes away from things that they could be learning. And, ultimately, learning should be the goal, not simply bumping up test marks so that the parents will be less grim.

