Sunday, December 8, 2013

International Assessment Conclusions Called Into Question


PISA Conclusions Called into Question

Recently, a report came out on the international PISA (Programme for International Student Assessment) exam, which compares student achievement across countries. It showed American students’ achievement as stagnant or improving only slightly compared with other countries. I am a fan of the PISA assessment because it asks students to do more than regurgitate facts, which is the focus of current standardized tests in the United States. In contrast, PISA requires students to demonstrate their ability to think and solve problems. However, I am very concerned about the number of students who actually take the test and about the accuracy of the results and conclusions derived from the PISA exam. Here's why.

PISA is administered to a minimum of 150 randomly selected schools in each of 65 countries. All students classified as grade 11 in these schools take the PISA test. In the United States, with approximately 4 million juniors in over 27,000 high schools, this means that 6,000 juniors in 150 schools took the test. This represents 0.15% of the total population of eligible juniors and roughly 0.6% of all high schools in America. This number of test takers is insufficient, according to statistical methods, to draw generalizable conclusions, so the data derived from the test cannot be used to draw conclusions about overall achievement in the United States.
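The percentages above are simple ratios, and a minimal Python sketch makes them easy to check. The function name `share_percent` is my own, and the inputs are the approximate figures quoted in this post, not official counts. With exactly 27,000 schools, the school share works out to about 0.56%; the exact value depends on the school count used.

```python
def share_percent(sampled: int, total: int) -> float:
    """Return sampled / total expressed as a percentage."""
    return 100.0 * sampled / total


# Approximate figures quoted in this post (assumptions, not official counts).
us_juniors = 4_000_000
us_high_schools = 27_000
tested_students = 6_000
tested_schools = 150

student_share = share_percent(tested_students, us_juniors)
school_share = share_percent(tested_schools, us_high_schools)

print(f"Students tested: {student_share:.2f}% of juniors")
print(f"Schools tested:  {school_share:.2f}% of high schools")
```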

By comparison, Finland has a population of 5.4 million people. It has approximately 70,000 fifteen-year-olds who could take the PISA test. These students attend approximately 300 high schools. If 150 schools in Finland are required to take the PISA test, that is a sample of 50% of the schools. If the schools are of similar size, this represents approximately 35,000 students, which is 50% of the total population of fifteen-year-olds in Finland. This is a very strong sample size, and the results are certainly generalizable to all of the fifteen-year-olds in Finland.

China has an overall population of over 1.3 billion. Japan's population is 127 million. Korea has a population of 50 million. How can we draw comparisons about student achievement between countries if the sample size of students taking the test is statistically insignificant compared to the number of potential participants?

Certainly, those who examine the data closely will find that some countries have more than 150 schools that take the test. For example, over 900 schools in Canada participated. And with an overall population of 35 million, 900 schools is a nice sample size.

In sum, 510,000 students in 65 countries took the test, representing approximately 28 million eligible fifteen-year-olds in the world. This is a sample size of 1.8%.

Are you comfortable making a judgement about anything with evidence from 1.8% of the population? In Michigan, 1.8% of fifteen-year-olds represents about 2,500 of our 142,000 fifteen-year-old students. However, with the American percentage of participants at 0.15%, only about 213 of the eligible fifteen-year-olds in Michigan would have taken the test (although PISA will not disclose how many of Michigan’s students or schools actually participated). And roughly 0.6% of Michigan high schools represents 4 schools. Are you comfortable making a conclusion about the skills of our fifteen-year-olds based upon a sample size of 0.15% of the population? That's less than 1 in 100. That’s 15 out of 10,000 or 150 out of 100,000. Are you comfortable stating that the achievement of fifteen-year-olds in Michigan is stagnant based upon a test given in 4 schools? How many kids at your local high school took the PISA test? Not one Clarkston student took it. Not one.
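
The Michigan numbers come from applying a national or worldwide percentage to the state's fifteen-year-old population. Here is a small sketch of that scaling in Python. The helper names are hypothetical, the population figure is the one quoted in this post, and the result is only an estimate that assumes Michigan was sampled at the same rate as the country as a whole.

```python
def scaled_count(population: int, percent: float) -> int:
    """Estimate how many people a given percentage of a population represents."""
    return round(population * percent / 100)


def per_ten_thousand(percent: float) -> int:
    """Express a percentage as a count out of 10,000."""
    return round(percent * 100)


# Figure quoted in this post (an approximation, not an official count).
michigan_fifteen_year_olds = 142_000

at_world_rate = scaled_count(michigan_fifteen_year_olds, 1.8)
at_us_rate = scaled_count(michigan_fifteen_year_olds, 0.15)

print(f"At the worldwide rate (1.8%): about {at_world_rate} students")
print(f"At the U.S. rate (0.15%):     about {at_us_rate} students")
print(f"0.15% is {per_ten_thousand(0.15)} out of 10,000")
```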

I like the PISA test. I think it is a big improvement over the standardized tests our students currently take in Michigan. I think our state should consider using the PISA assessment with a statistically significant percentage of the population of fifteen-year-olds and then consider the results in comparison with other countries that have a statistically significant number of students take the test. Until then, I think it is negligent to draw any conclusions from such a small sample.