It’s hard to turn on the news or open a paper without finding a story, usually negative, about school test scores and the "dismal" state of public schools. Unfortunately, these articles tend to rely on the "easy" numbers: the averages and other statistics that a reporter can easily fit into a graphic. In his weekly online column, Jay Mathews joins educational gadfly Gerald W. Bracey in suggesting that people, especially reporters, need to dig deeper.
[Bracey’s] article, "Simpson’s Paradox and Other Statistical Mysteries," [in the February issue of the American School Board Journal, not yet available online] exposes a great gap in our coverage of test score results. With great regularity, mainstream newspapers like mine, as well as popular magazines and the big networks, report on the lack of improvement in our public schools. We use words like "stagnant" or "sluggish" or "static" or "flat" to describe the achievement levels as measured by the National Assessment of Educational Progress (NAEP), the federal government’s most important and most respected measure of U.S. schools.
Although there is much more to the concept, Simpson’s Paradox occurs when the scores of the group as a whole go one way while the scores of the subgroups go a different way. In the case of the NAEP and the SAT, Bracey shows that while the overall average score on these tests might be "flat", the scores of various minority groups have improved substantially over the past twenty or more years without changing the overall average much. One example Bracey uses is the NAEP reading test.
On the NAEP reading test, for instance, non-Hispanic white 17-year-olds had only a small improvement. They went from 291 points to 295 points, while the overall average went from 285 to 288 points. But African Americans in that same period jumped 26 points, from 238 to 264, and Hispanics increased 19 points, from 252 to 271.
So the scores of the largest group taking the test, whites, were essentially "flat", while the scores of two large groups of minority students showed sizable improvements. However, the average of all scores is still essentially flat. The majority’s small gain dominates the average, and the overall gain (3 points) is actually smaller than the gain of any group listed, which can happen when lower-scoring groups make up a growing share of the students tested. That’s the paradox. And considering the emphasis and spending that’s been devoted to improving minority test scores over the past twenty years, it looks like we’re getting what we paid for.
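For anyone who wants to see the arithmetic, here is a rough sketch in Python using the NAEP reading scores quoted above. The shares of test takers in each group are not given in Mathews’ column, so the ones below are invented purely for illustration, and the sketch ignores every group other than the three mentioned. It is not a reconstruction of Bracey’s analysis, just a demonstration of how a weighted average can behave.

```python
def overall_average(scores, shares):
    """Weighted mean of subgroup scores; shares are fractions that sum to 1."""
    assert abs(sum(shares.values()) - 1.0) < 1e-9
    return sum(scores[group] * shares[group] for group in scores)

# Subgroup scores quoted in the NAEP reading example above.
before = {"white": 291, "black": 238, "hispanic": 252}
after = {"white": 295, "black": 264, "hispanic": 271}

# HYPOTHETICAL shares of test takers, made up for this illustration.
# The real shares are not in the source.
shares_before = {"white": 0.85, "black": 0.10, "hispanic": 0.05}
shares_after = {"white": 0.70, "black": 0.15, "hispanic": 0.15}

# Gain of each subgroup on its own.
gains = {group: after[group] - before[group] for group in before}

# If the mix of students never changed, the overall gain would have to
# fall between the smallest and largest subgroup gains.
fixed_mix_gain = (overall_average(after, shares_before)
                  - overall_average(before, shares_before))

# If lower-scoring groups become a larger share of test takers, the
# overall gain can be smaller than every subgroup's gain.
shifting_mix_gain = (overall_average(after, shares_after)
                     - overall_average(before, shares_before))

print("Subgroup gains:", gains)
print("Overall gain, fixed mix:    %.2f" % fixed_mix_gain)
print("Overall gain, shifting mix: %.2f" % shifting_mix_gain)
```

With these made-up shares, the overall gain is about 7 points if the mix stays fixed but only about 3 points when the mix shifts, even though every group gained at least 4. That is the general shape of the effect, not a claim about the actual NAEP demographics.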
My explanation is far too simple to capture everything that’s going on. Read Mathews’ article for a much better perspective on what Bracey is talking about; I’ll link to Bracey’s article as soon as I can. In all this, please keep in mind that neither Mathews nor Bracey (nor I, for that matter) is saying that everything is fine with public schools and test scores and that the critics should back off. However, No Child Left Behind has placed major emphasis on making sure that all ethnic, socio-economic and other subgroups of students achieve at high levels. If that’s going to happen, people (and the reporters who write for them) must get past their math phobia and make a serious attempt to understand the statistics that lie behind the simple averages in the headlines.