Statistical Crap

Speaking of data (as in the previous post), the members of the American Statistical Association (ASA) probably know something about that topic. And they recently released a statement about the Value-Added Model (VAM) of teacher evaluation, a very popular reform among those top-level education experts we all love and respect.1

In theory, VAM is supposed to measure how much value a teacher has added to the learning of their students, using standardized test scores and some complex mathematics that is supposed to exclude other “non-relevant” factors from the final numbers. Many districts and states (including Virginia, to a small degree) are using or planning to use some variation of VAM as a teacher evaluation tool and to determine continued employment, pay raises, tenure, even whether to close schools.

The ASA statement is, as you might expect, very academic in its assessment of VAM, but it is still quite clear in its conclusion that this system is… how shall we put this? – statistical crap (my very non-academic interpretation of their 7-page report).

A few very relevant statements from the executive summary:

VAMs are complex statistical models, and high-level statistical expertise is needed to develop the models and interpret their results.

Expertise that is quite lacking in most schools, not to mention in pronouncements from supporters of this concept.2

Estimates from VAMs should always be accompanied by measures of precision and a discussion of the assumptions and possible limitations of the model. These limitations are particularly relevant if VAMs are used for high-stakes purposes.

Too many advocates of VAM treat the numbers as fact, not “estimates”, and are not open to any “possible limitations”.
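To show what a “measure of precision” looks like in practice, here is a minimal sketch in Python. It is not a VAM and not anything from the ASA statement. It simply takes one class's score gains and reports the average along with a rough 95% interval, assuming a normal approximation. The `gains` list and the function name `mean_with_interval` are made up for illustration.

```python
import math
import statistics


def mean_with_interval(gains, z=1.96):
    """Return (mean, low, high) for a list of score gains.

    Uses a normal-approximation interval: mean +/- z * standard error.
    This is a simplification for illustration, not a real VAM.
    """
    n = len(gains)
    mean = statistics.fmean(gains)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    standard_error = statistics.stdev(gains) / math.sqrt(n)
    return mean, mean - z * standard_error, mean + z * standard_error


# Hypothetical score gains for one class of 25 students (invented data).
gains = [4, -2, 7, 1, 0, 9, -5, 3, 6, -1, 2, 8, -3,
         5, 1, 0, 4, -4, 10, 2, 3, -2, 6, 1, 5]

mean, low, high = mean_with_interval(gains)
print(f"Average gain: {mean:.1f} (rough 95% interval: {low:.1f} to {high:.1f})")
```

With only a couple dozen students, the interval around the average is wide compared to the average itself, which is the kind of uncertainty a single reported number hides.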

And my favorites:

VAMs typically measure correlation, not causation: Effects — positive or negative — attributed to a teacher may actually be caused by other factors that are not captured in the model.

Under some conditions, VAM scores and rankings can change substantially when a different model or test is used, and a thorough analysis should be undertaken to evaluate the sensitivity of estimates to different models.

Most VAM studies find that teachers account for about 1% to 14% of the variability in test scores, and that the majority of opportunities for quality improvement are found in the system-level conditions. [my emphasis]

In other words, teacher quality, while important, is only one factor to consider in the very complex process of student learning, and in the far-less-than-perfect method of assessing that learning: standardized test scores.

That “majority of opportunities for quality improvement” will only come from making systemic changes to educational policies at the district, state, and national levels.
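For a feel of what “1% to 14% of the variability” means, here is a rough simulation in Python. Everything in it is an assumption chosen for illustration, not taken from the ASA report: each student's score is a teacher effect plus the combined effect of everything else, and the spread of “everything else” is set to three times the spread of the teacher effect, which puts the teacher share near 10%.

```python
import random
import statistics


def theoretical_teacher_share(teacher_sd, other_sd):
    """Share of total score variance due to teachers, if the two parts are independent."""
    return teacher_sd ** 2 / (teacher_sd ** 2 + other_sd ** 2)


def simulate_teacher_share(n_teachers=200, class_size=25,
                           teacher_sd=1.0, other_sd=3.0, seed=0):
    """Simulate scores as teacher effect + everything else; return the teacher share."""
    rng = random.Random(seed)
    # One effect per teacher (hypothetical, drawn from a normal distribution).
    teacher_effects = [rng.gauss(0, teacher_sd) for _ in range(n_teachers)]
    scores = []
    for effect in teacher_effects:
        for _ in range(class_size):
            # "Everything else": home, school, prior learning, test-day noise, etc.
            scores.append(effect + rng.gauss(0, other_sd))
    return statistics.pvariance(teacher_effects) / statistics.pvariance(scores)


expected = theoretical_teacher_share(1.0, 3.0)
simulated = simulate_teacher_share()
print(f"Expected teacher share:  {expected:.1%}")
print(f"Simulated teacher share: {simulated:.1%}")
```

Even in this toy setup, where the teacher effect is real and exactly known, roughly nine tenths of the variation in scores comes from somewhere else.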
