Thursday, November 8, 2012

Interesting post with important lesson

I just came across this blog post on an apparent relationship between obesity in the UK and Premiership revenues.

As the post shows using a scatter plot, the two series look impressively correlated, and the author points out that the R^2 is 0.93.  Taken at face value, either the Premiership's success causes more obesity, or the more obese people are, the more successful the Premiership is - not exactly the positive impact on health outcomes we might hope for!

However, further down the post, the author plots both series against time, and you can clearly see that they are highly non-stationary - i.e. they trend upwards. The technical lesson to be learnt here is that when two series are both non-stationary, any strong correlation between them will almost certainly be erroneous, or "spurious" - i.e. not really there. That's because the regression model doesn't include a time trend, so the explanatory variable, which itself closely resembles a time trend, takes its place and picks up the trend in the dependent variable.
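The point about the omitted time trend can be illustrated with a small simulation. What follows is only a minimal sketch, assuming two series that are simple linear trends plus independent noise; the data are simulated (not the obesity or Premiership figures), and the slopes, noise levels, sample size and function names are all made up for illustration. By construction the two series have nothing to do with each other beyond both rising over time.

```python
import random


def r_squared(x, y):
    """R^2 from a simple regression of y on x (the squared correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def detrend(series, t):
    """Residuals from regressing the series on a constant and a linear time trend."""
    n = len(t)
    mt, ms = sum(t) / n, sum(series) / n
    slope = (sum((a - mt) * (b - ms) for a, b in zip(t, series))
             / sum((a - mt) ** 2 for a in t))
    return [b - ms - slope * (a - mt) for a, b in zip(t, series)]


rng = random.Random(0)  # fixed seed so the run is repeatable
t = list(range(200))

# Two unrelated series: each is its own upward trend plus its own noise.
# (Assumed slopes and noise levels, chosen only for illustration.)
series_a = [0.10 * ti + rng.gauss(0, 1) for ti in t]
series_b = [0.20 * ti + rng.gauss(0, 1) for ti in t]

# Regressing one level on the other: the shared trend dominates.
r2_levels = r_squared(series_a, series_b)

# Removing a linear trend from each series first leaves only the noise.
r2_detrended = r_squared(detrend(series_a, t), detrend(series_b, t))

print(f"R^2 in levels:        {r2_levels:.3f}")
print(f"R^2 after detrending: {r2_detrended:.3f}")
```

In levels the R^2 should come out high even though the series are unrelated, while after the trend is removed from each it should fall close to zero. Detrending the two series first gives the same residual relationship as including a time trend in the regression, which is the fix described above.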

The less technical but equally important lesson is the one the blog author emphasises - correlation does not imply causality.  That's a fundamental lesson to always be aware of. On their own, economic data can tell us about correlation and nothing more. Only when combined with some economic theory can we start to get any sense of causality.
