Friday, 28 January 2011

Chopping and changing

The new postscript to The Spirit Level finds Wilkinson and Pickett accusing their critics of “selectively removing countries on the grounds that they were outliers.” Outliers do indeed play an important part in several of The Spirit Level’s graphs. The correlation between inequality and homicide rests entirely on the USA being an extreme outlier. The correlation between inequality and obesity depends entirely on Japan and the USA being outliers (as well as the exclusion of Singapore, Hong Kong and South Korea, all of which have similar rates of obesity to Japan). The correlation with trust depends entirely on the Nordic nations being outliers.

The significance of this should not need underlining. To take homicide as an example, there is no evidence of a relationship between inequality and homicide when 22 of the 23 countries are studied. The 23rd country, the USA, has a much higher rate and pulls the regression line upwards dramatically. Using this distorted regression line as evidence that inequality causes murder means ignoring the data from 22 countries in favour of data from just one. There are many reasons why the USA has a high murder rate, but if inequality were the root cause, we would expect to see it affecting the other countries. It doesn’t, and excluding the USA as an outlier demonstrates this.

If we were presented with a graph showing low levels of participation in baseball in 22 countries but a much higher figure for the USA, few of us would conclude that there was a true causal relationship between inequality and baseball. Americans just play a lot more baseball. And yet, for several of The Spirit Level’s graphs, outlying data of this type are used as proof of a causal relationship despite the great majority of the countries being totally unaffected by the supposed cause.
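The way a single extreme point can manufacture a trend is easy to reproduce. What follows is a minimal sketch using invented numbers, not the figures from The Spirit Level: 22 hypothetical countries whose rate does not vary with inequality in any meaningful way, plus one hypothetical outlier that is both more unequal and has a far higher rate. The helper names (pearson_r, ols_slope) and every data value are assumptions made for illustration.

```python
# Synthetic, invented figures for illustration only; not data from The Spirit Level.

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def ols_slope(xs, ys):
    """Slope of the ordinary least-squares line of ys on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# 22 hypothetical countries: inequality rises steadily, the rate just
# alternates between two low values, so there is no real trend.
inequality = [3.0 + 0.2 * i for i in range(22)]
rate = [1.0 if i % 2 == 0 else 1.4 for i in range(22)]

# One hypothetical outlier: more unequal and with a much higher rate.
outlier_inequality, outlier_rate = 8.5, 6.0

r_without = pearson_r(inequality, rate)
r_with = pearson_r(inequality + [outlier_inequality], rate + [outlier_rate])
slope_without = ols_slope(inequality, rate)
slope_with = ols_slope(inequality + [outlier_inequality], rate + [outlier_rate])

print(f"22 countries: r = {r_without:.2f}, slope = {slope_without:.3f}")
print(f"23 countries: r = {r_with:.2f}, slope = {slope_with:.3f}")
```

In this toy data set the correlation among the 22 countries is close to zero, and adding the single outlier is enough to produce a moderate positive correlation and a much steeper line, even though nothing about the other 22 has changed.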

Wilkinson and Pickett feign ignorance about the importance of outliers. In their postscript, they portray testing for outliers as an underhand trick to exclude unfavourable data. It is, of course, nothing of the kind. The point of testing for outliers is not to “selectively remove countries” and then present the result as the ‘real’ graph, but to see if the relationship holds up without the outlier being present. In Beware False Prophets, Peter Saunders explains how and why statisticians use box plots to identify outliers. He then shows, as I do in this book, that the trend line for homicide is being thrown out by a single extreme outlier.

It is fantastically implausible to think that Wilkinson and Pickett are not aware of the importance of outliers in statistics. In fact, we know that they are because when they find a reasonably strong statistical relationship (for rates of imprisonment) they write: “Even if the USA and Singapore are excluded as outliers, the relationship is robust among the remaining countries.” They make no such guarantee of their other graphs, for the simple reason that they are not robust.
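For readers unfamiliar with the box plot test mentioned above, here is a rough sketch of the usual convention, in which a value is flagged if it lies more than 1.5 times the interquartile range beyond the quartiles. This is the common textbook rule and is only assumed to match the procedure Saunders describes; the function name tukey_outliers and the rates are invented for illustration.

```python
import statistics

# Synthetic, invented rates for illustration only; not real homicide data.
rates = [1.0] * 11 + [1.4] * 11 + [6.0]


def tukey_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    This is the conventional box-plot rule; k = 1.5 is the usual default.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


flagged = tukey_outliers(rates)
print("Flagged as outliers:", flagged)
```

The point of flagging a value is not to delete it from the record but to rerun the analysis without it and see whether the relationship survives.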

One of the dangers of not testing for outliers is that your trend line will become skewed and no longer reflect reality. Wilkinson and Pickett focus on their trend line to such an extent that they forget what the actual data are telling them. In the last chapter of The Spirit Level, Wilkinson and Pickett claim that if Britain reduced income inequality to the same level as Sweden, Finland, Japan and Norway, its murder rate would fall by 75%. This prediction goes far beyond what the data show. (Even if the association was real, their correlation coefficient tells them that inequality accounts for less than half the difference, and yet they assume it accounts for 100% of the difference—a very basic error.)

Worse still, they are basing their prediction entirely on their trend line, which tells them that Britain should have a much higher murder rate than it does. But that trend line has become hopelessly skewed by the USA. Britain actually has a lower murder rate than Sweden and Finland and has a lower murder rate than the average of those four ‘more equal’ nations.
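The point about the correlation coefficient can be made concrete. On the usual reading, the square of the correlation coefficient is the share of the variation in the outcome that a straight-line fit accounts for, so a coefficient has to exceed roughly 0.71 before the fit accounts for even half of it. The sketch below uses a hypothetical coefficient chosen purely for illustration; it is not the value reported in The Spirit Level.

```python
def share_of_variance_explained(r):
    """Share of variance a straight-line fit accounts for (r squared)."""
    return r * r


# Hypothetical correlation coefficient, assumed for illustration only.
hypothetical_r = 0.47

explained = share_of_variance_explained(hypothetical_r)
unexplained = 1.0 - explained

# The coefficient needed for the fit to account for exactly half the variance.
threshold_for_half = 0.5 ** 0.5

print(f"r = {hypothetical_r}: explained = {explained:.0%}, unexplained = {unexplained:.0%}")
print(f"r needed to explain half the variance: {threshold_for_half:.3f}")
```

A prediction that treats the whole gap between two countries as the product of inequality is, in effect, assuming a coefficient of 1.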

The irony of Wilkinson and Pickett accusing their critics of picking and choosing which countries to study will not be lost on readers of this book. Wilkinson was being criticised for his selective use of data long before The Spirit Level hit the shelves. Their justification for confining their analysis to 23 countries is that “these countries are on the flat part of the curve at the top right in Figure 1.1 on p. 7, where life expectancy is no longer related to differences in Gross National Income.” Quite so, and it was that very graph which first alerted me to the fact that Wilkinson and Pickett had excluded several countries. (The image below is a close-up of the richest countries in that graph with GDP increasing from left to right.)

South Korea, Hungary, Slovenia and the Czech Republic all appear on that graph as being as rich or richer than Portugal. It was not me, but Wilkinson and Pickett, who arbitrarily decided that Portugal was ‘rich enough’ to merit inclusion. All I have done in this book is include countries of comparable or greater wealth than Portugal as shown in Wilkinson and Pickett’s own graph. Without a convincing justification for why places like the Czech Republic and South Korea cannot be considered “rich market societies”, we must ask the next question: why do these societies conspicuously fail to fit Wilkinson and Pickett’s theory? The United Nations classes these countries as being of “very high human development”. Why doesn’t The Spirit Level?

Their insistence on never having “picked problems to suit our argument” is rather undermined by, for example, their focus on public foreign aid at the expense of private aid, or by their emphasis on imprisonment rather than crime. Their claim to “never pick and choose data points to suit our argument” is at odds with references 2 and 6 in The Spirit Level which show one year’s data being used for one graph and another year’s data being used for the next, even though the subject matter—life expectancy—is the same.

As for using “the same measures of inequality” (as they said they did in an article in Prospect magazine), they address this early in The Spirit Level, saying:

To avoid being accused of picking and choosing our measures, our approach in this book has been to take measures provided by official agencies rather than calculating our own.

This is no great claim to integrity. It would be very odd if they started developing their own bespoke measure of inequality. But if they really wished to “avoid being accused of picking and choosing” they would have used the same official measure throughout. In fact, they use no fewer than five different measures of inequality in The Spirit Level.

Having correctly explained to the reader that the Gini coefficient is “the most common measure” which is “favoured by economists”, they proceed to ignore the Gini in favour of comparing the top and bottom 20% when making international comparisons. They then switch to the Gini coefficient when looking at US states and then use a completely different measure when comparing working hours (p. 229). They then adopt a measure which compares the bottom and top 10% (p. 240) and, finally, in their new edition, measure inequality in reference to the top 1% (p. 296).

The effect of this chopping and changing can be seen by comparing the graph on page 240 to the graph on page 296 (of the new edition). The first graph shows that inequality in the USA has fallen since its peak in the early 1990s; the second graph shows that inequality in the USA rose sharply in the 1990s and peaked at the time of the 2008 recession. Wilkinson and Pickett’s aim in the postscript is to demonstrate a correlation between inequality and the financial crashes of 1929 and 2008. They write that “both crashes happened at the two peaks of inequality”. Either they have forgotten, or they are hoping the reader has forgotten, that they wrote in the previous chapter that inequality in the USA “peaked in the early 1990s”.

Whilst there is nothing wrong with using the share of wealth held by the top 1% as a measure of inequality, this is the only time it is used in The Spirit Level. This is unsurprising since under this measure Norway and Denmark are less equal than the USA. It does, however, demonstrate how Wilkinson and Pickett switch reference points to suit whatever argument they are making at the time.
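It is worth seeing how two perfectly respectable measures can tell opposite stories about the same society. The sketch below compares a top-to-bottom decile ratio with a top 1% share for two hypothetical years. All of the shares are invented for illustration and the two functions are simplified stand-ins, not the exact definitions used in The Spirit Level.

```python
# Invented income shares (percent of total income) for two hypothetical years.
# These are not real figures for the USA or any other country.
income_shares = {
    "year_a": {"bottom_10": 2.0, "top_10": 30.0, "top_1": 8.0},
    "year_b": {"bottom_10": 2.5, "top_10": 30.0, "top_1": 12.0},
}


def decile_ratio(shares):
    """Income share of the top 10% divided by that of the bottom 10%."""
    return shares["top_10"] / shares["bottom_10"]


def top_one_percent_share(shares):
    """Income share of the top 1%."""
    return shares["top_1"]


for year, shares in income_shares.items():
    print(
        f"{year}: decile ratio = {decile_ratio(shares):.1f}, "
        f"top 1% share = {top_one_percent_share(shares):.1f}%"
    )
```

In this toy example the decile ratio falls between the two years while the top 1% share rises, so the year in which inequality "peaked" depends entirely on which measure is picked.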


Carl V Phillips said...

Nice analysis. This strikes me as providing a good general rule: If someone changes their statistical methods for analyzing two (or more) comparisons that are similar then they should justify the change; if they seem to just be hoping that the reader does not notice, it is a safe bet that they are lying. (lying = speaking/writing/etc. in such a way that people are likely to believe something that you know to not be true)

Interesting example, baseball. It is also quite popular in Japan and Cuba. Probably the bigger outlier sports (in terms of US participation) are American "football" and basketball. Basketball, in particular, is arguably tied to wealth inequality: When a large part of the population lacks access to most of society's open spaces and facilities, they will gravitate to a sport that just requires a few hundred square feet of pavement, a metal post, and a ball. Even street versions of baseball and futbol, though they require even less equipment, require more safe open space than many of the urban poor have access to in unequal societies.

Konker said...

"The main conclusion of this paper is that income inequality, measured by the Gini index, has a significant and positive effect on the incidence of crime"

"when poverty falls more rapidly, either because income growth rises or the distribution of income improves, then crimes rates tend to fall"

From the World Bank at

Which is a sophisticated, rational, econometrics based analysis and reaches the OPPOSITE conclusion to yours.

Both Inequality and Poverty are correlated with crime.

I believe the World Bank - who by the way, have been criticized for causing poverty and inequality in the past.

To make your date more reliable you should increase the sample size. Also, given most of your countries are European some data tend to cluster and the small spread increases relative errors. Suggest that you either (1) include errors - I suspect it will render some charts inconclusive (2) Include more countries (3) include each state in the USA to avoid counting the USA as a single outlying data point - Do any of these and you will come to the same conclusions as the World Bank....unless of course you are motivated politically rather than an objective researcher

Snowdon said...

So you're portraying a study which has two co-authors who are affiliated with the World Bank as being the official position of the World Bank? Not a good start.

To make your date more reliable you should increase the sample size.

Assuming you mean 'data' rather than 'date', you've missed the whole point of The Spirit Level if you haven't noticed that limiting the sample size is the whole point of their book. If they included poorer countries, the whole thing would fall apart so quickly that even the most gullible punter would see through it.

Poorer countries are often very unequal. Crime is associated with poverty in poorer countries. The Spirit Level looks at rich countries and so do I. Inequality is not associated with crime amongst the countries featured in The Spirit Level. If you feel Wilkinson and Pickett should "increase the sample size" then tell them directly.

Thom A. said...

1) Interesting that you should choose unfree societies like Singapore and Hong Kong, which are not democratic - they are quasi-fascist city-states.

2) Whilst I admit there are a few problems (correlation does not always mean causation, and this may be true in some cases), there is too much evidence not to believe that a more equal society is a better one.

3) Furthermore, I notice how you cite ethnicity as a possible cause of crime. "...effectively disregard other variables such as absolute income, culture, history, ethnicity, geography, law, politics and climate" I am assuming that by ethnicity you mean race! Which only leaves me to conclude you believe people of other races are more likely to cause crime - or is that how it appears?

So I conclude you support fascistic states, and racist policies.

Or is that correlation not mean causation?

Snowdon said...

1) No they're not.

2) No there isn't.

3) No they're not.

Don't be so stupid.

Anonymous said...

Lies, damned lies and statistics...