Wednesday, March 8, 2006

Verification statistics

Steve McIntyre is having a lot of fun.

As we have discussed many times, the fundamental scientific statement that is used to justify various global policies to fight the so-called "global warming" is the conjecture that the warming in the 20th century is unprecedented. The primary experimental evidence is based on reconstructions of temperatures in the past millennium.

We did not have thermometers 500 years ago. Instead, we must use "proxies" such as tree rings. The hypothesis behind this scheme is that a good estimate of the past temperatures can be obtained as a particular linear combination of the series of numbers extracted from these proxies. You try to find the linear combination that optimally reproduces the observed temperatures in the calibration period (probably something like 1850-2000) and then you apply the same linear combination of the proxies to guess the temperatures in the past, before we had any thermometers.
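
To make the scheme concrete, here is a minimal sketch of the calibrate-and-extrapolate idea in Python. All dimensions and the random "proxy" matrices below are invented for illustration; this is not the actual MBH98 algorithm, which additionally involves principal components and rescaling steps:

```python
# A minimal sketch of the proxy-calibration idea. Not the actual MBH98
# procedure; the proxy data and dimensions below are made up.
import numpy as np

rng = np.random.default_rng(0)

n_cal = 150          # hypothetical calibration years, e.g. 1850-2000
n_past = 850         # years before the instrumental record
n_proxies = 20       # tree rings, ice cores, corals, ...

# Rows are years, columns are standardized proxy series.
proxies_cal = rng.standard_normal((n_cal, n_proxies))
proxies_past = rng.standard_normal((n_past, n_proxies))
temp_cal = rng.standard_normal(n_cal)   # observed instrumental temperatures

# Find the linear combination of proxies that optimally reproduces the
# observed temperatures in the calibration period (least squares).
weights, *_ = np.linalg.lstsq(proxies_cal, temp_cal, rcond=None)

# Apply the SAME linear combination to the pre-instrumental proxy values
# to "reconstruct" the past temperatures.
temp_past_reconstructed = proxies_past @ weights
```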

Can this procedure be trusted? In order to answer this question, you need verification statistics, a certain kind of generalized correlation coefficients for multivariate linear regression. Steve McIntyre and Ross McKitrick have shown in their papers, especially in their latest paper in Geophysical Research Letters, that the statistical procedures used by Mann, Bradley, and Hughes (MBH98, MBH99) in their "hockey stick" papers are flawed. Quantitatively, this fact shows up through very poor values of the R2 verification statistic.

Although a theoretical physicist would always prefer the R2 statistic, there also exist alternative formulae to quantify the quality of a "model", such as the RE (reduction of error) statistic. The R2 statistic lies between 0 and 1; the RE statistic can take any value up to 1, including negative values. In both cases, a value below 0.2 indicates a poor model. In previous climate papers, R2 was widely used. However, because it turns out that the R2 coefficient may be very low for various reconstructions, R2 suddenly became politically incorrect, and some climate scientists even argue that it is "silly" to calculate R2 and that only RE should be looked at, because of something, and especially because its values are higher.
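
For the record, the standard formulas: the verification R2 is the squared correlation coefficient between the observed and reconstructed temperatures in the verification period, while RE compares the reconstruction's squared error to that of a trivial "model" that always predicts the calibration-period mean temperature. A sketch (the function names are mine, not taken from any of the papers):

```python
import numpy as np

def verification_r2(obs, rec):
    """Squared Pearson correlation between the observed and reconstructed
    temperatures in the verification period. Always between 0 and 1."""
    return np.corrcoef(obs, rec)[0, 1] ** 2

def reduction_of_error(obs, rec, calibration_mean):
    """RE statistic: 1 minus the ratio of the reconstruction's squared
    error to the squared error of a trivial 'model' that always predicts
    the calibration-period mean. RE = 1 is perfect; RE can be negative."""
    sse_model = np.sum((obs - rec) ** 2)
    sse_trivial = np.sum((obs - calibration_mean) ** 2)
    return 1.0 - sse_model / sse_trivial
```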

Because Ross McKitrick and Steve McIntyre published a paper showing that the results of MBH are statistically insignificant, and because global warming and the hockey stick are a kind of dogma for a certain segment of the climate scientists, these scientists have spent a significant portion of the last year or two on attempts to create and publish a paper that would invalidate the results of McKitrick and McIntyre. Otherwise, the state-of-the-art situation is that the hockey stick reconstruction has been shown to be an artifact of flawed statistical methods.

The paper of Ammann and Wahl could have become such a paper, one that could potentially save the most important part of the global warming theory. However, it turns out that according to Ammann and Wahl themselves, the R2 verification coefficients for the early steps of the MBH reconstruction are extremely low, just as McKitrick and McIntyre argued. The debate on that page attracted some people who are well educated in statistics. A typical interpretation of a low R2 statistic combined with a higher RE statistic is that one is dealing with overfitting: the "model" for calculating the past temperature depends on too many variables. At any rate, the predictions can't be trusted. The RE statistic is spuriously high only due to self-correlations of the proxies in the calibration period.
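
The overfitting diagnosis is easy to illustrate with a toy experiment: regress a temperature series on many series of pure noise "proxies". The calibration-period fit looks respectable simply because there are many free coefficients, yet the verification statistic collapses. All numbers below are invented; the snippet illustrates the statistical point, not any particular reconstruction:

```python
# Toy demonstration of overfitting: 40 pure-noise "proxies" fitted to a
# temperature series that contains no signal at all. The calibration fit
# looks good; the verification r2 is near zero.
import numpy as np

rng = np.random.default_rng(1)
n_cal, n_ver, n_proxies = 80, 50, 40    # hypothetical sizes

proxies = rng.standard_normal((n_cal + n_ver, n_proxies))
temp = rng.standard_normal(n_cal + n_ver)       # no real signal anywhere

X_cal, X_ver = proxies[:n_cal], proxies[n_cal:]
t_cal, t_ver = temp[:n_cal], temp[n_cal:]

# Fit 40 coefficients to 80 calibration points of noise.
w, *_ = np.linalg.lstsq(X_cal, t_cal, rcond=None)

r2_cal = np.corrcoef(t_cal, X_cal @ w)[0, 1] ** 2
r2_ver = np.corrcoef(t_ver, X_ver @ w)[0, 1] ** 2
print(f"calibration r2 = {r2_cal:.2f}, verification r2 = {r2_ver:.2f}")
# Typical output: calibration r2 around 0.5, verification r2 near 0.
```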

It seems that once you analyze the papers that were proposed as evidence for "extraordinary" warming in the 20th century, you will see that they are based on estimates of the temperature in the past millennium that look like worthless noise and guessing. You won't read these mathematical analyses in the media. Instead, the media will offer you irrational and hysterical whining of politicized scientists, politicians, and polar bears.