Foreword by Anthony Watts:
© Dave Stephens
This article, written by the two Jeffs (Jeff C and Jeff Id), is one of the more technically complex essays ever presented on WUWT. It has been several days in the making. One of the goals I have with WUWT is to make sometimes difficult-to-understand science understandable to a wider audience. In this case the statistical analysis is rather difficult for the layman to comprehend, but I asked for (and got) an essay that was explained in terms I think many can grasp and understand. That being said, it is a long article, and you may have to read it more than once to fully grasp what has been presented here. Steve McIntyre of Climate Audit laid much of the groundwork for this essay, and from his work as well as this essay, it is becoming clearer that Steig et al (see "Warming of the Antarctic ice-sheet surface since the 1957 International Geophysical Year", Nature, Jan 22, 2009) isn't holding up well to rigorous tests, as demonstrated by McIntyre as well as in the essay below. Unfortunately, Steig's office has so far deferred several requests to provide the complete data sets needed to replicate and test his paper. Steig has left on a trip to Antarctica, and the remaining data is not "expected" to be available until his return.
To help lay readers understand the terminology used, here is a mini-glossary in advance:
RegEM - Regularized Expectation Maximization
PCA - Principal Components Analysis
PC - Principal Components
AWS - Automatic Weather Stations
One of the more difficult concepts is RegEM, an algorithm developed by Tapio Schneider in 2001. It's a form of expectation maximization (EM), which is a common and well understood method for infilling missing data. As we've previously noted on WUWT, many of the weather stations used in the Steig et al study had issues with being buried by snow, causing significant data gaps in the Antarctic record, and in some burial cases stations have been accidentally lost or confused with others at different lat/lons. Then of course there is the problem of coming up with trends for the entire Antarctic continent when most of the weather station data is from the periphery and the peninsula, with very little data from the interior.
Expectation Maximization is a method which uses a normal distribution to compute the most probable value for a missing piece of data, given the data that are available. Regularization is required when so much data is missing that the EM method won't solve on its own. That makes it a statistically dangerous technique to use, and as Kevin Trenberth, climate analysis chief at the National Center for Atmospheric Research, said in an e-mail: "It is hard to make data where none exist." (Source: MSNBC article) It is also valuable to note that one of the co-authors of Steig et al, Dr. Michael Mann, dabbles quite a bit in RegEM in this preparatory paper to Mann et al 2008, "Return of the Hockey Stick".
For those that prefer to print and read, I've made a PDF file of this article available here
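For readers who like to see an idea in code, here is a minimal Python sketch of the EM infilling idea described above. It is not Schneider's RegEM and not the code used by Steig et al. It is a toy illustration under stated assumptions: the data are treated as multivariate normal, every row has at least one observed value, no column is entirely missing, and the "regularization" is a simple ridge term added to the covariance of the observed stations. The function name `em_infill` and its parameters `ridge` and `n_iter` are hypothetical names chosen for this example.

```python
import numpy as np


def em_infill(X, ridge=0.0, n_iter=50):
    """Toy EM infilling (illustrative only, not RegEM itself).

    X      : 2-D array, rows = time steps, columns = stations, NaN = missing.
    ridge  : hypothetical regularization knob; added to the diagonal of the
             observed-station covariance before it is inverted.
    n_iter : fixed number of EM passes (no convergence test in this sketch).
    Assumes every row has at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    n, p = X.shape

    # Initial guess: fill each gap with its column (station) mean.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    resid = np.zeros((p, p))  # uncertainty carried by the infilled values

    for _ in range(n_iter):
        # M-step: re-estimate mean and covariance from the completed data.
        mu = filled.mean(axis=0)
        centered = filled - mu
        cov = (centered.T @ centered + resid) / n

        # E-step: regress the missing values in each row on the observed ones.
        resid = np.zeros((p, p))
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            S_oo = cov[np.ix_(o, o)] + ridge * np.eye(int(o.sum()))
            S_mo = cov[np.ix_(m, o)]
            B = np.linalg.solve(S_oo, S_mo.T).T  # regression coefficients
            filled[i, m] = mu[m] + B @ (filled[i, o] - mu[o])
            # Conditional covariance of the values that were filled in.
            resid[np.ix_(m, m)] += cov[np.ix_(m, m)] - B @ S_mo.T

    return filled


if __name__ == "__main__":
    # Synthetic example: three "stations" driven by one common signal.
    rng = np.random.default_rng(1)
    signal = rng.normal(size=100)
    demo = np.column_stack([signal, 2 * signal, -signal])
    demo = demo + 0.1 * rng.normal(size=demo.shape)
    demo[:20, 1] = np.nan  # pretend station 2 was buried for 20 steps
    print(em_infill(demo, ridge=0.01)[:5])
```

Note what the sketch does and does not do: the infilled numbers are regression estimates built from the stations that did report, so they can only be as good as the correlations between stations. With `ridge=0.0` and too few observations, the matrix solve can become unstable or fail, which is the situation regularization is meant to address.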