A (stupid) $100000 bet

I was going to ignore this, but since this is a blog and I’ve nothing better to write about, I thought I would comment on Doug Keenan’s $100000 challenge. If you want some insights into Doug Keenan, Richard Telford’s blog is a reasonable place to start. I’ve also written about some of his antics.

So, what is his big challenge? Well, it appears to be to identify (with 90% accuracy) which of his 1000 time series were simply random, and which have had a trend added to them. Doing so would, according to Doug

demonstrate, via statistical analysis, that the increase in global temperatures is probably not due to random natural variation.

Really? No, this is just silly. Doing so would simply demonstrate that one can identify which of a set of randomly generated time series have had a trend added to them. It will tell you absolutely nothing as to whether the increase in global temperature is due to random natural variation or not. If you want to establish this you would need to base your analysis on what could cause changes to the global temperatures, and try to establish the most likely explanation for the observations. You cannot do it using statistical analysis alone. This should be obvious to anyone with a modicum of understanding of the basics of data analysis.

We are also pretty certain that it’s not simply a random walk. You can read Tamino’s post if you want an explanation as to why it isn’t a random walk. We understand the energy flows pretty well, and the idea that global temperatures could randomly drift up, or down, is simply bizarre and represents a remarkable level of ignorance.

In principle, I should just ignore this as being silly, but it is actually particularly frustrating. It is very obviously complete nonsense. Anyone who promotes this, or sees it as somehow interesting, or worthwhile, is either somewhat clueless, or particularly dishonest. There isn’t really a third option. This is the kind of thing about which there should be little disagreement. Not dismissing this challenge as silly is why I object to the idea that there aren’t really people who deserve to be labelled as climate science deniers, since it is the epitomy of climate science denial. It’s why I think we need a better class of climate “sceptic”.

There are, however, some conclusions one might be able to draw from this whole episode

  • Doug Keenan has promoted this basic idea on numerous occasions and, on numerous occasions, people have explained why it is nonsense. Maybe Doug Keenan is simply particularly dense.
  • Possibly Doug Keenan thinks that all mainstream climate scientists are particularly dense and that they’ve been doing something fundamentally wrong for decades. However, given that they haven’t actually been doing what he seems to think they’ve been doing, this means that – at best – he’s simply savaging a strawman, and returns us to the point above.
  • Doug Keenan knows exactly what he’s doing, but thinks that all the climate science deniers to whom he’s trying to appeal are simply particularly dense. I might be insulted if I were them, since it’s pretty obviously nonsense. Of course, those who think this is a good idea would probably deny being climate science deniers, putting them into some kind of infinite loop of denial.
  • Some combination of the above.

Whatever conclusion one might draw, it does seem to put Doug Keenan’s comment into a different light. When he said

the best time-series analysts tend to be in finance. Time-series analysts in finance generally get paid 5–25 times as much as those in academia; so analysts in finance do naturally tend to be more skillful than those in academia—though there are exceptions.

I assumed he meant that there were exceptions in academia. I hadn’t appreciated that he might have been talking about himself. That seems more plausible, given recent evidence.

This entry was posted in Climate change, ClimateBall, Comedy, Science and tagged , , , , , , . Bookmark the permalink.

134 Responses to A (stupid) $100000 bet

  1. Hautbois says:

    Plus stubbornness, and maybe some entrenched ‘my field is the best way to explain anything’ variety of discipline rivalry.

  2. Maybe, although the idea that time series analysis alone can tell us much, would suggest a significant lack of understanding of basic data analysis and why we do it.

  3. Joshua says:

    None of you conclusions seem particularly plausible to me.

    I doubt that he’s dense of that he thinks that all climate scientists are dense. I don’t think that he thinks that “skeptics” are dense.

    Just because someone’s smart and/or knowledgeable doesn’t mean that they are particularly good at controlling for influences such as confirmation bias. I think that he’s a smart guy, who probably believes that he’s uncovered a way to show that mainstream climate scientists are (for the most part) biased in their thinking. He can feel good about himself for being so much smarter than other smart people, and he can draw conclusions that reinforce his ideological orientation.

    ==> “In principle, I should just ignore this as being silly, but it is actually particularly frustrating. ”

    I get why it’s frustrating…but this is the reality of the climate wars. Not to suggest that I’m all Zenlike myself, but probably the frustration comes from a hard to get rid of (unrealistic) hope/expectation that more people will overcome their biases.

  4. None of you conclusions seem particularly plausible to me.

    Ahh, I wasn’t being entirely serious.

  5. Joshua says:

    Hautbois –

    ==> “Plus stubbornness, and maybe some entrenched ‘my field is the best way to explain anything’ variety of discipline rivalry.”

    Yeah, that also seems to me like a plausible (partial) explanation.

  6. Joshua says:

    ==> “Ahh, I wasn’t being entirely serious.”

    Sorry I missed that. In my defense, your list of conclusions is a pretty good match for the explanations offered by a pretty large % ‘of climate combatants for why people disagree with them about climate change.

  7. I’m sure there are many explanations. I’m struggling to think of any that paint some kind of positive picture, though. The problem I have with the bias suggestion is that some of this is so obviously silly/wrong, that it’s very hard to see how even a bias can have that kind of influence. Normally a bias would produce a tendency to go in a particular direction, not promote something that even a small amount of basic understanding would indicate was horribly wrong.

  8. Keenan persists in neglecting the first step of any model selection procedure: determine which possible models are physically plausible.

  9. Richard,
    Indeed, that fairly simple step seems to be a bit beyond him.

  10. Joshua says:

    ==> ” The problem I have with the bias suggestion is that some of this is so obviously silly/wrong, that it’s very hard to see how even a bias can have that kind of influence.”

    I think it depends on your starting premises. Kind of like how the Earth being 6,000 years old can be a logical viewpoint if your starting premise is that the bible is the word of God. Or that 2 + 2 = 5.

    ==> “Normally a bias would produce a tendency to go in a particular direction, not promote something that even a small amount of basic understanding would indicate was horribly wrong.”

    I can’t really contextualize your position there to this particular statistical framework (due to my intellectual and informational limitations)…but….while I get how your logic makes the “bias explanation” seem implausible, I think it may well be the least implausible of explanations.

    To use the context of creationism again; what’s the explanation for why some smart, knowledgeable scientists have a completely different view than the mainstream on the viability of creationism or Intelligent Design?

    There is a lot of evidence that shows an influence of ideological viewpoint on how people diverge in interpreting evidence that they think is very conclusive. Put that together with DK (for example, Tea Partiers thinking that they are particularly well-informed about climate change in association with:1) having more extreme views on the subject relative to, even, mainstream Republicans and 2), displaying basic ignorance about many climate-related phenomena), and you have a pretty powerful explanatory force in play.

  11. Joshua says:

    BTW – an interesting link (IMO) on DK…

  12. Joshua,
    Maybe you’re just seeing this differently to me. You seem comfortable describing potentially smart people who believe in creationism as being biased. I might only agree with that to be polite.

  13. izen says:

    Denier zombie memes never die!
    It is deja vu all over again…

    Hasn’t the whole blind test of data to detect whether it is random/trended all been done before? Here is one from 2009 trying to debunk the pause, but I am sure I remember a previous version of giving the data blind to statisticians to judge and getting back the same answer – there IS a trend.

    http://www.yaleclimateconnections.org/2009/10/borenstein-reports-statisticians-reject-global-cooling-line/

    “Borenstein reported that in a blind test with their not knowing what the temperature data numbers stood for, the statisticians “found no true temperature declines over time” despite last year’s having been cooler than previous years.”

    Showing unequivocally that the temperature data is NOT random and contains a trend is clearly not a head shot.

  14. The Very Reverend Jebediah Hypotenuse says:


    Time-series analysts in finance generally get paid 5–25 times as much as those in academia; so analysts in finance do naturally tend to be more skillful than those in academia—though there are exceptions.

    Clearly, it would be in our best interests to get the executive board members from Fortune 500 companies to do everything that requires any skill.

    That will be really expensive – so everyone else will have to do all the unskilled work for nothing.

    It’ll be Tea-Party heaven.

  15. Willard says:

    > In principle, I should just ignore this as being silly […]

    And then there’s ClimateBall ™.

  16. izen says:

    @-Joshua
    ” I think that he’s a smart guy, who probably believes that he’s uncovered a way to show that mainstream climate scientists are (for the most part) biased in their thinking. ”

    Yes, supporting things our host describes as “very obviously complete nonsense”, “simply bizarre and represents a remarkable level of ignorance.” and “being silly” does not mean that Keenan is not smart. I find ATTP’s assumption it indicates density unconvincing.

    Looking at the site and the number series provokes a suspicion. The numbers are claimed to be “..generated as follows. First, 1000 random series were obtained …”

    Now ‘random’ can have a rather precise meaning, but it then adds in brackets –

    “(via trendless statistical models fit for global temperatures).”

    So it is a ‘random(?)’ sequence which is then modified to fit some unspecified autocorrelation pattern which are claimed to ‘fit’ global temperatures. It is beyond my maths chops but there is much discussion I gather about the various qualities of ARMA(3,1,0) etc.

    Next the challenge states-

    “Then, some randomly-chosen series had a trend added to them. Some trends were positive; others were negative. Some trends were deterministic; some were not. ”

    Clever, some could mean 2 out of the 1000, or 999. I have no idea what is implied by the difference in deterministic and not, trends. Finally

    “Each individual trend was 1°C/century (in magnitude)—which is greater than the trend claimed for global temperatures.”

    Perhaps not as this year looks to break the 1degC record. I wonder if the trends added to the randomly-chosen series cover the whole of a 135 sequence, or just a small part of it…

    I would assume that this 1000 sequences has already been checked by Keenan with the standard stats packages and he has assured himself that they do not give a winning answer, because he has rejected the many more sequences originally generated that he rejected by that method.
    Otherwise I would have to conclude ATTP is right!

    However perhaps it would be interesting to take the ACTUAL historical temperature record and see what differences exist between a few of Keenan’s random(sic) sequences and the real thing.

  17. Joshua says:

    Anders –

    ==> “How do we explain the beliefs of someone like Ben Carson? “

    I think that his accomplishments may be somewhat overblown, but I don’t think that someone flat out stupid could have attained his credentials. Full of shit about what he believes? Could be, but he’s been a 7th-Day Adventist for an awful long time for it to be plausible that it’s all just a cloak he puts on to garner evangelical support. Thinks all scientists are dopes? Hmmm. Could be, but I would think that over the course of his work and academic experiences he’s run across a lot of people who think differently from him on evolution but whose intellect he nonetheless respects. Thinks that all his supporters are dopes, and he’s just pulling a fast one? Seems unlikely, as those folks are his affinity group.

    Yet it seems obvious to me that his views on evolution don’t explain abundant evidence. How do I wrap my head around his belief that Darwin was inspired by the devil as he developed his theory of evolution?

    I don’t want to put too much weight on one person’s research findings, or on the validity of tests of scientific knowledge, but I find it pretty damn interesting that Kahan finds that views on evolution don’t correlate with science literacy.

    I’m not entirely comfortable with saying that identity-oriented bias is a complete explanation for Carson’s views on evolution, but I’m probably less uncomfortable with that explanation than any others I’ve run across.

  18. Joshua says:

    It’s always funny to read comments after I’ve posted them to find the creative ways I’ve managed to misspell something, jumble syntax, or screw up the formatting.

    That should have started with:

    ==> “You seem comfortable describing potentially smart people who believe in creationism as being biased”

    Followed by: “How do we explain the beliefs of someone like Ben Carson?” w/o italics.

    Sheece.

  19. Joshua,
    I worry that maybe we’re taking my bullet points a bit too seriously 🙂

    However, if you’re suggesting that intelligent people can hold views that are hard to justify, then I agree? I think that is probably quite common. Doing so doesn’t make them stupid. However, if you’re going to challenge others (as Keenan is apparently trying to do) then doing so in a way that is particularly silly probably falls into some kind of different category. I think there’s a difference between someone who chooses to hold views that are probably wrong, but that are motivated by their biases, and someone going “aha, I’m so clever, I’ve come up with a challenge that will show that everyone who disagrees with me is wrong”.

  20. Is Doug Keenan real?

    His obsessive belief in statistics in isolation of models of the world (physical, biological, sociological, or whatever), shows him to be a mediocre practitioner of statistics, even if he is a master of the mathematical techniques. Rather like a physicists who can solve a 2nd order partial differential equation, but given a specific physical problem has simply no clue how to setup the equation that needs to be solved.

    Goldman Sachs would far rather employ a top physicist than a run of the mill statistician e.g.
    https://www.aei.org/publication/brain-drain-to-wall-street-goldman-sachs-hires-god-particle-physicist/

    Context free statistics has no value. The choice of statistical method is intimately related to the anatomy of the problem at hand. Even a non-statistician like me gets that. Doug doesn’t, apparently.

    As the MetOffice has said in response to Doug Keegans pestering (which has cost the tax payer god knows what), they have said:

    “Our judgment that changes in temperature since 1850 are driven by human activity is based on information not just from the global temperature trend, or statistics, but also our knowledge of the way that the climate system works, how it responds to global fossil fuel emissions and observations of a wide range of other indicators, such as sea ice, glacier mass, sea level rise, etc.

    Using statistical tests in the absence of this other information is inappropriate, particularly when it is not possible to know, definitively, which is the most appropriate statistical model to use. In particular, a key test of an appropriate statistical model is that it agrees with everything we know about the system. Neither of the models discussed by Mr Keenan is adequate in this regard. On that basis, this conversation on statistical modelling is of little scientific merit.”

    http://blog.metoffice.gov.uk/tag/doug-keenan/

    This appears to be attention seeking and narcissism by Doug Keenan that deserves to be ignored. He is mostly being ignored. So ATTP, I ask:

    Should we feed the trolls?! Toss a coin to find the answer.

  21. Richard,

    So ATTP, I ask:

    Should we feed the trolls?! Toss a coin to find the answer.

    Possibly not, but sometimes it’s good to remind myself not to take this too seriously. It’s just silly blog stuff.

  22. izen says:

    @-ATTP
    ” I think there’s a difference between someone who chooses to hold views that are probably wrong, but that are motivated by their biases, and someone going “aha, I’m so clever, I’ve come up with a challenge that will show that everyone who disagrees with me is wrong”.

    I wonder if we CAN “chooses to hold views that are probably wrong,”. Our biases invoke a conviction our views are correct.

    That is what may motivate the behavior of thinking they have found a way show some, not everyone, perhaps only their fellow view-holders, that everyone who disagrees with them is wrong.

    I suspect what might be called the Wegman gambit, except it turns out he borrowed it from someone else…

    Select a random generator and seed that produces a disproportionate number of sequences with a strong trend.
    Apply a autocorrelation model that just happens to emphasise the trends.
    Add lots of +/- trends of similar scale to the inherent variations and random trends so that many/most of the sequences appear to have a trend.
    With a bit of Darwinian selection you end up with enough potential false positives and negatives in the carefully constructed set of sequences that the prize is unwinnable. Or at least a 1/10,000 chance if he is not going to lose money.
    And you can declare that scientists are unable to tell a trend from a random sequence matching global temperatures.

    However that is biased by serious D-K about the statistical possibilities of all this and a cynical nature.
    -grin-

  23. Eli Rabett says:

    Keenan’s nonsense about time series analysts pretty much labels him as a chartist. Chartists are the homeopaths of stock trading. They do well in a one way market or in a bubble until it bursts.

  24. ATTP


    “Possibly not, but sometimes it’s good to remind myself not to take this too seriously. It’s just silly blog stuff.”

    I found a more serious subject than Doug Keenan’s nonsense …

  25. Why don’t you guys just pay £10 to win £100,000? You don’t need to accept that the challenge has any bearing on climate change — it has not — but it is a great opportunity to make £99,990.

  26. Magma says:

    If viewed solely as a mathematical/statistical challenge to correctly identify 90% of trended or trendless series with a $100,000 payout (for the first correct entry) for a $10 entry.

    Even this would hinge on trust that the game isn’t rigged and that Keenan would honor payouts. His instructions to email entries to him along with a $10 payment, payment method unstated, isn’t that assuring. Placing $100,000 in trust with a third party who would evaluate the entries would at least go partway to verifying his intent, if not the relevance of his contest.

  27. Keenan hasn’t even specifically defined the currency.

  28. izen says:

    @-“Why don’t you guys just pay £10 to win £100,000?”

    Academia has no chance, it will be won by those highly paid Time-series analysts in finance.

  29. Richard Tol – why don’t you? After all you have to fund your (accepted) adaptation.

  30. Here’s how we set up a real bet, RichardT:

    A few people have asked about the details of the bet I’ve arranged with Bashkirtsev and Mashnich. Both Jim Giles (Nature) and David Adam (Guardian) basically got it right, but here it is again in full.

    We will compare global surface temperatures in 1998-2003 with those in 2012-2017 (6 year average in both cases), using the USA National Climatic Data Center data (which can currently be found here for annual and here for monthly analyses). (Update 13 Feb 2007: data seems to have moved to here.) (Update Jan 2010 now it’s here)
    If the temperature rises over this interval, B+M will pay me $10,000 (in total). If the temperature drops, I will pay them $10,000 (again, in total).

    It really is as simple as that.

    http://julesandjames.blogspot.com/2005/08/bet.html

    Hiding a burden of proof reversal under a so-called “bet” is a bit Gremlin-like.

    When will you or dear Douglas make a similar bet with James?

  31. snarkrates says:

    I think we need a new definition for stupid. After all, if one lacks intellect, we already have many nouns–idiot, moron, imbecile, lackwit, and my personal favorite: ignorant foodtube–and many adjectives–unintelligent, etc. However, when someone actively uses their intellect to fool themselves…now that is stupid. And this reflects what we’ve long known–that the more intelligent a person is, the stupider he/she can be.

    Doug Keenan may be a clever manipulator of statistical recipes, but his utter insistence that he is smarter than the entire community of folks who’ve dedicated their life to understanding Earth’s climate is what makes him stupid. If there is one thing I have learned it is this: If you think you are the smartest person around, you need to broaden your circle.

  32. Apparently Richard Tol is a concerned and bashful socialist. Unwilling to feather his nest with £100K (approximately, allegedly) when so many suffer, and unwilling to show off his genius that no one can deny, he leaves the field open to the poor unwashed masses who need the money.

    What a great man is he.

  33. This could have been my entire post

    Doug Keenan may be a clever manipulator of statistical recipes, but his utter insistence that he is smarter than the entire community of folks who’ve dedicated their life to understanding Earth’s climate is what makes him stupid.

  34. BBD says:

    snarkrates wrote:

    If there is one thing I have learned it is this: If you think you are the smartest person around, you need to broaden your circle.

    This is something I find particularly painful about the climate ‘debate’ and the internet. There it is: a broader circle.

  35. Eli Rabett says:

    One of the things that growing up in the real world teach you, is as Damon Runyon put it

    “Son, you are now going out into the wide, wide, world to make your own way, and it is a very good thing to do, as there are no more opportunities for you in this burg. I am only sorry that I am not able to bankroll you to a very large start, but not having any potatoes to give you, I am now going to stake you to some very valuable advice, which I personally collect in my years of experience around and about, and I hope and trust you will always bear this advice in mind. ‘Son, no matter how far you travel, or how smart you get always remember this: Some day, somewhere, a guy is going to come to you and show you a nice brand-new deck of cards on which the seal is never broken, and this guy is going to offer to bet you that the jack of spades will jump out of this deck and squirt cider in your ear. But son, do not bet him, for as sure as you do you are going to get an ear full of cider.'”

    Never bet on a proposition.

  36. Magma says:

    …and Then There’s Gremlins

    Richard Tol (Nov. 20): Why don’t you guys just pay £10 to win £100,000? You don’t need to accept that the challenge has any bearing on climate change — it has not — but it is a great opportunity to make £99,990.

    Doug Keenan: A prize of $100 000 (one hundred thousand U.S. dollars) will be awarded to the first person, or group of people, who correctly identifies at least 900 series: which series were generated by a trendless process and which were generated by a trending process.

    Richard Tol (Nov. 18): Your null hypothesis should therefore be that I do not make elementary errors.

    The real Dr. Richard Tol must be quite ticked off at this imposter’s repeated attempts to make him appear a buffoon.

  37. John Randall says:

    In a 2011 WSJ article, Kennan sees similarities between the outcomes of coin tosses and rolls of dice on the one hand and NASA’s surface-temperature history on the other hand. The article reminds me another article, “Climate Deniers Are Giving Us Skeptics a Bad Name,” by Fred Singer at http://www.independent.org/newsroom/article.asp?id=3263. Referring to “Doug Keenan knows exactly what he’s doing, but thinks that all the climate science deniers to whom he’s trying to appeal are simply particularly dense:” I wonder if Singer thinks Keenan is dense.

  38. And then there are audits:

    Coca-Cola Co. said it has been notified by the Internal Revenue Service that it owes $3.3 billion in federal income taxes, plus interest, after an audit found the company’s reported income from 2007 to 2009 should have been higher.

    http://www.marketwatch.com/story/coca-cola-owes-33-billion-in-taxes-over-transfer-pricing-2015-09-18

    That’s a lot of 100k bets.

  39. Tom Curtis says:

    Just for fun, I downloaded Keenan’s series and tested them for a linear trend. The trends ranged from -300.1 to 288.58 units per century (mean: -2.55, standard deviation: 75.227). Binned by 10 unit/century groupings, the trend distribution is like this:

    In that context, Keenan’s challenge is that we pick those, and only those series that differ by a tenth of a bin category from amongst all the others.

    Much of the discussion above appears to be premised on the idea that Keenan at least had the honesty to use a random walk process that mirrored that of actual temperature series. No such honesty is to be found in this challenge, however. The magnitude of the pseudo-trends shows Keenan does not hesitate to rort his own test.

    To put this in terms that the mathematically challenged can understand, Keenan’s purportedly “trendless” process generates series with a mean trend of 55.5 units per century for the absolute magnitude of the trends. It only counts as a trendless process mathematically because averaged across all series generated, the positive trends cancel out the negative trends so that the mean of all series is close to zero. That it is generated by a trendless process says nothing about the trends of the individual series which are typically more than 50 times the magnitude of the trend he adds to an unspecified number of the series, and which are purportedly supposed to be tested.

    Richard Tol’s treating of this as a genuine test proves (yet again) what a buffoon he has become.

  40. Magma says:

    @ Tom Curtis: I think you did something wrong. Calculating a simple linear trend with the ordinate values running 1:135 I obtained slopes ranging from -0.89 to 0.45. While I am not the trusting type, I don’t think Keenan would have been that transparent.

    I don’t know how to embed images in posts so I uploaded a histogram of the trends to Imgur.
    http://imgur.com/5blYlX4

  41. Magma says:

    Oh. So that’s how.

  42. Tom Curtis says:

    Magma, thankyou. Rechecking my spreadsheet I apparently failed to anchor the X values in the trend calculation, with the result that I was taking the trend of each series relative to the preceding series. Correcting for that, I get the following histogram (mean 0.03, st dev : 0.75):

    That clearly differs from your histogram. However, the maximum individual value in the array is 2.45 and the minimum -3.076. Further the trend of the maximum values is 1.6 (minimum values:-1.6). These check results seem more consistent with my new histogram than your old one:

    I will reserve further comment until you have double checked so that I can be confident my base statistics are correct.

  43. Lars Karlsson says:

    Somebody should double check Tom’s computations. However, the histogram makes sense. There seems to be three distinct peaks: around -1, 1 and 0. The two former would be for the distribution of the time series with added trend, and the latter for those without added trend. I suspect that the overlap of the distributions is too large to achieve a 90% accurate classification, though.

  44. I downloaded the data and looked at a few. I certainly found trends of 1C/century and bigger, so that seems more consistent with Tom’s than Magma’s – unless Magma’s is somehow using different units.

  45. guys — maybe I have submitted my solution already, maybe I’m still checking the final details

    what an opportunity: win $99990, make Keenan poorer, and pip Tol to the post

  46. Sam taylor says:

    His previous comment about the relative skills of those in academia and finance only serves to indicate that he’s as ignorant of economics as he is of geophysics. Wages are in no way a measure of relative skill gaps between professions, they are better thought of as the incentive required to get people to work in that profession. Plenty of undergraduates want to go on to become researchers because they enjoy the work, so the work itself is the incentive and due to the high competition pay tends not to be as high. Meanwhile most of my friends in finance went there specifically because of the money, because nobody ever grew up wanting to be a financial time series analyst and thus they need a bigger carrot to be tempted into doing so.

  47. Lars Karlsson says:

    Considering Tom’s histogram, trends lesser than -0.7 or larger than 0.7 should have a better than 0.5 chance of having a trend added, and the opposite for those inside the -0.7 to 0.7 interval. The further away from those limits, the better the odds. The question is: would that suffice to have a better than 1/10000 chance to guess right in at least 900 out of 1000 cases? I suppose one could estimate that if one could first estimate the separate distributions. That may also give a better indication of where the limits should be.

  48. dikranmarsupial says:

    RichardTol wrote “Why don’t you guys just pay £10 to win £100,000?

    Because we are not that naive. It is likely that the challenge is impossible by statistical means (possibly by design) and there is no chance of anyone winning £100,000. I’m surprised that you are still making this argument given that you conceded that the usual statistical tests have low statistical power (which means that getting 900 out of 1000 right is very unlikely).

    As I said, the was to show statistical expertise in this case is to notice that the challenge is ill posed, and probably impossible and devote energies to more interesting challenges, such as those on Kaggle.

    “You don’t need to accept that the challenge has any bearing on climate change — it has not”

    Utter nonsense, if you think that Keenan won’t use the outcome to further his claims about climate then you are rather naive.

    ” but it is a great opportunity to make £99,990.”

    No, it is a great opportunity to throw £10 away in order to help promulgate misunderstandings of both statistics and climate.

  49. No, it is a great opportunity to throw £10 away in order to help promulgate misunderstandings of both statistics and climate.

    Indeed, and my current null hypothesis is that Richard does make elementary errors and that his underlying goal is to promulgate misunderstandings of both statistics and climate.

  50. dikranmarsupial says:

    Lars, if I have calculated it correctly, you would need about an 85% chance of getting an individual classification correct (assuming independent Bernoulli trials) in order to have a better than 1/10000 chance to guess right in at least 900 out of 1000 cases. This is why the usual tests having low statistical power is a problem as you need to be confident in your ability to reject the null hypothesis when it is actually wrong in a test to be confident of getting the answer right, as well as not rejecting the null when it is true.

    Now if Keenan provided enough information to show that the challenge was technically feasible, then it might be worth considering. It is worth looking at the papers published describing the design of existing open challenges in statistics and machine learning to see how much work goes into ensuring that the challenge is meaningful and useful. I can’t see any sign of that in this case.

  51. Tom Curtis says:

    Lars, from Keenan’s description:
    1) We do not know the autoregressive process used to generate the random series, still less parameters used, but
    2) For series with added trends, the specified trend is said to be 1 unit per century. As the +/- 2 seems to be the limit of composite trends, that means +/-1 is approximately the limit of the original series. From this we can infer that the 12.5% of series in the text file with absolute values of trends greater than, or equal to 1.2 are all composite trends (22.4% for cut of of 1).
    However,
    3) for approximately 50% of series with added trend, the trend will be of opposite sign to that generated by initial random process. Therefore a high proportion (approx 50%) of series with added trends will have composite trends between -1 and 1. I doubt these will be distinguishable from the series with added trend added.

    To complicate matters, we do not know what portion of initially generated series had a trend added, although from considerations in point (2), it must be between 25 and 50%. Further, not all add trends are deterministic, and there is no reason to think the added deterministic trends are best described by a linear trend. Only that their linear trend will be -1 or 1.

  52. dikranmarsupial says:

    Tom, it is worth noting that if you need to look at the population of time series in order to win the challenge, that immediately means that it the challenge tells you nothing about what statistics can tell you about the actual climate, for which we have only a single time-series. The challenge really ought to specify that the contestants provide an algorithm for determining whether there is a trend or not, which produces the stated set of predictions for the 1,000 time series used in the challenge.

    “From this we can infer that the 12.5% of series in the text file with absolute values of trends greater than, or equal to 1.2 are all composite trends (22.4% for cut of of 1).”

    I’m not sure we can assume this as it is not clear that the random process is the same for all time-series.

    “Some trends were deterministic; some were not. Each individual trend was 1°C/century (in magnitude)—which is greater than the trend claimed for global temperatures. ”

    Not sure exactly what this means.

  53. Your understatement to qualify that bet as stupid shows a very impressive self control. Times series correlations are poor physics if no causality. What is very positive is the fact that deniers, contrarians, and skeptics now change very often their arguments and action ways behaving like future losers.

  54. Why doesn’t Richard Tol just stay silent and then let everyone know when he’s taken delivery of the $100,000? I, for one, will then be quite impressed.

  55. Magma says:

    @ Tom Curtis: Oh, the joys of back and forth peer review, especially for work done quickly and late at night. I forgot that the trend was per century, so when I looked at the data before plotting I selected the column with the larger numbers returned by polyfit, which was the p0 coefficient, not the p1 one.

    Here is my corrected version of the histogram which matches your second (peer review works!), where I’ve drawn some simple curves to approximate a Gaussian distribution for the positive, negative and untrended cases. The trick would be to identify which of the ~200 cases that fit in the overlapping regions are which while misclassifying no more than 100 of them. So taking a naive look at the challenge, it appears solvable. But the devil is in the details, and depends very much on the spread that Keenan added to his data. If I thought this was a legitimate contest I would fit the slopes with three Gaussian curves centered at -1, 0 and 1, use the fitted values for the width and amplitude to identify the most probable distribution ranges of the ambiguous cases, and then either carry out further analysis on those or select them according to visual appearance (what old geophysicists and engineers referred to as the Mark 1 Eyeball test).

    Maybe it’s worth $10. I’ll think about it.

  56. izen says:

    Those of us essentially inumerate have to simplify.

    Ignoring the detail of the sequences with its unknown provenance, and rounding out just the final values gives a list of 1000 numbers between -1, zero and +1. The original distribution of these values is unknown.

    Keenan claims to have altered an unknown number of these values by +/-1, the challenge is to correctly identify at least 9 out of 10 of all those he has altered.

    That is of course impossible.

    The implication is that the additional information in the sequences and distribution of fractional values can be used to derive the original distribution and the subsequent alterations by some statistical means. I suspect a Bayesian analysis would indicate that is inherently unachievable.

    And the failure is inevitably declared evidence that scientists cannot tell a climate trend from a random walk.

  57. That is of course impossible.

    Yes, it seems that way. It is interesting, though, that it seems fairly straightforward to establish that there are 3 populations – mean trend 0, mean trend +1, mean trend -1.

    And the failure is inevitably declared evidence that scientists cannot tell a climate trend from a random walk.

    Of course. It’s probably why my conclusions are almost certainly wrong.

  58. Magma says:

    Using Tom O’Haver’s peakfit function for MATLAB and appropriately binned slope data, I estimate that the simplest approach of splitting the three distributions where their tails intersect would result in ~120 misclassified cases, and a losing entry.

    If the ~250 cases in the regions of maximum overlap can be better classified using more sophisticated than this brief screening method, then the odds may improve. As it is, it looks as if Keenan may have carefully stacked the deck in his favor.

  59. dikranmarsupial says:

    Note it is possible that there are three different types of noise process as well, so the apparent clusters don’t necessarily correlate with the existence of a trend. For example the program could randomly generate some time-series with stationary noise and some with non-stationary noise.

  60. Those three gaussians look like three shells in a shell game. And isn’t there usually an accomplice pretending to be independent who tells everyone they could win a lot of money if they played the shell game? Hmm…

  61. Magma says:

    @ dikranmarsupial: I’m assuming a certain minimal level of good faith on Keenan’s part, even if he has failed to describe many aspects of his challenge clearly enough.

  62. As far as I can see, Doug Keenan has never, ever, paid any attention to the physical scientific evidence for climate change. Indeed, rather than revealing ‘the truth’, he seems to be blinded by statistics. http://www.informath.org/media/a42.htm

  63. dikranmarsupial says:

    @magma ISTR his argument is about random walks, so I would be surprised if all the noise processes used were stationary.

  64. Magma says:

    First, 1000 random series were obtained (via trendless statistical models fit for global temperatures). Then, some randomly-chosen series had a trend added to them. Some trends were positive; others were negative. Some trends were deterministic; some were not. Keenan

    Yes, that muddies the water a bit more.

  65. Can someone actually explain what Keenan means by the above. As I understand it, a deterministic trend would be something like

    y_t = \alpha t + \epsilon_t,

    while a stochastic trend would be something like

    y_t = \alpha + y_{t-1} + \epsilon_t.

    So, what does he mean by some were deterministic, some were not?

  66. izen says:

    @-“So, what does he mean by some were deterministic, some were not?”

    That he has rigged the pattern of the trends to cause the most false positive and negative results for any statistical test intended to separate random noise from random noise plus trend.

    The idea that a temperature record could have a non-deterministic trend seems odd.

  67. izen,

    That he has rigged the pattern of the trends to cause the most false positive and negative results for any statistical test intended to separate random noise from random noise plus trend.

    That was my impression too.

  68. To follow up on the issue of deterministic and non-deterministic trends, does what Keenan has said imply that his underlying random series are not necessarily all generated via the same process?

  69. Magma says:

    @ ATTP: it’s unclear, probably deliberately so. From Keenan’s challenge page: “First, 1000 random series were obtained (via trendless statistical models fit for global temperatures)”

    trendless statistical models — note the plural. Is the noise Gaussian, or did he use a heavier-tailed distribution? Did he mix distributions?

    I may look at the first differences to see if I can work things out a bit better. I’m hesitant to spend much time or effort on a contest that may be rigged to fail, but on the other hand even a rigged contest may be beaten if one can figure out how the numbers were calculated. That Keenan says he has provided the encrypted answers in advance makes it harder to move the goalposts after the fact.

  70. Gator says:

    Presumably one could enter 100 times and still make money. If you have a way of being fairly certain about most of the time series, and you only need to get within 90% right.

  71. izen says:

    @-Gator
    “Presumably one could enter 100 times and still make money.”

    Improbable. It looks as though there about half of the sequences with no trend and half with a trend. But it is unlikely that there are no misleading cases with apparent large or trends that are actually the result of the random sequence generation via the seed choice and the trendless statistical models. And near-zero trends which are the result of trends negating the random/statistical sequence drift.
    That still leaves around 200 sequences with an ambiguous trend. The possible combinations of altered/unaltered of that 200, of which you need to get 180 correct, are rather large.

    Entering 100 guesses would increase your chance of winning somewhat less than buying a 100 lottery tickets in a 6 number game.

  72. izen says:

    @-Magma
    “trendless statistical models — note the plural.”

    Yes. One possible chink in the ‘game’ design would be if the autocorrelation pattern imposed on the random sequence to fit global temperatures left a detectable signature. If that was broken by the addition of a trend then altered sequences could be identified.

    However the implication that multiple statistical models were used to modify the random numbers AND the trends are deterministic and nondeterministic(?) indicates Keenan has minimised that possibility. He would have to be dense or silly not to!

  73. Tom Curtis says:

    Magma:

    1) I would not spend any serious time on this unless Keenan first places the prize in trust releasable by an independent adjudicator. Absent that step, I see no reason to think Keenan would pay out, or even acknowledge that an entry satisfied the challenge.

    2) For the sake of argument, assume your three peaks do represent series generated by the trendless processes for the central peak, and series with an added (or subtracted) linear trend. In that case, the fact that the two side peaks have a greater width shows that some of the added trends (at least) include an error term. This may generate a statistical difference that would allow a way to tease apart those series with an added trend in the intermediate area – but I am not hopeful.

  74. Magma says:

    My final update on Keenan contest
    A long post, but I’ve followed this item more closely than I normally would for several reasons. I occasionally work on inverse problems, and this one seemed fairly interesting, with the additional lure of a $100,000 prize (but remember, lures are for fish and game). In addition I’ve long been interested in the sociology of global warming deniers and climate skeptics and how few of them have mastered even the basic technical tools needed to approach the field; Keenan seemed like he might be a partial exception.

    Contest details
    Following an email inquiry, the first successful entry as time-stamped by email wins. Entrants are sent a Paypal link for the entry fee. As to my question as to whether the prize funds had been placed in escrow or if it was based on the honor system Keenan replied: “It is on the honor system. My name would be mud if I reneged. My web site tells a little about me: I used to be a financial trader and stopped working in 1995; so I am economically independent.”

    Some mathematics
    The original time series closely but not perfectly resembled a random walk with added Gaussian noise (standard deviation 0.11), with ~200 of the series having a -0.01/step trend added, and ~200 having a +0.01/step trend added. This led to a distribution of the slopes of the time series showing a central pseudo-Gaussian peak at 0 with two smaller satellite peaks centered at -1 and +1, with something on the order of 250 individual series likely to be in an ambiguous area where the distributions of the slopes overlap.

    So making the best-case assumption than unbiased Gaussian distributions are involved rather than a heavier-tailed distribution or a biased selection from a larger group od randomly-generated series, a winning entry would hinge on 1) developing a statistical method to distinguish a trended from an untrended random walk where the slopes are identical, which I do not believe is possible, or 2) classifying time series from the two areas of overlap by chance (i.e., guessing), in the hope of getting no more than 100 wrong.

    So let’s look at that strategy, calculating cumulative binomial probability for a relevant range of cases falling in the overlap areas (my estimate of the likely number of cases bolded):
    #correct/#total probability
    100/200 52.8%
    110/210 26.2%
    120/220 10.0%
    130/230 2.78%
    140/240 0.58%
    150/250 0.094%
    160/260 0.012%

    So assuming a not-too-rigged contest, someone adopting that strategy and submitting several hundred to a couple of thousand different entries could potentially win… at the risk of losing a $2000 to $20,000 ‘investment’ if the mathematical challenge is deliberately biased, if an earlier entry wins, or if Keenan refuses to honor the contest.

    <Moving the goalposts
    However careful readers may notice I wrote “the original time series”. Keenan replaced the original one today, stating that “a few people pointed out to me that the PRNG I had used might not be good enough. In particular, it might be possible for researchers to win the Contest by exploiting weaknesses in the PRNG” and “Ergo, I regenerated the 1000 series using a stronger PRNG, together with some related changes.”

    Coincidentally enough, the “related changes” merge the three distributions more closely, to the point that they are essentially unresolvable. Rather than some far-fetched claim that someone would “exploit weaknesses” in a modern pseudorandom number generator, I strongly suspect Keenan realized the financial risk that even his ‘safe’ challenge exposed him to… who knows, maybe he even visited this website.

    Needless to say I will not be entering the contest, which, of course, was never particularly relevant to climate studies.

  75. Magma says:

    Having now retired from the fray, I look forward to reading the peer-reviewed article in which Prof. Tol explains the statistical analytical method(s) he developed to win the $100,000 Keenan Contest.

  76. Magma,
    I noticed that he had changed the 1000 series, but hadn’t appreciated what impact the change has had on the distribution. I presume the “some related changes” suggests he made some changes that make it harder to actually solve.

  77. BBD says:

    Oho! Cheeky.

  78. Sure, if you discover that your challenge that is meant to show that everyone else is an idiot is actually solveable, wouldn’t you change it too?

  79. verytallguy says:

    Shock as denier proves incapable of rigging statistics to prove own cleverness; moves goalposts and pretends nothing has changed.

    Whouda thunk?

  80. vtg,
    Took me by surprise too 😉

  81. Lars Karlsson says:

    So it turned out that Keenan’s challenge wasn’t stupid enough, and he had to change it.

    “Stupid is as stupid does.”

  82. Lars Karlsson says:

    Yes, it is clear that Keenan, while fixing the alleged random number generation problem, also took the opportunity to flatten the distributions a bit and make the overlap larger.

  83. Ethan Allen says:

    Data manipulation. Where’s Lamar Smith when you need him?

  84. Ethan Allen says:

    DK also dropped one significant digit from the updated time series. S/N ratio goes down accordingly!.

    So the real story here is, Denier Kook commits mass genocide on pseudo data time series.

  85. Michael Hauber says:

    There is one way that global temperature could be a random walk. This is if positive feedbacks are strong enough to balance the plank response. Then any random variation would move climate to a new state and there would be no tendancy for the plank response to push the climate back towards equilibrium. This would imply that climate sensitivity is infinite. Or if climate sensitivity is very large then we could be close enough to a random walk to not be able to tell the difference.

    In laymans terms, if the climate tends to warm or cool a lott if it has no reason to move, its going to warm or cool a lot more if we give it a reason to move…..

  86. dikranmarsupial says:

    “a few people pointed out to me that the PRNG I had used might not be good enough. In particular, it might be possible for researchers to win the Contest by exploiting weaknesses in the PRNG”

    This seems unlikely to me – how could the PRNG cause the three clusters in the original histogram that are not there in the revised one? Seems a bit fishy to me. Keenan should be required to publish the source code for the original dataset as well as the revised one, at the end of the challenge, to show that it was merely a PRNG issue.

  87. Lars Karlsson says:

    dikranmarsupial,
    Yes, it would be a remarkable coincidence if the disappearance of the three clusters visible in the original distribution were due to the changed PRNG. It seems more plausible that some parameters have been tweaked.

  88. Eli Rabett says:

    WATLWYNT (Where are the lawyers when you need them). Changing the rules in the middle of the game is prime Climateball. It also might subject Keenan to a legal challenge.

  89. dikranmarsupial says:

    I doubt anybody has done so, but one wonder what would happen if someone had submitted a winning entry prior to the change of dataset? ;o)

  90. Willard says:

    Unless the Keenan has hired lawyers to oversee and (gasp!) audit the overall process, I doubt it could be more than a PR stunt.

    Usually, bets can be settled by evidence obvious enough no-one can dispute it.

  91. 0^0 says:

    Change of goal posts in the middle of the game elevates Keenan’s statement

    “It is on the honor system. My name would be mud if I reneged…”

    To an interesting level.. How do you define “renege” in this context?

  92. My name would be mud

    I’m not sure why he thinks it isn’t already.

  93. buddyglass says:

    So this challenge prompted me to learn the rudiments of R. One happy side effect. I’m not finished yet, but I believe the approach I’m taking will allow me to estimate:

    1. The # of series that were modified (call this N).
    2. The probability that a given series was modified.
    3. From #2, the N series most likely to have been modified.
    4. From #2 and #3, the expected accuracy of a complete submission in which I label the N most likely modified series as “modified” and all the rest “not modified”.

    I very much suspect my “expected accuracy” will fall well below the required 90%. By design.

  94. I very much suspect my “expected accuracy” will fall well below the required 90%. By design.

    I very much suspect that you’re correct.

  95. buddyglass says:

    Questions for Magma & Tom Curtis, since you guys seem to have thought about this at a pretty deep level:

    Most of what I’ve seen so far deals with the slopes generated by doing a linear fit for each series. That makes sense. But what about using the mean instead, applied to some “tail” portion of each series (since the “tail” should show more pronounced evidence of a trend, if there is one)?

    For a series that has been modified in the “upward” direction we should expect the mean of, say, the last 20 values, to be (1.34 – 1.14)/2 higher (+/- noise) compared to the mean of the last 20 values from the “average” unmodified series. So a series whose mean-last-N is an much higher (or lower) than “normal” is presumably more likely to have had a trend applied.

    And we can expect the distribution of mean-last-N to have the same form as the distribution of the slopes. That is, a central normal distribution for the unmodified series mixed with two “shifted” normal distributions centered at +/- the delta values (which depend on the “N” in “last-N”).

    Might it be fruitful to *combine* a mean-last-N analysis with the slope analysis in order to more accurately identify the modified series?

    On a completely different note…since you’ve said you suspect the series are generated via random walk (which I assume means that each value is generated by applying a delta to the previous value), then would it make sense to look at the deltas and not the series themselves? That is, instead of analyzing x1, x2, x3, …, look at (x2-x1), (x3-x2), (x4-x3), …?

  96. Magma says:

    @buddyglass: I considered your last paragraph (the deltas or what I referred to as first differences) but only briefly. In a reply to a question on WUWT (or maybe another site) Keenan stated that the added +/-1°C per century was not necessarily linear, i.e. it could be stepped or change signs within the overall trend.

    I don’t think your local mean idea will work, but frankly haven’t given the matter any more thought since I dismissed the contest as fixed and unsolvable a few days ago. Even many of the Wattites smelled a rat, which is saying something.

  97. buddyglass says:

    Even if the trend adjustments aren’t applied evenly they should still show up in a mean of first differences, right? I will most likely enter the max number of times Keenan allows (which seems to be ~3 per his comments in the contest notes). I figure $30 is worth the potential notoriety winning would bring. Maybe it would allow me to switch careers. (From straight-up “software” to something more “data sciency”).

  98. buddy,
    I not convinced that Keenan would pay out, but if he does, I assume you’ll be rewarding those who helped you with this endeavour 🙂

  99. buddyglass says:

    Sure. If I win, $10k to the blog, $5k to Tom Curtis, $5k to magma. Or my name is mud. 🙂

  100. RB Wiley says:

    Keenan’s opinion that financial time-series analysts [read: ‘economists’] are the best was profoundly ‘smacked down’ by Nate Silver (The Signal and the Noise: Why So Many Predictions Fail – But Some Don’t, 1st Edition).

  101. dikranmarsupial says:

    Excellent post on this from Andrew Gelman (who has obviously read the discussion here at some point ;o), which pretty much validates what was said here about the challenge being un-winnable (even without considering random walk noise).

  102. Pingback: 2015 in review | …and Then There's Physics

  103. Magma says:

    To my surprise I see (h/t to Michael Mann on Twitter) that McGill’s Shaun Lovejoy and several colleagues have taken on this problem in much more detail and rigor than was done here, although their conclusions are similar. I’m not at all sure that Keenan’s challenge was worth the work put into the reply by Lovejoy et al., but of course that was a decision for the authors.

    Huffington Post: The $100,000 Giant Climate Fluctuation

    Lovejoy et al. (2016) GRL preprint: Giant natural fluctuation models and anthropogenic warming

    Supporting information for Lovejoy et al. (2016) Giant natural fluctuation models and anthropogenic warming

  104. Magma,
    Thanks, I hadn’t seen that. Very clever analysis – assuming I understand it. Basically, if Keenan’s timeseries are representations of Giant Natural Fluctuations that have produced the observed temperature change since the mid-1800s, then we’d expect ice ages every ~1000 years.

  105. Magma says:

    Lovejoy doesn’t write easy papers you just can just sail through, to be sure. Although he has presented some remarkably intuitive graphs, for example this one from a short discussion in EOS: https://eos.org/opinions/climate-closure/attachment/presentation9

  106. Yes, they’re certainly not easy. However, what did strike me was that despite all of the claims that climate science needs more independent statisticians (people like Keenan I imagine) what Lovejoy wrote seemed far more thorough and complete than anything I’ve seen from one of these supposedly expert statisticians who will step in to save climate science from all those scientists who supposedly don’t understand statistics.

  107. Keith McClary says:

    “Games of chance are those games whose outcome depends upon an element of chance, even though skill of the contestants may also be a factor influencing the outcome. A game that involve anything of monetary value, or upon which contestants may wager money is considered as gambling. Every game of chance involving money is a gamble. There are laws restricting or regulating the conduct of games of chance. Some games of chance may also involve a certain degree of skill. In some countries, chance games are illegal, or at least regulated, where skill games are not.”
    http://definitions.uslegal.com/g/games-of-chance/

  108. Pingback: La legge di Brandolini - Ocasapiens - Blog - Repubblica.it

  109. Magma says:

    And for the wrap-up, an excerpt from Lovejoy et al. (2016) Keenan’s Giant Natural Fluctuation model unveiled:
    As promised, on November 30th 2016, Keenan announced that none of the 33 contest submissions were winners and he unveiled the computer code that generated the 1000 series. His computer model was actually quite bizarre, it consisted in a random shuffling of each of 4 rather different submodels, one of which was actually an IPCC numerical model output! Two of the other three submodels were basically standard stochastic models with long range dependencies: an integrated ARMA model that gives standard Brownian motion at long time scales, and a fractional Brownian motion (fBm) model. The third submodel was a complicated homemade concoction but it also gives standard Brownian motion at long time scales. As predicted, all the submodels (and hence the overall model) had strong (power law) statistical dependencies.
    Things were actually even more complicated than this since Keenan spiced things up in a manner that is very difficult to theoretically analyze. In actual fact, he made 365 realizations of each submodel and for each (using a nontrivial “excision” procedure), the 32 realizations with the largest variability were thrown away. The resulting 3X333 “clipped” series were then added to the fourth submodel (the unique GCM output) to make the total up to 1000. The trends of the clipped submodels had a nonstandard – and nontrivial to analyze probability distribution.

    I doubt many ATTP readers will be surprised to learn it was a con all along.

  110. Magma says:

    (The preceding came out less legible than I intended.)

  111. Joshua says:

    ==> (The preceding came out less legible than I intended.) ==>

    Glad that never happened to me.

  112. Keenan has it on his site. His argument against Lovejoy’s paper is

    When doing statistical analysis, the first step is to choose a model of the process that generated the data. The IPCC did indeed choose a model. I have only claimed that the model used in the Contest is more realistic than the model chosen by the IPCC. Thus, if the Contest model is unrealistic (as it is), then the IPCC model is even more unrealistic. Hence, the IPCC model should not be used. Ergo, the statistical analyses in the IPCC Assessment Report are untenable, as the critique argues.

    Which just illustrates he’s clueless (or dishonest). The linear regression used by the IPCC is not a model; it’s a method for extracting information about the data, and only applies to the period over which we have data. It allows us to quantify the linear trend and the uncertainty, and to establish – in a formal sense – if we have warmed over some time period. Of course, Keenan can do a different type of analysis if he wishes, but I’ve yet to get him to explain what fitting some randomly generated data to the observations, tells us about the observations.

  113. Jaap says:

    He did explain that a long time ago. He also made an easy to follow summary of that in an article in the WSJ. Reprint here: http://www.informath.org/media/a42.htm

    The issue is that in order to claim that a time-series shows a statistical significant trend, one has to compare the observed trend with a reasonable model for the time series. To explain why he gives the example of comparing the same trend line with two models (coin toss and dice rolls) and for one model that same observed trend is significant and for the other it is not. He says the IPCC used an unrealistic model to compare with and when you do use a more realistic model the trend is not significant. This has been Keenan’s point all along, IMHO.

    Some quotes from above link:

    “The IPCC’s most-recent report on the scientific basis for global warming was published in 2007. Chapter 3 considers the global temperature series illustrated in Figure 1. The chapter’s principle conclusion is that the increase in global temperatures is extremely significant.
    To draw that conclusion, the IPCC had to make an assumption about the global temperature series. The assumption that it made is known as the “AR1” assumption (this is from the statistical concept of “first-order autoregression”). ”

    So the IPCC did choose a model that they used as the reference. It is a very simplistic and indeed quite unrealistic model. Also, the observed time series fails standard statistical checks whether it can be from such an AR1 model. And therefore saying that the observed data shows a significant trend when compared to an AR1 model is rather pointless.

    “Moreover, standard checks show that the global temperature series does not conform to the assumption made by the IPCC; one such check is discussed in a separate section below. Thus, the claim that the increase in global temperatures is significant—the principal conclusion of a major chapter of the IPCC report—was based on an assumption that is insupportable. More generally, the IPCC has failed to demonstrate that global temperatures are significantly increasing.”

    More details in: http://www.informath.org/media/a41/b8.pdf
    Some quotes:

    “A non-AR1 assumption dismissed by the IPCC chapter
    In 2005, two scientists at the U.S. Geological Survey, T.A. Cohn and H.F. Lins, published a research article that considers an assumption other than AR1. The article concludes that if the other assumption is valid, then the increase in global temperatures is not significant.”

    “What the Climate Change Science Program said about the AR1 assumption
    The CCSP report claims that AR1 is “an assumption that is a good approximation for most climate data”. The claim is given without any evidence, argumentation, or reference. In fact, methods for testing the claim — which demonstrate that the claim is false — are taught in introductory (undergraduate) courses in time series: for some textbooks, see the Bibliography.”

  114. Jaap,

    The issue is that in order to claim that a time-series shows a statistical significant trend, one has to compare the observed trend with a reasonable model for the time series. To explain why he gives the example of comparing the same trend line with two models (coin toss and dice rolls) and for one model that same observed trend is significant and for the other it is not. He says the IPCC used an unrealistic model to compare with and when you do use a more realistic model the trend is not significant. This has been Keenan’s point all along, IMHO.

    I know it’s been his point all along. It’s also been rubbish all along. Firstly, statistical significance is relative to a null; it doesn’t have to be relative to a model. For example, we can ask whether or not the data is consistent with no trend. He’s using significant to mean relative to some kind of random variation, not relative to whether or not there has been warming (there has).

    As far as his coin tosses and dice rolls, what he doesn’t acknowledge is that these are systems for which a simple random model is reasonable representation of the system. Therefore you can test whether or not your data is consistent with such a model to see if your data is from a fair coin/die, or not. Our climate is not random, therefore this is not an appropriate comparison. It also ignores that the point about fitting trends to the temperature data is not to determine if it’s consistent with a model, or not, it’s very simply to establish if there has been warming, or not.

  115. Jaap says:

    No indeed, climate is not completely random. But then neither are realistic underlying systems for other kinds of time-series. But they always contain randomness which is an intricate part of the underlying system.

    Look we know there has been warming since 1900 (or 1880). And we know there must have been a contribution to that from man made extra CO2 emission. The question is whether the observed trend is statistically significant compared to trends that such a system can produce at random.

    So yeah, he indeed is using significant to mean significant relative to some model (with a certain type of randomness, auto regression, trendiness, etc). that is *statistically* similar to the climate system. But IMHO that is correct. That is what significance means: significant deviation from what can be produced at random from a statistical similar model.

    In this case the question is whether the addition of something extra (man made extra CO2 which increases temperature) to the basic system (the normal climate system without that bit extra) does indeed have a significant impact. In order to be able to establish that you must compare the observed data (the normal climate system with that bit extra) with a reasonable statistical model of the climate system (the normal climate system without that bit extra).

    So I don’t see how it is valid to compare with true null (as in a flat line model or similar), instead you must compare with something more realistic. Realistic in the sense that at the very least the observed data must come from a model that can produce such a time series. That statistical model acts as the proper baseline (‘null’) you compare with in order to determine statistical significance.

    Compare this with financial time-series: we now know that they come from distributions that have fat-tails. That means that standard statistical tests for significance (which test versus the normal Gaussian assumption) are invalid. That is because the actual underlying system differs too much from normal (Gaussian) processes. E.g.: Normally (Gaussian) a 3-sigma or higher event is rare and a 5-sigma event is quite unlikely. But over the last 40 years we have seen many 5-sigma events (and worse) in financial time-series, impossible when assuming a normal Gaussian process. Only the last couple of decades or so do many/most practitioners in finance now realize that this is to be expected, simple because the underlying systems are not Gaussian, not even close.

  116. Jaap,

    The question is whether the observed trend is statistically significant compared to trends that such a system can produce at random.

    No, it’s not. This is the key point. OLS is not asking whether or not the trend is significant with respect to what could be produced at random. It is simply asking whether or not it has been warming. That is all. Keenan’s argument is simply a massive strawman.

    In order to be able to establish that you must compare the observed data (the normal climate system with that bit extra) with a reasonable statistical model of the climate system (the normal climate system without that bit extra).

    No, you need to compare it with a physical model of the climate. There is no statistical model that can, alone, answer this question. Even the bits that we regard as random require a physical model, not a statistical model (well, you could use statistical model, but it would have to be constrained by what we knew to be physically plausible).

    Compare this with financial time-series: we now know that they come from distributions that have fat-tails.

    Why? Financial time-series are largely irrelevant. Our climate is a physical system that obeys the laws of physics. Unless your model incorporates these laws, it can’t be used to understand what may, or may not, be causing the observed changes.

  117. I missed this bit

    So I don’t see how it is valid to compare with true null (as in a flat line model or similar), instead you must compare with something more realistic.

    You can do whatever you like. The point is that basic OLS (with uncertainties) is essentially determining a trend and an uncertainty in that trend. You can then ask the question as to whether or not this is statistically different from no trend. You could also ask whether or not it is statistically different from an even faster warming trend. If you want to compare it with something more realistic, that’s also fine. However, if you’re going to critique what others have done, you need to at least critique what they actually did, not make up something they didn’t do and then criticise that.

    The basic OLS analysis tells us that we’re warming. That is all. It doesn’t tell us why we’re warming, or anything about the underlying causes. That does indeed require comparison with actual models, but they need to be models that incorporate the laws of physics, not models that are simply statistical.

  118. Marco says:

    This:
    “…relative to some model (with a certain type of randomness, auto regression, trendiness, etc). that is *statistically* similar to the climate system.”
    is just plain inane. What Keenan IMHO has done is to first mine for a model that did something similar as the climate (read: surface temperature) has done (and thus to some extent happens to fit with the *physics*), and then declared this model the null.

  119. Jaap says:

    “The basic OLS analysis tells us that we’re warming. That is all. It doesn’t tell us why we’re warming, or anything about the underlying causes. ”

    Sure we’re warming. We do not need to do OLS or similar for that 🙂
    That’s not the issue, the issue is what cause the trend. If we can prove that the trend can’t happen by mere chance (i.e. not due to the climate model when left alone), then we have proof that it must be man made, most likely CO2.
    And yes we can, even without knowing the actual underlying physical system.

    “What Keenan IMHO has done is to first mine for a model that did something similar as the climate …”
    Yep.
    But that was actually the second thing. The first was that he debunked the AR(1) assumption.
    The short term climate data that he used (GISTEMP, 1881–2009) does not fit with an AR1 assumption.
    Thereafter he indeed searched (or mined) for something that did (he used ‘a driftless ARIMA(3,1,0) model’ in http://www.informath.org/media/a41/b8.pdf) that could fit.
    That doesn’t mean that it is the correct statistical model, just one that fitted. Based on AR1 the trend is significant, based on that ARIMA model it isn’t.
    That doesn’t mean that the assumption that the global warming trend is man made is disproved, it merely means that AR1 is an invalid assumption. There may be other assumptions that are both valid and still show a significant trend.

    Now read this bit (also quoted above AFAIK): http://www.physics.mcgill.ca/~gang/eprints/eprintLovejoy/neweprint/GRL54794.proof.only.SL.8.8.16.pdf
    They seem to get it. The only issue is that they are using that fake data set of the 1000 series, which is not only completely fabricated but also a needlessly complex mixture in order to hide the actual types of model used (for that challenge you know).
    But they do the correct check: after comparing with short term climate (ahum surface temp) they then checked whether it was valid longer term. They concluded it wasn’t because those 1000 series data exhibit too many extremes, resulting in ice ages every 1000 years (brrr). Clearly a bit too much.
    So that is the real challenge: find a statistical model (not AR(1) nor ARIMA(3,1,0)) that is valid for both the measured short term data set, yet remains feasible longer term. The simpler the better.

  120. Jaap,

    If we can prove that the trend can’t happen by mere chance (i.e. not due to the climate model when left alone), then we have proof that it must be man made, most likely CO2.
    And yes we can, even without knowing the actual underlying physical system.

    No, you can’t. Seriously, this is complete and utter nonsense. You cannot make any claim about the cause using only a statistical model. This is not even complicated.

    But that was actually the second thing. The first was that he debunked the AR(1) assumption.

    There isn’t really an AR(1) assumption. This simply relates to the correlation between the residuals when determining the uncertainty in the trend. There is no suggestion that AR(1) somehow represent the randomness of the timeseries.

    Thereafter he indeed searched (or mined) for something that did (he used ‘a driftless ARIMA(3,1,0) model’ in http://www.informath.org/media/a41/b8.pdf) that could fit.

    So what? Finding something that fits tells you absolutely nothing about the cause of the warming.

    So that is the real challenge: find a statistical model (not AR(1) nor ARIMA(3,1,0)) that is valid for both the measured short term data set, yet remains feasible longer term.

    Again, even if you could do this, it would not tell you that the system was random. We know it’s not random. It obeys the laws of physics. If your model does not satisfy the laws of physics, then it is nonsense.

  121. Jaap says:

    I beg to differ. Nothing is random, yet everything is random. It’s all a matter of complexity and scale.
    Seen from higher levels the lower level physical processes often behave in ways which can only be effectively described using stochastics, i.e. as if they were truly random.
    Certainly in complex physical systems like thermodynamics, weather prediction and yes climate.
    Whether something like true randomness exists at the lowest level is AFAIK still a matter of some debate. Many quantum physicists seem to believe that, but Einstein didn’t like that idea: “God doesn’t gamble”.

    The understanding that one should use stochastics in modelling the climate and climate change may have come a bit late, and perhaps wasn’t studied enough even recently, but in fact scientists have been doing that for decades. One of the first that I’m aware of is Hasselman, 1976 (http://empslocal.ex.ac.uk/people/staff/gv219/classics.d/Hasselmann76.pdf).
    And more do so every single day.

    So read up, below are some random links and quotes.
    Or just Google ‘Stochastics climate’ 🙂

    1999, Models for stochastic climate prediction

    Click to access 14687.full.pdf

    2005, Modelling climate change: the role of unresolved processes
    http://rsta.royalsocietypublishing.org/content/363/1837/2931?utm_source=TrendMD&utm_medium=cpc&utm_campaign=Philosophical_Transactions_A_TrendMD_0
    Abstract
    Our understanding of the climate system has been revolutionized recently, by the development of sophisticated computer models. The predictions of such models are used to formulate international protocols, intended to mitigate the severity of global warming and its impacts. Yet, these models are not perfect representations of reality, because they remove from explicit consideration many physical processes which are known to be key aspects of the climate system, but which are too small or fast to be modelled. The purpose of this paper is to give a personal perspective of the current state of knowledge regarding the problem of unresolved scales in climate models. A recent novel solution to the problem is discussed, in which it is proposed, somewhat counter-intuitively, that the performance of models may be improved by adding random noise to represent the unresolved processes.

    2008, An applied mathematics perspective on stochastic modelling for climate
    http://rsta.royalsocietypublishing.org/content/366/1875/2427

    2008, Introduction. Stochastic physics and climate modelling
    http://rsta.royalsocietypublishing.org/content/366/1875/2419
    Abstract
    Finite computing resources limit the spatial resolution of state-of-the-art global climate simulations to hundreds of kilometres. In neither the atmosphere nor the ocean are small-scale processes such as convection, clouds and ocean eddies properly represented. Climate simulations are known to depend, sometimes quite strongly, on the resulting bulk-formula representation of unresolved processes. Stochastic physics schemes within weather and climate models have the potential to represent the dynamical effects of unresolved scales in ways which conventional bulk-formula representations are incapable of so doing. The application of stochastic physics to climate modelling is a rapidly advancing, important and innovative topic. The latest research findings are gathered together in the Theme Issue for which this paper serves as the introduction.

    2014, Stochastic Climate Theory and Modelling
    https://arxiv.org/abs/1409.0423

    2015, Stochastic climate theory and modeling

    Click to access wcc318.pdf

    Abstract
    Stochastic methods are a crucial area in contemporary climate research and are
    increasingly being used in comprehensive weather and climate prediction mod-
    els as well as reduced order climate models. Stochastic methods are used as
    subgrid-scale parameterizations (SSPs) as well as for model error representation,
    uncertainty quantification, data assimilation, and ensemble prediction. The need
    to use stochastic approaches in weather and climate models arises because we
    still cannot resolve all necessary processes and scales in comprehensive numeri-
    cal weather and climate prediction models. In many practical applications one is
    mainly interested in the largest and potentially predictable scales and not neces-
    sarily in the small and fast scales. For instance, reduced order models can simulate
    and predict large-scale modes. Statistical mechanics and dynamical systems theory
    suggest that in reduced order models the impact of unresolved degrees of freedom
    can be represented by suitable combinations of deterministic and stochastic com-
    ponents and non-Markovian (memory) terms. Stochastic approaches in numerical
    weather and climate prediction models also lead to the reduction of model biases.
    Hence, there is a clear need for systematic stochastic approaches in weather and
    climate modeling. In this review, we present evidence for stochastic effects in labo-
    ratory experiments. Then we provide an overview of stochastic climate theory from
    an applied mathematics perspective. We also survey the current use of stochas-
    tic methods in comprehensive weather and climate prediction models and show
    that stochastic parameterizations have the potential to remedy many of the current
    biases in these comprehensive models.

  122. Jaap,
    This is going nowehere relatively fast. Let me make two key points.

    1. What Doug Keenan calls the Met Office model is not really a model. It is simply a data analysis method. It allows one to determine the properties of the dataset being considered, such as the linear trend and the uncertainty in that trend (it is descriptive, rather than inferential, statistics). It can tell us nothing about what is causing what is observed, it can tells us nothing about what happened befored we had data, and nothing about what will happen in the future. Keenan’s argument against the Met Office is simply a strawman.

    2. If you want to understand the causes of the warming, you need a physically-motivated model. A statistical model, alone, can tell you nothing. It might allow you to identify patterns, but – by itself – it can tell you absolutely nothing about what is causing the warming.

    Nothing is random, yet everything is random. It’s all a matter of complexity and scale.
    Seen from higher levels the lower level physical processes often behave in ways which can only be effectively described using stochastics, i.e. as if they were truly random.

    Yes, but this does not mean that if we can find some randomly generated time series that matches the observations, that the observations are indicating some kind of random process. We understand the underlying physics of our climate, and what’s been observed cannot be explained by some kind of natural random variation. This would violate the laws of physics.

    You also need to read your references more closely. They’re not arguing that climate change is random, they’re simply arguing that one can incorporate stochastic processes into models to improve the accuracy. Clearly some of the underlying processes do behave in a way that means that they could be modelled using stochastic processes. This does not, however, mean that the observed changes are random.

  123. Hyperactive Hydrologist says:

    Stochastic down scaling is widely used method for down scaling GCMs for regional impact studies. UKCP09 is an example of stochastic climate modelling which is widely used in the UK for climate change risk assessments, flood modelling and environmental impact assessments among others.

    http://ukclimateprojections.metoffice.gov.uk/21678

  124. Jaap says:

    >> You might like to read some of Richard Telford’s posts.

    Yeah I’ve read them :).
    I would agree with Richard that Keenan’s ARIMA alternative is too generic, if only because then almost anything fits.
    And also, as I think I also pointed out, that Keenan may reject AR1 for good reasons, but that doesn’t mean that his ARIMA model is a better alternative.
    There likely will be a better alternative which is neither of those two, or as Richard says (https://quantpalaeo.wordpress.com/2013/05/28/testing-doug-keenans-methods/):

    “Keenan should remember that AIC can only indicate the best model, from a purely statistical point of view, of those tested — there may be a much better but untested model.”

    >> Yes, but this does not mean that if we can find some randomly generated time series that matches the observations, that the observations are indicating some kind of random process.

    Yes, we know is not really random, but that is not the point.
    The point is that once we use statistical techniques on the data series of that process we have to use the proper tools. And which tools one can use depends on which class of (*pseudo*)randomness that process belongs to.

    When calculating the trendline the IPCC used GLS (and REML), assuming AR(1). CCSP used OLS, also assuming AR(1).
    That doesn’t mean that they think the underlying process is truly random, just that it is close enough they can use a certain method (OLS or GLS assuming that the residuals conform to AR1) for the calculation of a statistic (the trend) which is only valid for all processes which are like AR1.

    Keenan’s issue was that they didn’t check whether it was valid to assume AR1 when calculating their trendline. He then states that it wasn’t because the dataset fails statistical tests for AR1.

    >> You also need to read your references more closely. They’re not arguing that climate change is random, they’re simply arguing that one can incorporate stochastic processes into models to improve the accuracy.

    I did read them, and I agree with them.
    I’m not saying climate change is random.
    I was just arguing that you better model the unexplained parts of your physical models using stochastics. And that it was quite common to do so now.

    Calculating a statistic over the whole dataset is quite similar, you are then treating the entire model, and not just the unexplained parts, as a blackbox of a certain class.

  125. Jaap,

    The point is that once we use statistical techniques on the data series of that process we have to use the proper tools. And which tools one can use depends on which class of (*pseudo*)randomness that process belongs to.

    Okay, here’s a question I’ve asked Doug Keenan and for which I’ve never got a response. Maybe you can try. The standard OLS analysis produces a linear trend and an uncertainty in that trend. In other words, it gives us an estimate of the rate at which we’re warming (of course, our warming isn’t strictly linear, but at least we can understand a linear trend value). What does Doug Keenan’s analysis tell us? In other words, when he presents his analysis, what do we learn about the system via the dataset being analysed?

  126. Also,

    Keenan’s issue was that they didn’t check whether it was valid to assume AR1 when calculating their trendline. He then states that it wasn’t because the dataset fails statistical tests for AR1.

    The trendline isn’t determined using AR(1); we can get the linear trend using basic OLS. As I understand it, AR(1) is only used to determine the uncertainty in the trend because the residuals are correlated. You could use something different, but you’d need a justification for that. My understanding is that the use of AR(1) is based on the timescale over which we would expect the residuals to be correlated.

  127. Willard says:

    > Seen from higher levels the lower level physical processes often behave in ways which can only be effectively described using stochastics, i.e. as if they were truly random.

    Like seasons.

  128. Jaap says:

    >> Okay, here’s a question I’ve asked Doug Keenan and for which I’ve never got a response. Maybe you can try. The standard OLS analysis produces a linear trend and an uncertainty in that trend. In other words, it gives us an estimate of the rate at which we’re warming (of course, our warming isn’t strictly linear, but at least we can understand a linear trend value). What does Doug Keenan’s analysis tell us? In other words, when he presents his analysis, what do we learn about the system via the dataset being analysed?

    We sadly learn nothing. Or not a lot.
    He may imply we learn something when we use ARIMA instead, but we don’t really.

    His position is: AR1 is not good, so trendline is no good.
    And: ARIMA(3,1,0) does fit, but then we have no trendline.

    So he implies that there therefore is no trend, but that is not proven.
    There merely is no significant trend when you assume that ARIMA model.
    But that’s a bit cheesy because it is a class that almost always ‘fit’s’.
    As in a shoe that is way too large for you still fits. But its not a good fit…
    Also as Richard Teller points out the ARIMA(3,1,0) model is physically unlikely.

    So some other model for the distribution must be found. Or somehow ensure that you have enough data points, because a more generic model will require more data to be significant.
    And you can always calculate a trend and say it grows x per year. That is useful in itself, even when it is technically speaking not proven to be significant due to issues with the underlying distribution.

  129. Jaap,

    We sadly learn nothing. Or not a lot.
    He may imply we learn something when we use ARIMA instead, but we don’t really.

    Well, yes, exactly.

    His position is: AR1 is not good, so trendline is no good.

    Well, since the linear trend does not depend on AR(1), this is silly. All that the linear trend analysis tells us is what the linear trend is. AR(1) is used for estimating the uncertainties in that trend. Noone is claiming that there is some kind of linear trend hiding in the data; all that is being illustrated is the linear trend as a simple estimate for the rate at which we’re warming.

    And: ARIMA(3,1,0) does fit, but then we have no trendline.

    Well, yes, and therefore we cannot provide some simple descriptor of the data. The point of data analysis is to provide information that we can then use to further understand what is happening. A data analysis technique that doesn’t provide any useful information isn’t much use.

  130. Jaap says:

    >> As I understand it, AR(1) is only used to determine the uncertainty in the trend because the residuals are correlated.

    Correct. It only comes into play to determine how uncertain the trend line is. Imagine it as not just a line, but a line with diverging boundaries. Using AR1 when that is not valid simply means that those bound are too narrow. But that shouldn’t displace the center trend line IMHO.

  131. Jaap,

    Using AR1 when that is not valid simply means that those bound are too narrow. But that shouldn’t displace the center trend line IMHO.

    Exactly, so if someone wants to argue for a different analysis that produces a different uncertainty interval, that would be fine (assuming they could justify it). However – given the data – it seems unlikely that the uncertainty could be wildly different to what is estimated using AR(1).

  132. Jaap,
    Not sure if you’re still reading this, but here is an article from 2015 that might be worth reading.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.