Understanding Lewis (2013)

After listening to Nicholas Lewis present evidence at the Select Committee hearing this week, I thought I would try to understand his 2013 paper (An Objective Bayesian Improved Approach for Applying Optimal Fingerprint Techniques to Estimate Climate Sensitivity). I read through it and didn’t really get what he was doing (at least not in any detail), so I then downloaded the two papers on which it is largely based (Forest et al. (2002) and Forest, Stone & Sokolov (2006)).

Having read the two Forest papers, I think I understand what they did. They used the MIT two-dimensional climate model. For each model run, they specify the climate sensitivity (S), the ocean diffusivity (Kv), and the net aerosol forcing (Faero). The ocean diffusivity essentially determines the rate of deep-ocean heat uptake. As I understand it, Forest et al. run this model using different values of these three parameters and then compare the model outputs with the observed global-mean surface temperature and with the deep-ocean temperature. In the latter of the two Forest papers, they conclude – using expert priors – that climate sensitivity has a 90% confidence interval of 2.2 to 5.2 K, and that the net aerosol forcing has a 90% confidence interval of -0.62 to -0.05 Wm-2.

As I understand it, Lewis (2013) takes the data from Forest and uses an improved method to reduce the 90% interval for climate sensitivity to 2.0–3.6 K. Lewis (2013) then goes on to say

Incorporating 6 years of unused model simulation data and revising the experimental design to improve diagnostic power reduces the best-fit climate sensitivity. Employing the improved methodology, preferred 90% bounds of 1.2–2.2 K for ECS are then derived (mode and median 1.6 K).

The main results from Lewis (2013) are illustrated in the figure below (taken from Lewis 2013).

So, an improved Bayesian method can reduce the climate sensitivity range a little, but adding 6 years of new data changes it completely. I find that quite remarkable, if not slightly worrying. Equilibrium climate sensitivity is a long-term response to a doubling of CO2. If the value you estimate changes dramatically when you add 6 years’ worth of data, that might suggest your method isn’t robust to short-term variations.

I was also surprised that the climate sensitivity could be as low as 1.2 K. We’ve already had around 0.9 degrees of warming since 1880, and we’re not even close to having doubled CO2. That will take another 50 years or so. If the latest Cowtan & Way (2013) paper is correct, then we’re likely currently warming at 0.1°C per decade. At that rate, we’ll have warmed by 1.3 degrees by 2050/2060. We would almost certainly not be in equilibrium at that time, and so the ECS would seem to have to exceed 1.3 K.
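
The extrapolation above is simple enough to check numerically. Here is a quick Python sketch, using only the round numbers quoted in the post (0.9 K of warming to date, 0.1 °C per decade, roughly four decades to the 2050s):

```python
# Back-of-envelope check of the warming extrapolation in the post.
warming_to_date = 0.9   # K of warming since 1880 (approximate)
rate_per_decade = 0.1   # K per decade (Cowtan & Way 2013 estimate)
decades_ahead = 4       # roughly 2013 to the 2050s

projected = warming_to_date + rate_per_decade * decades_ahead
print(projected)  # 1.3
```

Since the system would still be taking up heat at that point, the equilibrium response to the (roughly doubled) forcing would have to exceed this 1.3 K.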

To try and understand this a little more, I decided to have a look at Otto et al. (2013), who use energy budget constraints to estimate the ECS and TCR. Their results are summarised in the table below.

If you consider the top row only, then their analysis also returns a lower bound for the ECS of 1.2 K. To estimate the ECS you can use

ECS = Q2x ΔT / (ΔQ − H),

where ΔQ is the change in radiative forcing, H is the system heat uptake rate, ΔT is the change in global mean temperature, and Q2x is the change in forcing after a doubling of CO2 (Otto et al. use the value from Forster et al. (2013): Q2x = 3.44 ± 0.84 Wm-2).

So, the only way I can see to get an ECS of 1.2 K is to use extremes. Using the top row of the above table, set ΔT = 0.95 K, ΔQ = 2.53 Wm-2, Q2x = 2.6 Wm-2, and H = 0.37 Wm-2. Using these values I get an ECS of 1.14 K. The immediate problems are that the OHC data suggest that the total system heat uptake rate is probably at least double what I’ve used here. Also, to get such a low ECS I’ve used a radiative forcing today that is almost as big as that due to a doubling of CO2, which doesn’t really make sense. Looking at Forster et al. (2013), which lists model estimates of ΔQ and Q2x, a large ΔQ is associated with a large Q2x, as one might expect. Maybe there’s another way to get an ECS of around 1.2 K using values that make more sense, and if I get a chance I’ll have a go at this; at the moment, though, I can’t see how it is possible.
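
For concreteness, here is the same back-of-envelope energy-budget calculation as a short Python sketch. The numbers are the ones quoted above from the top row of the Otto et al. table, not anything taken from their actual code:

```python
def ecs(delta_T, delta_Q, H, Q2x):
    """Energy-budget estimate: ECS = Q2x * dT / (dQ - H)."""
    return Q2x * delta_T / (delta_Q - H)

# Extreme values from the top row of the Otto et al. table, as used above:
low_end = ecs(delta_T=0.95, delta_Q=2.53, H=0.37, Q2x=2.6)
print(round(low_end, 2))  # 1.14

# Doubling the heat-uptake rate, as the OHC data suggest, raises the estimate:
print(round(ecs(0.95, 2.53, 0.74, 2.6), 2))  # 1.38
```

Note how sensitive the result is to the denominator: because ΔQ − H is small, modest changes in either the forcing or the heat uptake move the ECS estimate substantially.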

So, I realise that this post is a little convoluted. Maybe others have read and understood Lewis (2013) better than I have. At the moment, I don’t see how adding 6 years of data can really change the climate sensitivity as significantly as his analysis suggests, and I don’t see how a climate sensitivity of 1.2 K makes sense. This seems to require some really extreme values of certain parameters. Of course, if anyone has any thoughts or a deeper understanding of this analysis, feel free to explain it in the comments.


35 Responses to Understanding Lewis (2013)

  1. The extra six years Lewis adds are 1996-2001, which includes the extremely warm 1998. I would have expected the addition of extra data to narrow the uncertainty, but not greatly change the estimate.

    Lewis likes to proclaim that his method uses objective Bayesian statistics. But of course the paper depends on subjective choices. For example, he chooses to use a simple 2D (zonally averaged) climate model and not to rerun the model to include the very warm decade 2000-2010. Very convenient choices that were computationally necessary a decade ago. But now? Perhaps Lewis is rather too fond of his low sensitivity result.
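
That expectation – extra consistent data narrows an estimate rather than moving it – is easy to illustrate with a toy example. The Python sketch below is purely hypothetical (a Gaussian mean estimate, nothing to do with the actual Forest/Lewis diagnostics): extra data drawn from the same process tightens the interval, while only data in tension with the earlier record shifts the central estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

base = rng.normal(3.0, 1.0, 60)        # "original" record, true mean 3
extra_same = rng.normal(3.0, 1.0, 12)  # extra years from the same process
extra_diff = rng.normal(1.5, 1.0, 12)  # extra years from a different process

def mean_and_halfwidth(x):
    """Mean and 90% normal-approximation half-width for the mean."""
    se = x.std(ddof=1) / np.sqrt(len(x))
    return x.mean(), 1.645 * se

print(mean_and_halfwidth(base))
print(mean_and_halfwidth(np.concatenate([base, extra_same])))  # similar mean, narrower
print(mean_and_halfwidth(np.concatenate([base, extra_diff])))  # mean pulled down
```

So a large shift from six extra years would suggest those years are statistically in tension with the rest of the record, which is the puzzle.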

  2. Richard,

    I would have expected the addition of extra data to narrow the uncertainty, but not greatly change the estimate.

    Yes, that’s what I would have expected. Instead it appears to have produced a result that barely overlaps with what one gets if those 6 years are not included. Very strange and a result that would normally lead one to try and understand why it produces such a large change.

    Something I was going to add to the post, but didn’t, was the issue of the data. I believe that Lewis was critical of Forest for not providing the data quickly enough. If the 2D code is available, then it’s not obvious that Forest was obliged to do so. It might be decent to provide the data, but if Lewis could simply have rerun the code to produce his own data, then it wasn’t necessary. Also, Lewis might have learned something from doing so.

    Perhaps Lewis is rather too fond of his low sensitivity result.

    Yes, that’s certainly one impression I get from reading his submission to the select committee.

  3. Joshua says:

    Not sure if looking over this comment and the ensuing discussion will be of interest, but it might…


    I’ve seen Pekka make similar critiques of Lewis’ use of “objective” priors before, and while I don’t understand any of this very well, from what I can parse it seems to me like an interesting critique.

  4. Joshua says:

    I believe that willard can link to some interesting discussions between Lewis and Annan about Lewis’ methodology – which if I recall correctly, Lewis left as a bit of a cliff-hanger.

  5. > I believe that willard can link […]

    Perhaps you are referring to this, Joshua:


    In that case, yes, I can. In fact, I think I already did. AndThen asked me a question at the time about objective Bayesianism. I did not respond, but found an interesting paper, to which I could link too. Need to find it back, though.

  6. Joshua,
    Thanks for the link to Pekka’s comment. It’s something I had wondered myself. I thought one of the points of Bayesian analysis was that you use informed priors. So, you could eliminate regions of parameter space that are unphysical.

    Yes, I’ve read the discussion between James Annan and Nic Lewis. It was rather left hanging. I’ve been trying to find the link or Willard’s previous comment. They’re both escaping me at the moment.
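
On the prior question, here is a small, entirely hypothetical Python sketch of how the choice of prior moves a Bayesian interval. The likelihood below is made up (a skewed bump near 2.5 K), not the actual Forest/Lewis likelihood; the point is only that an "objective" flat prior and an informed expert prior generally give different 90% ranges from the same data.

```python
import numpy as np

S = np.linspace(0.5, 10.0, 2000)  # climate sensitivity grid (K)
dS = S[1] - S[0]

# Made-up skewed likelihood peaking near S = 2.5 K (illustrative only):
like = np.exp(-0.5 * ((np.log(S) - np.log(2.5)) / 0.3) ** 2)

def credible_interval(prior):
    """Grid-based 90% credible interval for S under the given prior."""
    post = like * prior
    post /= post.sum() * dS          # normalise on the grid
    cdf = np.cumsum(post) * dS
    return S[np.searchsorted(cdf, 0.05)], S[np.searchsorted(cdf, 0.95)]

flat = np.ones_like(S)                          # "objective" uniform prior
expert = np.exp(-0.5 * ((S - 3.0) / 1.5) ** 2)  # informed prior centred on 3 K

print(credible_interval(flat))
print(credible_interval(expert))
```

An informed prior also lets you downweight or exclude regions of parameter space that are unphysical, which a flat prior on a wide range cannot do.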

  7. Willard beat me to it 🙂

  8. In the ensuing discussion, Bart R points to this:


    PS: Like NG, Bart R belongs in my fantasy draft. I don’t mind much that he’s a minarchist, because I like his style. My team needs four right-wingers anyway.

  9. Willard,
    Very interesting, and a little odd. So, when Libardoni & Forest use the method in Lewis (2013), they get results similar to what they get with their method (the modes are more pronounced and the distributions slightly narrower). I can’t see any that are consistent with a 1.2–2.2 K 90% range though.

  10. Interesting that in Libardoni & Forest (2013) the HadCRUT3 TCR (estimated using Lewis’s method) is very close to having a 90% range that is 1.2 to 2.2 K. Coincidence?

  11. BBD says:

    Amazing what you can do with “objectivity”, isn’t it, ATTP?

    I have a real problem with NL.

  12. BBD says:

    IIRC Karsten said some time ago there were whispers of problems replicating some of NL’s results, but I cannot recall if he was referring to L13 specifically. I also recall Karsten saying that NL was profoundly convinced that his findings were correct, which is also noteworthy in any consideration of scientific objectivity and polemics.

  13. I thought NL had managed to get a low value for CS by using extreme values for the aerosol forcing. So, not just-another-6-years but rather a different value entirely?

  14. BBD says:

    Paging Karsten…

  15. William,
    I wondered the same, but if I understand what he’s done, he’s simply taken all the model runs done by Forest (each of which has a CS, an aerosol forcing and an ocean diffusivity) and then done the analysis slightly differently. He seems to match Forest et al. when the same time period is considered but then gets a completely different range when adding the next 6 years. I’m just very confused as to how that modest change can completely change the CS range. It is true, though, that for his extended analysis the aerosol forcing that he estimates is lower than that in Forest et al. (2006).

  16. BBD says:

    I’m confused too. How can any informative estimate of TCR, never mind ECS, be so sensitive to 6 years of data? To me it seems that it can’t, but presumably I’m mistaken on this point.

  17. BBD,
    In the discussion, the paper says

    We resolve these issues by employing only surface and deep-ocean diagnostics, revising these to use longer diagnostic periods, taking advantage of previously unused post-1995 model simulation data and correctly matching model simulation and observational data periods (mismatched by 9 months in the F06 surface diagnostic). Using the revised diagnostics, estimates of Seq are lower and more tightly constrained, with a 1.1–2.9-K range obtained using the F06 method and 1.2–2.2 K using the new method. Switching from the original HadCRUT observational surface temperature dataset to the updated HadCRUT4 may have contributed significantly to this reduction: Ring et al. (2012) reported that doing so caused a 0.5-K reduction in their Seq estimate.

    But, really, changing from HadCRUT to HadCRUT4 makes a significant difference? In the Libardoni & Forest (2013) paper that Willard linked to above, HadCRUT3 has a 90% ECS range of 2–5.5 K.

  18. BBD says:

    What to say? I’m as baffled as wot you is. Or perhaps sceptical might be a better word…

  19. I can only see the abstract, not the paper. But it would be pretty odd to use the same aerosol forcing data in 2013 as in 2002 or 2006, no?

  20. Oh, yeah. Maybe http://scienceblogs.com/stoat/2012/12/20/people-if-you-want-to-argue-with-stoats-first-read-enough-to-be-a-weasel-parrots-neednt-apply/#comment-24749 ? Point (4). That’s not me, but RD is sane. I don’t think that’s exactly the same paper, but it would be odd if it wasn’t close.

  21. William,
    I may be wrong, but I think that what Rob Dekker is talking about are the calculations that ended up in Otto et al. (2013), not those used in this paper. As I understand it, in this paper 2D MIT climate models are run, each with a choice of CS, aerosol forcing, and ocean diffusivity. The model outputs are then compared with global mean surface temperatures until 1995 (Forest 2006) and deep ocean temperatures. So, you then get a PDF for CS, aerosol forcing and ocean diffusivity. Lewis (2013) then repeated this using – I think – the same model runs as Forest et al. (2006) and then extended the period to 2001, and somehow the ECS range dropped from 2.0–3.6 K to 1.2–2.2 K.

    RD does indeed seem sane and what he says makes sense. The odd thing is that Lewis is an author on the Otto et al. paper, which produces the CS ranges in the table I include in the post. Quite how he can push a CS range of 1.2–2.2 K (with a mean/median of 1.6 K) is beyond me, when the Otto et al. calculations seem to show that the range is much more likely to be 1.2–4 K with a mean of at least 2, and RD’s caveats also hold (i.e., using GISTEMP would increase the likely CS).

  22. Paul S says:

    According to a write-up of the Forest method, they set up the aerosol forcing by reference to an estimate for the 1986 atmospheric distribution of anthropogenic sulphate aerosols. They assume this basic spatial pattern holds proportionally throughout the model run up to 1995 and the spatial forcing is scaled accordingly. However, the reality has been that the spatial distribution of anthropogenic sulphate emissions has shifted massively since 1986, from the extratropics to the subtropics and tropics.

    I don’t have access to the Lewis paper but if he uses the same method for aerosol setup one issue is that the model spatial aerosol forcing will be increasingly divergent from reality by going beyond 1995, although I’m not sure just six years more could make a huge difference.

    Regarding your comparison with the Otto et al. method, I’m not sure the Forest results fit either. For aerosol forcing the upper end of the 90% range in Libardoni and Forest 2011 is -0.83W/m2 which is close to the best estimate aerosol forcing used for Otto et al. 2013, yet the corresponding sensitivity in L&F2011 is 5.3ºC. It’s possible the forcing setup isn’t immediately comparable but there does seem to be some non-linearity between global net forcing and sensitivity in these results.

  23. Paul S,
    Thanks. Yes, I’m finding the aerosol issue a little confusing. Libardoni & Forest seem to have a range from -0.9 to -0.1 Wm-2. Otto seems to use a range of -1.44 to -0.71 Wm-2, so not much of an overlap there.

    @BBD: Sorry for being late ;). I was indeed referring to L13 re reproducibility, but cautioned against taking it too seriously. I don’t know any more details as of now … so you might count the Libardoni/Forest paper as some evidence for it. I didn’t have time right now to go through the details of the papers in question again (particularly re aerosol forcing), but I concur with Paul S in that I find the low aerosol forcing in Libardoni and Forest a bit odd (IIRC there was a similar issue in one of their earlier papers). Therefore I remain a bit skeptical about the method in general and NL’s attempts in particular. His low-sensitivity bias is hardly reconcilable with unconditional objectivity … which is only my personal opinion, to be sure. Perhaps I’ll add something later …

    As an aside as James’ Sensitivity thread got mentioned: In case anyone ever reads it again, please note that I confused a few things re OHC in one of my comments: [julesandjames.blogspot.co.uk]
    Just wanted to set the record straight re my own mistakes, which sometimes happen.

  25. Karsten,
    Thanks. Since you’re here, maybe you could clarify some things with regard to aerosols. I noticed that Paul S made a similar comment to the one he made here over at Stoat’s (I guess Paul could clarify too if reading this). Is it possible that even the IPCC aerosol estimates are too low, or have I misunderstood what Paul S was suggesting? Also, do you have any insight into Nic Lewis’s aerosol comments at the select committee hearing? He seemed to be implying that the climate models were getting it wrong and that the aerosol forcings they were using were too high. Is that what he’s suggesting, and is he right?

  26. Paul S says:

    Not that you should care too much about my view on the AR5 authors’ expert judgement but the final best estimate of -0.9W/m2 seems reasonable based on the evidence available so far. I think it’s on the low side of reasonable, in that -0.8 would be a stretch but don’t think anybody would have raised an eyebrow at -1.0 or -1.1.

    My comment at Stoat’s was specifically about an accounting error in the Second order draft which did make their best estimate figure of -0.9W/m2 too low (less negative) with regard to the logic of their estimation method. In the final version they altered their method to a strict expert judgement decision. Even though the best estimate remained the same in the end it was presumably made with full knowledge of what the satellite obs. data really indicated so there is no longer a clear error. The final judgement is rather opaque, which makes it hard to argue against on specifics, but if you were to look at the full range of evidence on your own I think it would be difficult to argue that their choice was wrong.

    It is worth noting that the First Order Draft actually didn’t give a best estimate at all for forcing due to aerosol-cloud interaction, only a 90% range, which suggests at least some aerosol authors don’t consider there to be enough evidence for anything as specific as a best estimate.

    The other point I made was about a longwave adjustment factor added to the satellite estimates. I do find this quite easy to argue against because nothing in the chapter indicated that even the sign of this factor is known, yet they make a definitive judgement which alters the estimates by about 20% on average. But again, we don’t really know whether this adjustment affected the expert judgement at all.

    Hi Karsten

    I don’t suppose you’ve heard anything about where the longwave adjustment came from?

  27. Paul S says:

    I haven’t heard Nic Lewis’ testimony (is there a transcript anywhere?) so don’t know the context for any statements he made about aerosol forcing being too large. Was he just talking about the change from AR4 to AR5?

    I know when the AR5 SOD was leaked he decided that the erroneously low satellite average quoted (-0.7W/m2) should be regarded as the correct figure to use, even being resistant to making some basic checks of the original papers when the errors were pointed out to him and direct links to those papers provided. He also produced some convoluted arguments for why he could throw out inverse estimates which produced larger values.

  28. Paul S,
    Thanks. The transcript isn’t up yet. At least, I couldn’t find it earlier. I think, though, that he was simply talking about the change from AR4 to AR5.

    @anders: Guess Paul S answered the point re the AR5 aerosol forcing estimate already. Nothing much to add. Otto et al. was indeed designed to demonstrate that the lowered aerosol forcing is still consistent with their ECS range. It’s also true that their initial satellite estimate averaging wasn’t very well thought out. The aerosol LW forcing assumption continues to puzzle me slightly (see quick reply to Paul below).

    I also haven’t heard Nic Lewis’ testimony, so can’t comment on the details. But probably the same old story that CMIP5 aerosol forcing is too strong compared to “observations”. Given that many models still don’t include secondary effects (e.g. the aerosol indirect effect), it’s barely backed up by the available evidence. Had a short encounter with him about the AR5 forcing change over the 1950-2011 and 1980-2011 periods at a rather dull venue. Turned out I was right on that occasion: [climateaudit.org]
    Interesting side note: Shortly after commenting there I received my first ever hate mail (two to be precise). Spambot style only, yet pretty nasty. So be prepared when engaging there 😉

    @Paul S: Re the aerosol LW forcing, I never heard back from Nicolas Bellouin. Guess I have to give it another go. Asked Ken Carslaw back in November and he wasn’t quite sure what’s been done in AR5 either. Speaking of Ken, his recent Nature paper made it abundantly clear that banging on about pinning down the 1750-2011 total anthropogenic aerosol forcing estimate is pretty pointless, as we won’t know the preindustrial reference state with good precision any time soon. Ironically, if we focus on the last 30 or 60 years (see linked discussion above with NL), without a stronger reduction in total aerosol forcing past 1980 than proposed in AR5, we end up with a TCR well above 1.4 K. And there is nothing much NL can do about it … because AR5 simply says so.

  30. Pingback: Energy budget constraints | And Then There's Physics

  31. Pingback: Lewis en Crok over klimaatgevoeligheid – meer gevoel voor p.r. dan voor wetenschap | Klimaatverandering

  32. Pingback: Sensitivities and things | And Then There's Physics

  33. Pingback: GWPF optimism on climate sensitivity is ill-founded

  34. OPatrick says:

    Nic Lewis’s analysis of the Shindell paper at Climate Audit is interesting. I don’t mean scientifically interesting – although it may well be, I just don’t know enough of the details of the case or the science to judge – but rather in the tone he has taken. I think many people have been trying to reserve judgement about Lewis, although the people he associates with make this difficult, but if even a small fraction of the criticisms he makes of the Shindell paper are not fully justified it will be difficult to see how he could be acting in good faith. He appears to know exactly which buttons to press to appeal to the auditorium and is bashing away at them for all he’s worth. I’d be interested to hear what anyone better informed than me makes of this.

  35. Pingback: NASA study fixes error in low contrarian climate sensitivity estimates | Dana Nuccitelli |
