## Nic Lewis’s prior beliefs

Nic Lewis has apparently had a paper published in the Journal of Climate, called “Objective Inference for Climate Parameters: Bayesian, Transformation of Variables and Profile Likelihood Approaches”. Nic discusses his new paper in a Climate Audit blogpost called “Paper justifying AR4’s use of a uniform prior for estimating climate sensitivity shown to be faulty”. The paper that Nic is referring to is Frame et al. (2005), which attempted to use prior assumptions to constrain climate forecasts.

The basic issue seems to be the following. Frame et al. (2005) wanted to determine the likely range for the Equilibrium Climate Sensitivity, $S$. To do this they started with a basic energy balance model (Hansen et al. (1985), Andrews & Allen (2008)) in which each model has a specified sensitivity ($S$) and ocean diffusivity ($\kappa_v$ – which determines the rate at which energy is transferred from the well-mixed ocean layer to the deep ocean). One can then run a whole suite of models which, after integrating over $\kappa_v$, can then be used to determine the likelihood of a particular observation (surface temperature, ocean heat content) given a particular sensitivity $S$. By combining this with the probability of that observed quantity and a prior assumption about the sensitivity, one can then determine the probability density function (PDF) of $S$, given the known observations (I’m not an expert at Bayesian inference, so may not have explained that properly).
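In symbols, the calculation described above might be sketched as follows. The notation is mine and is only an assumption about how the pieces fit together, not something taken from Frame et al. (2005): $y$ stands for the observations, $p(S)$ for the prior on sensitivity, and $p(\kappa_v \mid S)$ for whatever weighting is used when integrating over the diffusivity.

$$
p(S \mid y) = \frac{p(S)\, L(y \mid S)}{p(y)}, \qquad
L(y \mid S) = \int p(y \mid S, \kappa_v)\, p(\kappa_v \mid S)\, \mathrm{d}\kappa_v, \qquad
p(y) = \int p(S)\, L(y \mid S)\, \mathrm{d}S .
$$

Since $p(y)$ does not depend on $S$, it only normalises the result, so the shape of the posterior is set by the product of the prior and the likelihood.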

In the method of Frame et al. (2005), the prior for $S$ was assumed to be uniform in the temperature range from 0 K to 20 K. This means that their prior assumption is – for example – that an equilibrium sensitivity between 2 and 3 K is just as likely as one between 14 and 15 K. If you consider their basic result (shown in the figure below) the PDF for $S$ has a long tail extending to high values and their 5-95% range is 1.2 – 11.8 K (they’ve just published a correction suggesting that it should actually be 1.2 – 14.5 K).

Probability density functions for climate sensitivity from Frame et al. (2005).
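The long tail can be reproduced with a toy calculation. The sketch below is not the Frame et al. (2005) model: it assumes a made-up observable that scales as $1/S$, measured with Gaussian noise, and the names `likelihood`, `credible_interval`, `y_obs` and `sigma` are purely illustrative. Because $1/S$ barely changes once $S$ is large, the data cannot tell high sensitivities apart, and with a uniform prior the upper bound ends up fairly close to wherever the prior is cut off.

```python
import numpy as np

def likelihood(S, y_obs=1.0 / 3.0, sigma=0.15):
    """Toy Gaussian likelihood for a hypothetical observable that scales as 1/S."""
    return np.exp(-0.5 * ((y_obs - 1.0 / S) / sigma) ** 2)

def credible_interval(S, density, lo=0.05, hi=0.95):
    """Return the (lo, hi) quantiles of an unnormalised density on an even grid."""
    cdf = np.cumsum(density)
    cdf /= cdf[-1]
    i_lo, i_hi = np.searchsorted(cdf, [lo, hi])
    return S[i_lo], S[i_hi]

# Uniform prior on (0, 20] K, as in the post; the grid starts just above zero.
S = np.linspace(0.01, 20.0, 20000)
prior = np.ones_like(S)
posterior = prior * likelihood(S)   # Bayes' theorem up to a constant factor

low, high = credible_interval(S, posterior)
print(f"5-95% range with a uniform prior: {low:.1f} to {high:.1f} K")
```

Changing the upper limit of the grid (say from 20 K to 10 K) moves the upper bound with it, which is one way of seeing that this bound is coming from the prior rather than from the data.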

Nic Lewis’s issue with Frame et al. (2005) is that the use of a uniform prior for climate sensitivity – extending to 20 K – is wrong, and I have to say that I probably agree. It seems highly unlikely that the equilibrium climate sensitivity will be above 10 K, or even close. Instead, Lewis argues for using what is called a Jeffreys prior, which is apparently “uninformative” (although I’m not quite sure what this means). Consequently his prior (shown in the figure below) has a decreasing probability with increasing $S$ and $\kappa_v$. An obvious problem with this is that it seems to assume that the most likely climate sensitivity is 0 K, and there is no physical justification for such a low value.

Jeffreys prior for $S$ from Lewis (2014).
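Why would an “uninformative” prior fall off with $S$? A minimal sketch, under assumptions of my own rather than Lewis’s actual calculation: for a single parameter and an observable $\mu(S)$ measured with fixed Gaussian noise $\sigma$, the Fisher information is $(\mathrm{d}\mu/\mathrm{d}S)^2/\sigma^2$, so the Jeffreys prior is proportional to $|\mathrm{d}\mu/\mathrm{d}S|$. If the hypothetical observable scales as $1/S$, that gives a prior proportional to $1/S^2$. Lewis (2014) works with a joint prior in $S$ and $\kappa_v$ derived from the full model, which this toy does not reproduce.

```python
import numpy as np

def mu(S):
    """Hypothetical observable: the expected measurement scales as 1/S."""
    return 1.0 / S

def jeffreys_prior(S, sigma=0.15):
    """Unnormalised Jeffreys prior for y ~ Normal(mu(S), sigma**2).

    With fixed sigma the Fisher information is (dmu/dS)**2 / sigma**2,
    and the Jeffreys prior is its square root.
    """
    dmu_dS = np.gradient(mu(S), S)   # numerical derivative on the grid
    return np.abs(dmu_dS) / sigma

# Start the grid above zero: 1/S**2 cannot be normalised all the way down to S = 0.
S = np.linspace(0.5, 20.0, 4000)
prior = jeffreys_prior(S)
prior /= prior.sum() * (S[1] - S[0])   # normalise over the grid

print("prior density at S = 0.5, 3, 10:", np.interp([0.5, 3.0, 10.0], S, prior))
```

In this toy the prior depends entirely on how the observable responds to $S$, so a different choice of `mu` would give a different “objective” prior.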

Lewis therefore uses the same basic method as Frame et al. (2005) but uses a Jeffreys prior, rather than a uniform prior. The basic result is in the figure below. The different curves include the Frame et al. (2005) result and different methods used by Lewis. I’ve got a little confused by the different curves, but I think the dark blue is the original Frame et al. (2005) result, and the basic bottom line is that the 5 – 95% range is now 1.2 – 4.5 K (rather than 1.2 – 14.5 K).

PDFs for climate sensitivity from Lewis (2014).

So, my basic thoughts on this are that a uniform prior producing a 5 – 95% range of 1.2 – 14.5 K seems wrong. Lewis’s use of a Jeffreys prior seems to produce a result that is more consistent with the standard IPCC range (1.5 or 2 – 4.5 K), so I’m not sure why he seems to imply that the Frame et al. (2005) work somehow reflects on the IPCC. I’m also not sure why he felt the need to refer to a 10-year-old paper as being faulty, rather than being more constructive (although after the Hawkins vs Mora issue, maybe this is more common in climate science than I’m used to). There are a couple of other issues – in my opinion – with Lewis’s work. The energy balance models are fairly simple. As Andrews & Allen (2008) say:

> It is important not to over-interpret this model: in particular, the assumptions that λ and C are time and forcing-invariant are over-restrictive.

and they’re unable to incorporate possible inhomogeneities in the aerosol forcings (e.g., Shindell et al. (2014)). Most papers that use these models mention these simplifications, but I couldn’t find any mention of this in Nic Lewis’s paper.

Furthermore, the choice of a Jeffreys prior would seem to make a low climate sensitivity much more likely than one might expect. As I understand it, James Annan has already suggested that Lewis is wrong to argue that a Jeffreys prior is the most appropriate prior to use. Additionally, Annan & Hargreaves (2009) appear to have already addressed the issues of using a uniform prior, as done by Frame et al. (2005). In fact, I found their paper much more physically motivated than Nic Lewis's paper, which seems to be largely a statistical argument. The figure below is probably the key one from Annan & Hargreaves, which shows the PDF for climate sensitivity using a uniform prior and two others, both of which assume that high and low sensitivities are unlikely (which is what we would really expect). The non-uniform priors seem to produce 5 – 95% ranges of about 1.5 – 4.5 K, consistent with the IPCC and also, broadly, with Lewis's latest paper.
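The effect of the prior on the upper bound can be illustrated with a toy model. This is a sketch under stated assumptions, not a reproduction of Annan & Hargreaves (2009): the observable scaling as $1/S$, the noise level, and the log-normal “expert-like” prior centred on 3 K are all hypothetical choices made only for illustration.

```python
import numpy as np

def likelihood(S, y_obs=1.0 / 3.0, sigma=0.15):
    """Toy Gaussian likelihood for a hypothetical observable that scales as 1/S."""
    return np.exp(-0.5 * ((y_obs - 1.0 / S) / sigma) ** 2)

def credible_interval(S, density, lo=0.05, hi=0.95):
    """Return the (lo, hi) quantiles of an unnormalised density on an even grid."""
    cdf = np.cumsum(density)
    cdf /= cdf[-1]
    return S[np.searchsorted(cdf, [lo, hi])]

S = np.linspace(0.01, 20.0, 20000)

# Two illustrative priors: flat, and one that treats very low and very high S as unlikely.
priors = {
    "uniform on (0, 20] K": np.ones_like(S),
    "log-normal centred on 3 K": np.exp(-0.5 * (np.log(S / 3.0) / 0.5) ** 2) / S,
}

ranges = {name: credible_interval(S, prior * likelihood(S))
          for name, prior in priors.items()}
for name, (low, high) in ranges.items():
    print(f"{name}: 5-95% range {low:.1f} to {high:.1f} K")
```

In this toy the lower bound hardly moves between the two priors, because the data rule out low values on their own, whereas the upper bound is set largely by how quickly the prior falls off at high $S$.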

Anyway, this post has got rather long and a bit convoluted. I don’t have any huge issues with Nic Lewis’s latest work other than I don’t see why the Jeffreys prior is somehow optimal, and I’m not sure why he has to imply something with respect to the IPCC when it seems as though his results are broadly consistent with theirs. I’m also not sure why he’s chosen to ignore Annan & Hargreaves (2009) – who he appears to not even cite – when they seem to have pointed out the issue with Frame et al. (2005) about 5 years ago. As usual, if anyone has any thoughts or would like to correct anything I’ve said here (which, given that I’m still trying to get my head around Bayesian statistics, may well be necessary) feel free to do so through the comments.


### 128 Responses to Nic Lewis’s prior beliefs

1. AnOilMan says:

I think pseudo skeptics like statistical methods because it’s harder to see how they relate to reality. Like Richard Tol’s mythical 300 papers. (Note: I’m still willing to pay $1000 to the person who can find them.)

2. Arthur Smith says:

As I understand Bayesian analysis, if you have enough data, your prior doesn’t matter (unless for some reason your prior is zero or infinitesimal in regions the data don’t constrain), because the data themselves force convergence on the real probability distribution no matter what prior distribution is chosen. So arguments about which prior is the correct one are essentially proxy arguments for the reality that we don’t have enough data to be sure what the real probability distribution is. The only solution is more data, and a bit of humility in arguing about it. Bayesianism is a recognition that statistical analysis is inherently subjective – your conclusions necessarily depend on whatever biases you come in with, and only when the data gradually forces different people’s biased views to converge can you be pretty certain you have it right.

3. Arthur, Yes, that certainly makes sense. There was a comment by James Annan on his post that basically seemed to be saying that many use uniform priors because they’re quite easy to use and are often not badly wrong. That would seem to be consistent with your suggestion that if you have enough data, then as long as your prior isn’t a stupid choice, you’re going to get a sensible answer.

4. Oh, and I absolutely agree with this:

> The only solution is more data, and a bit of humility in arguing about it.

5. AoM, I think there are certainly those who like to make statistical arguments because they sound complicated and meaningful and hide a lot of possible issues. That’s why I included a discussion of the models used. They’re basically fairly simple and can’t capture all the complications.
It doesn’t matter how complicated your statistical model is, if your physical model is simple, your result will not necessarily be a truly accurate representation of reality.

6. Hoi Polloi says:

Nic Lewis mentions A&H2009 and relevant blogpost in a PS in his post on CA (advised by Steve McIntyre). AnOilMan, do you have any specific criticisms or are you just shooting from the hip?

7. Hoi, I think that was added after it was pointed out, and I think he very strongly argues against their priors (and I think Nic Lewis is wrong to do so). Unless I’m mistaken, his paper does not cite their work.

8. And then there’s new Paleoclimate evidence that ESS is probably 6 C for each doubling of CO2 per Hansen et al… http://droyer.web.wesleyan.edu/Franks_et_al_2014_GRL_new_stomatal-CO2_proxy.pdf

9. robert, Thanks, I’ll have a look at that. Of course, the models used by Frame et al. (2005) and by Lewis (2014) only really include fast-feedbacks and so cannot determine the climate sensitivity (ESS) with both fast- and slow-feedbacks.

10. “The only solution is more data, and a bit of humility in arguing about it.”

If it is possible, I would also argue to make so many measurements that you do not need statistics. However, this is not always possible. In climatology the amount of data only increases by one year every year. In such cases you have to make the best out of what you have got. Thus A&H2009 already presented the main result that a uniform prior can lead to artificially thick warm tails. Do I understand the above post right, that that was supposed to be the main contribution of Lewis? Then I start to wonder what the value of the new Lewis paper is? That looks like a major error by Lewis and the reviewers. To me it sounds especially big, because even I had heard of this problem with uniform priors and I have no expertise whatsoever in estimating climate sensitivity nor in Bayesian statistics.
Maybe it was a coincidence, that I picked this up at some conference, but otherwise quite an error.

11. Eli Rabett says:

Of course, Annan pretty much tore Frame and Allen to shreds in 2006, but it is important to note that the issue was not the uniform prior, but the unphysical range they used. Lewis falls right into the same hole. The prior is a subjective belief. If it is crazy, so will the results be unless you have enough data that you really don’t need a prior.

12. Lewis has a subjective belief that objective priors are better. His disastrous foray into radiocarbon dating, where the posteriors obtained using his objective prior were simply nonsense (as James Annan pointed out), rather damaged his credibility.

13. Captain Flashheart says:

The Jeffreys prior is not uninformative – quite the opposite. A uniform distribution would be the classic example of an “uninformative” prior, but the idea of “uninformative” is dubious to start with. A prior should inform your model with data from outside the model, or physical knowledge. In this case, why not construct a prior distribution close to the range of sensitivities in a good climate model? Surely real physicists all agree that sensitivity near 0 is impossible?

14. [Mod: Be nice, please] “researchers” such as Nic Lewis don’t know physics. The ocean diffusivity and its thermal response is a temporally fat-tailed effect, as described by Hansen long ago. The fat-tail contains all the uncertainty in the prior that Lewis is hallucinating about. With a fat-tailed thermal response one gets a quick uptake and a longer time-scale uptake, leading to a range of effective ECS values depending on how long one wants to wait to reach a quasi-equilibrium. Everything is buried in that fact: a steep damped exponential prior won’t work, that’s for certain as it will completely miss the fat-tail.
As AT says, “if your physical model is simple, your result will not necessarily be a truly accurate representation of reality.” Lewis has the wrong basic physical model. Isaac Held is one of the scientists who knows that first-order thermal diffusion doesn’t work but explains that people use it because it is convenient to do the math. But convenience is no excuse for getting the physics wrong. The first-order needs to be replaced with fat-tail response while people like Lewis need to be scolded or scorned or both. The correct way to approach this analysis IMO is to compare land and ocean warming and estimate a path to an ECS that will also reconcile the OHC data and the AGW forcing function. http://contextearth.com/2014/01/25/what-missing-heat/ The land warming provides a good idea what an eventual ECS will look like because the thermal mass is not there to add to the long-range uncertainty.

15. Fred Moolten says:

The paper by Lewis doesn’t refer to “equilibrium climate sensitivity” (ECS), but merely “climate sensitivity”. Unwary readers might infer that he was estimating ECS, but he was actually estimating “effective climate sensitivity”, using values for ocean heat uptake and negative aerosol forcing that are likely to prove too low, and disregarding the evidence for a time-varying feedback parameter that also leads effective climate sensitivity to underestimate ECS. He is probably correct that earlier uses of uniform priors for ECS that included the possibility of very high values were misguided. The use of the Jeffreys prior is questionable (see, for example, recent discussions in the Climate Dialogue website), and it is hard to justify any prior that allows for climate sensitivity values of zero or very close to it.

16. Tom Curtis says:

Anders:

1) The IPCC finds a likely (66-100%) range of ECS of 1.5-4.5 C; that it is extremely unlikely (<1%) to be less than 1 C, and that it is very unlikely (0-10%) to be more than 6 C.
They do not give a 5-95% range (or a 2.5-97.5% range), nor a median or mean value. It is obvious, therefore, that Nic Lewis's estimate of 1.2-4.5 C (5-95%) range is much more tightly bound than the IPCC range, in addition to having a lower mean and median. If he were correct, it would be good news.

2) James Annan explains the nature of "objective" or "uninformative" priors in a recent blog post on Nic Lewis’ analysis of carbon dating (on which also see this):

> “All this stuff about “objective priors” is just rhetoric – the term simply does not mean what a lay-person might expect (including a climate scientist not well-versed in statistical methodology). The posterior P(S|O) is equal to the (normalised) product of prior and likelihood – it makes no more sense to speak of a prior not influencing the posterior, as it does to talk of the width of a rectangle not influencing its area (= width x height). Attempts to get round this by then footnoting a vaguer “minimal effect, relative to the data” are just shifting the pea around under the thimble.”

So, in a nutshell, “uninformative” priors are pure advertorial in a naming convention. They provide a base prior without epistemic commitment in the same manner as a uniform prior, and are probably superior to uniform priors in most applications, but can be disastrously wrong in other applications. They are not superior to expert priors in general, and in some cases are clearly inferior. (Pekka Pirila elucidates at greater length here.)

3) Arthur Smith suggests that in Bayesian analysis, given non-zero priors, the posterior probability will eventually match the observations. It is better to say that they will approach the limit of matching observations given infinite observations. We are clearly nowhere near having sufficient observations of climate sensitivity for that yet to be the case. If we were, Lewis’s paper would have found the same result with or without the Jeffreys prior, which is clearly not the case.
So, while technically true, Smith’s observation is irrelevant to the issue raised by Lewis.

17. Tom Curtis says:

Captain Flashheart, a uniform distribution is an uninformed prior, ie, a prior that reflects no prior knowledge about the probability distribution in question. As it happens, a Jeffreys prior is also an uninformed prior in this manner. In contrast, intuitively, an uninformative prior would be a prior which provides no information about the posterior probability. In fact, such a prior is impossible in Bayesian probability. See the Annan quote in my preceding post.

18. Tom Curtis says:

Fred Moolten, Lewis is definitely discussing ECS. If he were discussing the effective climate sensitivity, he would not need to include the ocean heat diffusivity as a relevant factor.

19. Fred Moolten says:

Tom – I believe he used ocean heat diffusivity as a path to estimating ocean heat uptake. He was not estimating ECS although he used some model estimates for some of the data he employed to arrive at values for OHC uptake and net forcing – the values he needed along with temperature change to arrive at his conclusions. Nic Lewis has always focused on effective climate sensitivity, but sometimes it’s easy to interpret him as discussing ECS. Whether this confusion is intentional is not something I’ll try to judge here.

20. Tom Curtis says:

Fred, referring back to Otto et al (2013), Equilibrium Climate Sensitivity = the change in temperature times the forcing of a doubling in CO2/(Change in forcing minus change in heat flux), where the heat flux is essentially heat flux into the oceans. In contrast, effective climate sensitivity equals the change in temperature times the forcing of a doubling in CO2/ change in forcing. The ocean heat flux does not enter into the calculation, and effective climate sensitivity can be estimated from temperature and forcing data alone. That is the reason it was defined in the first place.
Consequently, while I agree with you that ocean heat diffusivity is being used as a proxy for ocean heat flux in Lewis’s latest paper, I still maintain he is estimating ECS. If he is not, he has included one variable too many. Further, as in Otto et al (of which Lewis is a coauthor), Lewis has on other occasions clearly focused on ECS. I am not claiming that he has always done so, nor that he has properly distinguished between the two. But on this occasion, as in Otto et al, he clearly focuses on ECS.

21. Captain Flashheart says:

Tom Curtis, I agree with your points. I don’t believe a uniform distribution is ever “uninformative” or that it contains no information about the probability distribution in question. The assumption that all values are equally likely is a very strong and restrictive assumption, as is the assumption that the distribution is symmetric about its mean. Such strong assumptions are not (to use your phrase) “without epistemic commitment,” it’s just that most human beings give a special epistemic value to concepts like uniformity and symmetricity, to the extent that they don’t realize they are even making these assumptions. After all, if we wanted a genuinely uninformative prior, we would have to include negative values of climate sensitivity, right? To allow the possibility that CO2 causes cooling. We don’t because we wish to inform the model that this is impossible. Why privilege only this assumption, as the uniform prior does, and if we are going to assume it cannot be negative, why then also assume it should be close to 0 (as Nic Lewis appears to be doing)? I think he’s trying to have his cake and eat it too…

22. Tom Curtis says:

On the choice of priors, Annan and Hargreaves express a preference for expert priors, and write:

“In the absence of a cloistered expert, one reasonable approach would be to look back through the literature to see what climate scientists actually wrote prior to the observation and analysis of modern data sets.
After what was perhaps the earliest early estimate for S of around 5 C (Arrhenius, 1896), all subsequent model-based estimates have been clearly lower (Manabe and Wetherald, 1967; Hansen et al., 1983), culminating in the “likely” range of 1.5–4.5 C (NRC, 1979). This estimate was produced well in advance of any modern probabilistic analysis of the warming trend and much other observational data, and could barely have been affected by the strong multidecadal trend in global temperature that has emerged since around 1975. Therefore, it could be considered a sensible basis for a credible prior to be updated by recent data.”

Personally, I would prefer using the smoothed distribution of values of modern OA-GCMs to generate priors. Models are, in effect, the most precise encapsulation of our current theories. What they generate is not evidence, per se, but our predictions based on those theories. Rather than being used as one more line of evidence which is then synthesized with other data to produce an estimate of the ECS, their epistemological status would be much better encapsulated by using them to generate a prior which is then used to generate a posterior estimate from the other data.

An alternative approach would be to use climate sensitivity estimates from paleo data to generate a prior for modern data. The intuition here is that climate sensitivity will vary depending on different conditions on Earth, including the arrangement of the continents and the extent of glaciation. Modern conditions, therefore, are unlikely to match those estimated from the full range of paleo conditions, but will likely lie in the range of previous conditions. Hence paleo conditions generate a suitable prior for estimating modern conditions.

I don’t think any of these approaches stands out as clearly superior (or inferior) to the others.
All would result in a more tightly constrained estimate of ECS than a uniform prior without introducing an obvious bias towards low values as does the Jeffreys prior (which assigns a maximum a priori likelihood to a climate sensitivity of 0, and an ocean diffusivity of 0 – both of which are clearly unphysical values).

23. Fred Moolten says:

Tom – The confusion is widespread. Otto et al estimated effective climate sensitivity (EFS) rather than ECS while referring to it as ECS. Their only acknowledgment that they were conflating the two was one reference (reference 13 if I recall) to the paper by Kyle Armour et al describing how EFS is likely to underestimate ECS. In essence, energy balance models using observations derived under non-equilibrium conditions estimate EFS, according to the formula $N = F - \alpha \Delta T$, where $N$ is the planetary energy imbalance estimated from ocean heat uptake, $F$ is the change in forcing, and $\alpha$ is the feedback parameter. If $\alpha$ were constant, ECS could be derived from EFS by calculating $\alpha$ from the above equation and observational data, and then, as you imply, deriving ECS as $F_{2\times\mathrm{CO_2}}/\alpha$. Because $\alpha$ appears to diminish as the climate progresses toward equilibrium, this yields spuriously low ECS values. Subsequent to Otto et al, a number of papers employing the same method have been more careful to indicate that they are estimating EFS, and that its equivalence to ECS requires the assumption of a constant $\alpha$.

24. > The Jeffreys prior is not uninformative – quite the opposite.

Perhaps uninformed might be more accurate, since the idea seems to be to preserve the illusion of an econometric objectivity.

25. Tom Curtis says:

Fred Moolten, after reading Armour et al, and a few other sources, I see that you are right. Obviously the widespread confusion had spread to me, and I have misread the definition of “effective climate sensitivity”. Thank you for dissipating my confusion.

26. Tom, Thanks for the clarification regarding the IPCC range.
Fred, Thanks for clarifying EFS versus ECS. I remember we had a brief discussion about this a while back, but I had forgotten the difference.

27. In a Bayesian analysis, posterior =p= prior x likelihood, where prior and posterior describe what we know about the parameter of interest, in this case the equilibrium climate sensitivity. The likelihood has the information contained in the data. A Jeffreys prior is defined such that posterior =p= likelihood. In other words, the prior adds no information. We only know what the data tell us. Frame made two mistakes: First, they mistook an informative prior for an uninformative one. (Or they overlooked the Jacobian, which is the same thing.) Second, and this follows, their unwittingly informative prior did not have credible information.

28. Richard, I’m not really following what you’re saying. It seems that “informative” and “uninformative” mean something different in this context to what a layperson might imagine. Also, even though a uniform prior extending to 20 K gives an unphysically high probability of a high sensitivity, a Jeffreys prior seems to give an unphysically high probability of a low sensitivity (so, I don’t quite follow how you can say a prior adds no information). I don’t see how one can use statistical arguments to justify the use of what seems to be an unphysical prior. I’m also not quite sure why people are making a big deal out of an issue with a paper published in 2005, especially as Victor points out, people were well aware of this issue many years ago (from Annan & Hargreaves 2009, I assume). I’m also surprised that this isn’t presented more positively. The range from Lewis (2014) is similar to that generally presented by the IPCC. As Tom points out, it is narrower (1.2 – 4.5 K, rather than ~1 – ~6 K) but all of this work is essentially determining (as Fred highlights) the Effective Climate Sensitivity, rather than the Equilibrium Climate Sensitivity, so that isn’t all that surprising.

29.
There’s no prior that adds no information. Some priors add information chosen explicitly by the analyst, other priors add what they happen to add. The Jeffreys prior adds information that’s dependent on the way empirical data is used. That’s in no way unique and independent of the choices the analyst has made. In applications like the one considered here the models chosen for use in the analysis have a strong influence on the Jeffreys prior. Formulating even the same models in different ways would change the Jeffreys prior. Jeffreys priors are reasonable rules to use, when nothing is known about the methods and the ways empirical data is collected and used, but they have no deeper significance as non-informative priors.

30. Tom Curtis says:

Richard Tol, in Bayesian analysis, the posterior {P(A|B)} equals the likelihood {P(B|A)} times the prior {P(A)} divided by the probability of B {P(B)}. The posterior is proportional to the likelihood times the prior. Further, P(A) can only result in P(A|B) = P(B|A) if P(A) = P(B). We typically do not know P(B), which is why the proportionality relationship is often used. Perhaps you meant merely that P(A|B) is proportional to P(B|A). I assume that is what you meant by “=p=”. However, we have to assume the Jeffreys prior is proportional to P(B) and that an automatic procedure can infallibly pick out a proportional value to an unknown PDF based solely on knowledge of the likelihood. In short, your claim is nonsense.

What is more, Lewis himself provides two examples in which the Jeffreys prior is clearly not proportional to P(B), and in which the posterior is not proportional to the likelihood, those examples being from his discussion of C14 dating, and his discussion of climate sensitivity discussed above in which his prior peaks at a climate sensitivity of 0 C, whereas his posterior peaks at a climate sensitivity near 2.2 C.
Finally, if it were indeed an automatic virtue that our posterior should match the likelihood, then the logical step would be to become a frequentist and stop pretending that bayesian methods had any virtue at all. That, however, is not the case and Lewis needs to provide more reason for using an unphysical prior than calling the advertorial in a naming convention he currently uses.

31. The problem of the uniform prior in S was and to some extent still is that it gives a very high weight for high values of sensitivity, and that the empirical evidence is very weak in excluding (or confirming) high sensitivities. The combination of very high prior probability and empirical data of little evidential power leads to these problems. All priors with a relatively strong cutoff at high S give rather similar results. How the prior behaves for small S (less than 1) is of less significance as the data has more power there. For this reason Lewis gets rather similar results as all others, who use explicitly or implicitly a prior with a cutoff for high S. Typical forms of this cutoff are 1/S or 1/S^2. I like the latter more as it corresponds to a bounded prior for feedback strength near one, while the uniform prior in S has a strong divergence at that point and 1/S a weaker divergence.

32. Maybe someone who knows more about Bayesian statistics can clarify something for me. If I was naively to try and determine climate sensitivity (EFS) using models, I might simply run a whole suite of models and then compare the results for the 20th century with observations. That would – I think – be a frequentist approach. The whole point (I thought) of Bayesian analysis was to use some prior knowledge of the quantity you’re trying to constrain, to then tighten your constraint on this quantity. If so, using a prior that is largely unconstrained by prior knowledge seems no better than doing a simple frequentist analysis.
Possibly, this is what Tom and Pekka (and others) have already said, but maybe someone could clarify, for my benefit, if not for anyone else’s.

33. Tom Curtis says:

Pekka, if I understand you correctly, you prefer a prior of 1/S^2 simply because higher values of S are not well constrained by the evidence. That in turn looks to me like only looking for your lost coins under the street light rather than where you dropped them because you can better constrain the search under the street light. The approach is backwards, and introduces a bias for the wrong reasons. If the evidence to constrain higher sensitivities is difficult to come by, that is just a constraint we face in our empirical approach, not a reason to change our priors so as to pretend the problem does not exist.

In Bayesian theory, the prior is the a priori estimate of the probability of a given value. A priori, we certainly do not have any reason to think that a climate sensitivity of 1 is nine times more probable than a climate sensitivity of three (which is what your prior suggests). Indeed, based on a back of envelope calculation, we have good reason to think the climate sensitivity is around 3 C per doubling of CO2. Specifically, taking the albedo adjusted TOA insolation as the sole forcing, and the GMST as the temperature response, then the climate sensitivity factor is 0.823. Ignoring albedo it rises to 1.181, with a climate sensitivity of 4.4 C. Including CO2 forcing, the climate sensitivity factor drops to 0.697, with a climate sensitivity of 2.6 C. Consequently, we know going into the problem that any reasonable prior, ie, prior estimate of the probability that A, will include that range of values in their most likely range, or very near it.
If it does not, and if you choose your prior to cut off the likelihood of a posterior PDF with high probabilities at high climate sensitivities, then you are no longer choosing a prior probability density function at all, and are undercutting the entire theoretical basis of using Bayesian methods.

34. Tom Curtis says:

Sorry, messed up on the simple calculations. The climate sensitivity factors and climate sensitivities from the simple calculations should be:

TSI + albedo + CO2: 1.04 C/Wm-2 and 3.9 C

TSI + albedo: 1.2 C/Wm-2 and 4.5 C

TSI: 0.8 C/Wm-2 and 3.1 C

35. Rob Nicholls says:

I’ve been trying and failing to get my head around Bayesian stats for a while now. Beyond the very basic stuff which was covered in a simple OU textbook that I found, it all seems bewildering. Intuitively it seems to me that Lewis’s pretty 3D graph of the Jeffreys prior suggests that the Jeffreys prior would in this case pull the posterior distribution towards lower estimates of climate sensitivity, but I don’t know enough to be able to say that Lewis’s choice of prior is mistaken or suboptimal. I found an old preliminary draft paper by Yang and Berger (www.stats.org.uk/priors/noninformative/YangBerger1998.pdf) which says “Note that in specific situations Jeffreys often recommended noninformative priors that differed from the formal Jeffreys prior” and that “there are many well known examples in which the Jeffreys prior yields poor performance” due to dependence among parameters, but I’ve no idea whether this is relevant here.

36. Rob, I think what you say about the Jeffreys prior is correct. As Tom says, the posterior is proportional to the likelihood times the prior (divided by the probability of the observational quantity), therefore the choice of prior must affect the posterior and the higher the probability of a low sensitivity in the prior, the greater the probability must be in the posterior.
As Pekka says, however, the influence of this is not that great, given that the data constrains the low end of the sensitivity quite well. So, in some sense, using a Jeffreys prior doesn’t seem to produce too silly a result. That still doesn’t mean – in my view – that it is the optimal prior, as the expert priors that Annan & Hargreaves use seem to produce sensible results and are more physically motivated than the Jeffreys prior (which I don’t think Lewis even attempts to motivate physically at all).

37. Eli Rabett says:

ATTP, Annan & Hargreaves 2009 was kicking around as a preprint in various forms from 2006 at the latest. Simply could not find a home. The people who wrote the AR4 were certainly aware of the issue. As James wrote in 2007:

Section 9.6 “Observational Constraints on Climate Sensitivity”, contains the following:

“Note that uniform prior distributions for ECS [equilibrium climate sensitivity], which only require an expert assessment of possible range, generally assign a higher prior belief to high sensitivity than, for example, non-uniform prior distributions that depend more heavily on expert assessments (e.g., Forest et al., 2006).”

Many people may think this statement is too trivial to be worth making much of, but when I made essentially the same point about a uniform prior implying high prior belief in high sensitivity, Allen and Frame dismissed it as “just a rhetorical flourish”. This statement from the IPCC also appears to directly contradict much of the peer-reviewed literature, which claims that uniform priors represent ignorance. It is encouraging to see that it is now the consensus of 2,500 climate scientists that this is not the case 🙂

38. Eli Rabett says:

To another point, Eli notes that Nic Lewis acknowledges Annan and Hargreaves (2009) and the work that A&H did prior to that wrt the uniform prior, in his Climate Audit post but hears that this is not acknowledged in the paper itself.
If so, this is exactly the sort of behavior that would send Steve McIntyre over the moon in full flight. Strokes and folks Eli imagines.

39. Eli,
Thanks. I gather that Steve McIntyre was the one who pointed out that Nic had not acknowledged Annan & Hargreaves. There’s probably still time to get it into the paper, but it seems to be available in Early Online Release, so maybe not. It’s pretty poor that a referee didn’t notice this and pretty poor that Nic didn’t include a reference, as he must have known about their paper. AFAICT, from his comments on the Climate Audit post, he thinks their expert priors are unjustified. I would argue that they’re more justified than a Jeffreys prior and – AFAIA – if you aren’t going to use prior information to set your prior, it’s not obvious why one would use Bayesian techniques.

40. Rob Nicholls says:

Thanks ATTP. As usual I’ve been rather slow to spot the obvious, which you and several commenters have pointed out. It’s dawned on me that perhaps I don’t need to become an expert in Bayesian analysis to spot the fact that Lewis’s prior, with a climate sensitivity of zero as the most likely figure, is ludicrous. Even most pseudoskeptics seem to acknowledge that a climate sensitivity of zero is physically impossible. It seems that Lewis has found, perhaps inadvertently, a way of incorporating one of the more ridiculous denier myths into a statistical model. It is framed as a Jeffreys prior (a complex statistical concept) and that gives it a veneer of respectability that perhaps it doesn’t deserve. I’m happy to be corrected – I can’t pretend to understand the Jeffreys prior or its strengths or limitations compared to other priors.

41. Marco says:

Eli, ATTP, at least the failure to include Annan and Hargreaves tells us neither reviewed the paper.

42. Marco,
Yes, that may be true. However, I have a real aversion to mentioning my own papers when reviewing other papers.
I’m currently, however, in a quandary as I’m busy reviewing a paper that has ignored quite a lot of my work that would probably be relevant. I suspect that I will overcome my quandary 🙂

43. Tom,
That was not my reason. The reason is rather that a high sensitivity requires a feedback coefficient that’s very close to one. I consider feedbacks to be based on physical mechanisms that do not strongly favor values very close to 1.0. If this is accepted, the 1/S^2 tail is an unavoidable conclusion. Giving such a role for feedbacks is not the only possibility, but that’s what I consider most reasonable, and therefore my favored behavior for the tail.

Another way of looking at the situation is to notice that empirical data tells little about the possibility of large S. Therefore arguments about such values are not based on empirical data but on other arguments. That was clearly the case in 2005. AR5 tells that now we have more empirical evidence against very high S. Still it may be that other arguments are more restrictive.

44. @Tom C

“we have to assume […] that an automatic procedure can infallibly pick out a proportional value to an unknown PDF based solely on knowledge of the likelihood”

The automatic procedure is described by Jeffreys and many a textbook since.

@Pekka Let’s not go there again. Jeffreys’ Prior is uninformative given the likelihood.

45. Richard,

Let’s not go there again.

Why not? I think it might be interesting.

46. Richard,
There are many uninformative priors. Jeffreys proposed a simple rule that’s fully reasonable, but it can be applied to any set of parameters. After a choice is fixed all further transformations leave the result invariant. Choosing some specific presentation of empirical data as the fundamental space is not justified by any solid principle. The particular choice may be based on some common practice or it may be subjectively picked by the analyst. Neither makes the particular choice less informative than many other choices.
Deciding that Jeffreys’ prior is used before the consequences are known reduces potential for intentional subjective influence, but does not make the choice particularly uninformative. There simply isn’t any solid argument to justify that Jeffreys’ prior is particularly uninformative.

47. Steve koch says:

“Maybe someone who knows more about Bayesian statistics can clarify something for me. If I was naively to try and determine climate sensitivity (EFS) using models, I might simply run a whole suite of models and then compare the results for the 20th century with observations. That would – I think – be a frequentist approach. The whole point (I thought) of Bayesian analysis was to use some prior knowledge of the quantity you’re trying to constrain, to then tighten your constraint on this quantity. If so, using a prior that is largely unconstrained by prior knowledge seems no better than doing a simple frequentist analysis. Possibly, this is what Tom and Pekka (and others) have already said, but maybe someone could clarify, for my benefit, if not for anyone else’s”

I’m not an expert on Bayesian analysis but I spent a few years programming Bayesian models to do realtime real world prediction. In those applications, the point of the prior was to constrain the solution space by rejecting events that were no longer physically possible and to increase the probability (in a prorated way) of the events that were still possible. The Bayesian approach worked well in this particular set of applications (though it was like pulling teeth to get it right).

If you were to try to do this approach with the climate models (which is a cool idea), I think the climate models would have to be orthogonal wrt each other (i.e. represent different, non overlapping parts of the solution space), which is not the case currently and may be impossible to do.

48. Steve,
Thanks.

In those applications, the point of the prior was to constrain the solution space by rejecting events that were no longer physically possible and to increase the probability (in a prorated way) of the events that were still possible.

Indeed, that is what I thought the whole point of Bayesian analysis was – use your prior knowledge to reject regions of parameter space that are unphysical. Therefore, if you use statistical arguments only (ignoring your prior knowledge of the physical system) then I don’t really see the point of doing Bayesian analysis.

If you were to try to do this approach with the climate models (which is a cool idea), I think the climate models would have to be orthogonal wrt each other (i.e. represent different, non overlapping parts of the solution space), which is not the case currently and may be impossible to do.

I’m not quite sure what you’re suggesting here. What I was suggesting is that one can use climate models that depend only on climate sensitivity and ocean diffusivity to model the 20th century warming. You then weight these by how well the results match the known 20th century warming (which has its own probability, given the uncertainties). That, by itself, would – I think – be a frequentist approach. If you were to then combine this with a prior, you’d get a Bayesian approach.

49. Steve koch says:

“I’m not quite sure what you’re suggesting here. What I was suggesting is that one can use climate models that depend only on climate sensitivity and ocean diffusivity to model the 20th century warming. You then weight these by how well the results match the known 20th century warming (which has its own probability, given the uncertainties). That, by itself, would – I think – be a frequentist approach. If you were to then combine this with a prior, you’d get a Bayesian approach”

I thought you were referring to the existing climate models, huge chunks of existing code, whoops.
Yes, what you propose could be done easily but, aside from ease of implementation, why do you want to simplify the models to just climate sensitivity and ocean diffusivity? For example, climate sensitivity is not necessarily a fixed number, it might vary quite a bit based on a large number of parameters.

50. Steve,

why do you want to simplify the models to just climate sensitivity and ocean diffusivity?

I don’t. Those are the only two parameters in the models used by Frame et al. (2005) and in Nic Lewis’s recent paper. That’s the only reason I used that as an example of a possible model.

For example, Climate sensitivity is not necessarily a fixed number, it might vary quite a bit based on a large number of parameters.

Again, I’m not quite sure what you mean. Presumably there is – in reality – a single climate sensitivity. Of course, different models may produce different values, but that’s kind of why you need to do these statistical analyses. An additional issue is that the models used by Frame et al. (2005) and in Nic Lewis’s recent work make some simplifying assumptions that may not be correct in reality. Hence the climate sensitivities that their models suggest are probably lower estimates, rather than accurate representations. However, that may be what you were pointing out at the beginning of your comment. If one were to do the Lewis analysis in more detail one could use full GCMs, rather than simple energy balance models.

51. For the sake of argument let’s suppose that the models agree on all details including the heat capacity of the oceans, with the exception of the rate at which the oceans take up the extra heat attributable to global warming. (This is certainly one of the biggest unknowns in my feeble attempts to understand sensitivity.)

As an abstraction of this situation, consider a resistor R of unknown value in series with a capacitor C = 1000 uF, with adequate voltage rating.
After connecting 10V across this circuit, the voltage across C, initially 0, will rise asymptotically to 10V. We would like to know the voltage across C after one second.

Q1: Is assuming a prior of a given kind (Jeffreys, whatever) in the range 0-10V for voltage across C after one second equivalent to assuming a prior of the same kind for R?

Q2: If not, which prior of that kind would you prefer? In particular does your answer depend on the kind of prior?

52. Tom Curtis says:

Pekka, you’re going to have to help me out here. According to the IPCC, the climate feedback parameter (α in the equation α=(ΔQ-ΔF)/ΔT) is the inverse of the climate sensitivity factor (ie, the λ in the equation ΔT =λΔF). There is no a priori reason to think that either of these is limited to values below 1, and certainly not the climate feedback parameter, which gets increasingly small as the climate sensitivity gets increasingly large.

I assume you are in fact referring to the “f” in the equation ΔT = ΔT(0)/(1-f), where f=f(Water Vapour)+ f(Lapse Rate)+f(Cloud Albedo Direct Effect)+…+f(Ice Albedo Effect), where the “…” includes all the individual feedbacks I have not enumerated. If so, then indeed we do not want f >= 1 except in rare conditions in which f decreases with further temperature response.

If that is the case, there are three problems with your argument. The first is that high f is not linearly correlated with high S, or S^2, so the argument does not justify the likelihood function. Second, we already have good reason to think that f is fairly high, and that values in the range up to 0.8 are quite plausible. The third is that to avoid a high risk of runaway warming or cooling, it is only necessary that further increases of f become increasingly improbable as f approaches 1.
Given that changes likely to affect one feedback making it stronger (increased warmth increasing the strength of the WV feedback, for example), are likely to make others weaker (Ice albedo feedback), or increase the strength of feedbacks of opposite sign (Lapse Rate feedback), that last condition seems in fact to be realized. It follows that the argument that f must be << 0.9 is weak.

Finally, if the current evidence suggests a climate sensitivity with a very likely range of 2.5-5 with one likelihood function, and 1-3 with another likelihood function, it is no justification of either function to say that the empirical data does not justify either the high or low values. Rather, both must stand on their merits. In this case the "merits" of the Jeffreys function is that it is uninformed, ie, represents no knowledge about the specifics of the case, and it assigns highest likelihood to S=0 (which is as implausible physically as f=1). Neither is actually a merit, or at least the former is not a merit when we have objective relevant knowledge (as we do), and the latter is a flaw with 1/S^2. In contrast either an expert prior based on the Charney report (as recommended by Annan) or based on a smoothed PDF of values from current climate models represents a reasonable estimate of what the physics tells us without detailed empirical knowledge of the case. They also both assign low likelihoods to unrealistically low and high values.

53. Captain Flashheart says:

Richard Tol said:

A Jeffrey’s Prior is defined such that posterior =p= likelihood. In other words, the prior adds no information. We only know what the data tell us.

But there’s no way this can possibly be correct. The only way that posterior can equal likelihood (or, more accurately, have the same functional form in the proportional relationship) is if the prior is uniform, and strictly speaking if it is uniform on an infinite range (i.e. not a formal likelihood).
There is no other way that the posterior can equal the likelihood. In fact this is only true for a Jeffreys prior when the likelihood is normal, in all other cases posterior and likelihood are not equal.

The Jeffreys prior is NOT defined such that “posterior=p=likelihood.” It is defined such that the prior is equal to the square root of the determinant of the Fisher information. It is intended to be invariant to transformations of scale, and to be “objective” in the sense that it uses an objective rule based on the likelihood (i.e. data-driven). It is NOT intended to give a formal equivalence between posterior and likelihood. It is also not necessarily the case that a data-driven prior is a good idea (the clue is in the word “prior”) and, as Tom points out above, since we have an alternative source of prior information (physics!) there is no reason to restrict ourselves to such an uninformative prior based on the data only.

Once again, I think Tol is showing his ignorance of statistical theory. He displayed it very well in that thread at Gelman’s place (which he ran away from), and I think he’s done it again here.

54. @TC: According to the IPCC, the climate feedback parameter (α in the equation α=(ΔQ-ΔF)/ΔT) is the inverse of the climate sensitivity factor (ie, the λ in the equation ΔT =λΔF).

The precise wording (on p.1450 of WG1) is that “[the climate feedback parameter] varies as the inverse of the effective climate sensitivity.” First, the λ in your formula is not effective climate sensitivity but equilibrium climate sensitivity, ECS. Second, “varies as” is not the same as “is” since it allows the numerator to be other than unity. The definition continues as follows.
“Formally, the Climate Feedback Parameter (α; units: W m–2 °C–1) is defined as: α = (ΔQ – ΔF)/ΔT, where Q is the global mean radiative forcing, T is the global mean air surface temperature, F is the heat flux into the ocean and Δ represents a change with respect to an unperturbed climate.”

Oddly the definition in AR5 does not specify the numerator, though AR4 gives it as the no-feedback response to a doubling of CO2, frequently cited as 3.7 W/m2. So there is nothing to prevent both the effective climate sensitivity and the climate feedback parameter being substantially greater than 1. And this indeed happens in Chapter 13: the caption for Box 13.1, Figure 1 on p.1176 of AR5 WG1 says “The sum of these two terms is given for a climate feedback parameter α of 0.82, 1.23 and 2.47 W/m2/°C (corresponding to an equilibrium climate sensitivity of 4.5, 3.0 and 1.5ºC, respectively).” So even though AR5 has failed to specify the numerator in the definition, it is clear that some contributors to AR5 continue to take it to be 3.7 W/m2.

55. Tom Curtis says:

Vaughan Pratt, granting your point that the climate sensitivity parameter (λ) is defined for the equilibrium condition, and the climate feedback parameter (α) is defined for non-equilibrium conditions, it still remains true that at equilibrium λ= ΔT/ΔF = 1/α. As we are discussing the use of priors in finding the PDF of the ECS, for this discussion we can treat the two as the inverse of each other without qualification. Further, the advantage of the climate sensitivity parameter (or factor) is that we need no longer specify climate sensitivity by the reference case of doubled CO2. Rather, we can specify it with reference to any forcing regardless of size (although the approximation decreases in accuracy with very large forcings). If we wish to determine the climate sensitivity with respect to a doubling of CO2 we need only find 3.7 x λ.
I don’t think the points you raise require clarification beyond this, nor impact on my argument regarding Pekka’s claims at all.

56. The ocean thermal diffusivity is a spread of values, and the typical approach to solving this kind of problem is to use the maximum entropy prior as an estimate of D. The problem actually simplifies from the classical heat equation errorFunction() solution.

http://theoilconundrum.blogspot.ch/2013/03/ocean-heat-content-model.html

And of course, the econometric crowd is starting to pick up on this approach, because they borrow everything from mathematical physics, see Maximum entropy distribution of stock price fluctuations.

Look at Figure 5 in particular, the diffusion coefficient has a damped exponential prior, representative of a maximum entropy estimate. If the mean is all we know, then the standard deviation is assumed equal to the mean. In the heat equation, the uncertainty is flipped due to a ratio distribution, and so goes in the denominator, thus giving the fat-tail 1/t^2 behavior.

Therefore the Nic Lewis approach is irrelevant because he applies the wrong physics.

57. Richard S.J. Tol says:

@Flashheart I was reasoning rather loosely.

@Wotts Why not? Because Pekka and I agree on the substance, but disagree on how to present this to novices in information theory.

58. Richard,

disagree on how to present this to novices in information theory.

How so? Pekka thinks you should actually try and explain it, and you think you should just be rude and condescending?

59. I take it that,

I was reasoning rather loosely.

is code for “I was wrong”?

60. Richard S.J. Tol says:

@Wotts If you wish. I don’t think a comments thread is the best place to do higher stats. Pekka disagrees. I’m not gonna argue with him about that.

61. Richard,
You may well be right, but I’m not sure what this has to do with higher stats. The basic idea behind using Bayesian analysis seems fairly straightforward.
People don’t need to understand higher stats to have a sense of whether or not a statistical argument alone is sufficient to determine some physical quantity. There seem to be some who are arguing that climate scientists need to work more closely with statisticians. From what I’ve seen of those who focus on statistics, while ignoring physics, this is highly unlikely to be beneficial, especially as there appear to already be some climate scientists who are extremely skilled statisticians (James Annan being one).

62. I have time for a short comment only.

When variables are continuous, it’s impossible to give any uniquely preferred interpretation for uninformative, as uninformative means that any set of values that occupies the same volume in the space of possible values is equally likely. That definition refers to a measure that tells which volumes have the same size, but there’s an infinity of measures without any a priori preference for one of them. Only physical arguments may help in deciding what’s more natural than some alternatives. The setup of data collection and handling is not such a physical argument, but that’s the approach that leads to Jeffreys’ prior. From the point of view of physics Jeffreys’ prior is just one out of an infinity, not automatically less informative than other choices.

It’s true that Jeffreys’ prior is often a reasonable one, but it may be nonsensical as well, as it is in the example of radiocarbon dating discussed recently.

63. @Pekka I disagree: The problem with the radiocarbon dating was degeneracy.

64. Richard,
What does that actually mean? As I understand it, the Jeffreys prior produced a ridiculous result, hence I fail to see how what Pekka has said was wrong. You’re not doing a Keenan are you?

65. AnOilMan says:

Richard, speaking of higher stats, I’m still offering $1000 for your ‘statistical’ papers. I say ‘statistical’ papers to differentiate them from real papers that do exist.
(I note that you still haven’t correlated those papers to reality. Must be rough.)

66. AOM,
Judging by some of the discussions around this topic, correlating statistical analysis to reality is not high on some people’s agendas.

67. Tom Curtis says:

Anders, it may be safer to say that Jeffreys’ prior plus Lewis’s absurdly idealized example produced a ridiculous result. In real examples where the calibration curve is not monotonic, is never entirely flat, and has a definite width (due to uncertainty in calibration), the Jeffreys prior may be quite suitable. Or not – I don’t know enough to know either way. I suspect with real calibration curves, Jeffreys’ prior is a reasonable choice of prior, without believing it to be better than a uniform prior, or an expert prior based on other dating cues.

The same can be said on climate sensitivity. Jeffreys’ prior does not produce absurd results on climate sensitivity, and produces better results than the quasi-uniform prior. It just produces worse results than reasonable choices of expert prior.
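To get a feel for how strongly the prior can shape the result in a problem of this kind, here is a minimal toy sketch. It is not the calculation in Frame et al. (2005) or in Lewis's paper. It assumes a single made-up observation of the feedback parameter α = 3.7/S with Gaussian error (the values 1.23 and 0.5 W/m²/K are invented for illustration), and compares a prior uniform in S on 0 to 20 K with the 1/S^2 prior discussed earlier in the thread. All function names are hypothetical. In this particular toy the likelihood is a fixed-width Gaussian in α, so the Jeffreys rule would also give a prior proportional to 1/S^2; that is a property of the toy, not a statement about Lewis's actual prior.

```python
import math

F_2X = 3.7        # forcing for doubled CO2, W/m^2 (value used in the thread)
ALPHA_OBS = 1.23  # assumed "observed" feedback parameter, W/m^2/K (illustrative)
ALPHA_SD = 0.5    # assumed observational uncertainty, W/m^2/K (illustrative)

def likelihood(s):
    """Gaussian likelihood of the assumed observation, written as a function of S."""
    return math.exp(-0.5 * ((F_2X / s - ALPHA_OBS) / ALPHA_SD) ** 2)

def uniform_prior(s):
    """Prior that is flat in S over the grid range."""
    return 1.0

def inverse_square_prior(s):
    """Prior proportional to 1/S^2, i.e. flat in the feedback parameter."""
    return 1.0 / s ** 2

def posterior_quantile(prior, q, s_min=0.05, s_max=20.0, n=20000):
    """Quantile q of the posterior for S, using a midpoint-rule grid."""
    ds = (s_max - s_min) / n
    grid = [s_min + (i + 0.5) * ds for i in range(n)]
    weights = [likelihood(s) * prior(s) for s in grid]
    total = sum(weights)
    running = 0.0
    for s, w in zip(grid, weights):
        running += w
        if running >= q * total:
            return s
    return grid[-1]

if __name__ == "__main__":
    for name, prior in (("uniform in S", uniform_prior),
                        ("1/S^2", inverse_square_prior)):
        lo, mid, hi = (posterior_quantile(prior, q) for q in (0.05, 0.5, 0.95))
        print(f"{name:>12}: 5% = {lo:.1f} K, median = {mid:.1f} K, 95% = {hi:.1f} K")
```

With these assumed numbers the uniform prior leaves a much longer upper tail than the 1/S^2 prior, because the toy data barely distinguish between large values of S. The exact percentiles depend entirely on the invented inputs and on the 20 K cutoff.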

68. Degeneracy was not at all the essential point. Jeffreys’ prior would be equally nonsensical for a case that’s as close to that case as can be reached without any degeneracy. The main point was that the method is well understood and known to distort the outcome. Jeffreys’ prior does not correct for this known distortion and therefore leads to a highly informative outcome according to information known to be seriously wrong.

The case is so clear exactly because the method is so well understood. The present case is different both because the method is not well understood and because Jeffreys’ prior seems reasonable. Being reasonable does not mean that it’s less informative than other priors defined aiming for uninformativeness.

In a sense Jeffreys’ prior is also in this case highly informative as it has the cutoff at high S. That Jeffreys’ prior has the cutoff is probably related to the arguments that make me like the cutoff, but the agreement is incomplete and possibly accidental.

69. Pekka,

That Jeffreys’ prior has the cutoff is probably related to the arguments that make me like the cutoff, but the agreement is incomplete and possibly accidental.

You at least appear to be trying to make an argument for using Jeffreys’ prior based on our knowledge of climate sensitivity. I don’t think Nic Lewis has tried to do the same.

70. @TC: it still remains true that at equilibrium λ= ΔT/ΔF = 1/α.

News to me. If λ is say 4.5 °C then α would be 0.22 W/m2/°C. Don’t you think that’s awfully low for the climate feedback parameter, regardless of how close the system is to equilibrium?

71. VP,
That’s not right, is it? If EFS is 4.5 °C, then $\alpha = 3.7/4.5 = 0.82$ Wm-2K-1, and $\lambda = 1.22$ K/Wm-2.

I’m also not sure I get your issue with

$\alpha = \frac{\Delta Q - \Delta F}{\Delta T}$.

If feedbacks are assumed to be linear, then this is true at any time, whether in equilibrium or not.
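The arithmetic in this exchange is easy to check with a short sketch. It assumes, as the comments do, a forcing of 3.7 W/m² for doubled CO2 and a linear relation between forcing and equilibrium warming; the function names are made up for illustration.

```python
# Forcing from a doubling of CO2 in W/m^2 (the value used in the thread).
F_2X = 3.7

def feedback_parameter(sensitivity_k):
    """Feedback parameter alpha (W/m^2/K) for a sensitivity in K per CO2 doubling."""
    return F_2X / sensitivity_k

def sensitivity_parameter(sensitivity_k):
    """Sensitivity parameter lambda (K per W/m^2), the reciprocal of alpha."""
    return sensitivity_k / F_2X

if __name__ == "__main__":
    for s in (1.5, 3.0, 4.5):
        print(f"S = {s:.1f} K: alpha = {feedback_parameter(s):.2f} W/m^2/K, "
              f"lambda = {sensitivity_parameter(s):.2f} K/(W/m^2)")
```

For sensitivities of 4.5, 3.0 and 1.5 K this gives α of about 0.82, 1.23 and 2.47 W/m²/K, the same values as the AR5 figure caption quoted in comment 54, and a λ of roughly 1.2 K/(W/m²) for the 4.5 K case. The 0.22 in comment 70 is what comes out if 4.5 is treated as λ itself, since 1/4.5 ≈ 0.22.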

72. Captain Flashheart says:

Richard S.J. Tol, you weren’t “reasoning loosely”, you were wrong. W.R.O.N.G. Along with meta-analysis and regression, Bayesian stats is another thing you don’t understand. You really are a pernicious [Mod : redacted].

73. Michael 2 says:

Richard S.J. Tol says: “Pekka and I agree on the substance, but disagree on how to present this to novices in information theory.”

Consider me a novice and I have no idea which presentation will manage to soak in. I hope you will both argue your points in your own way so I can comprehend it although this might not be the place to do it.

74. Tom Curtis says:

Vaughan, the right hand equality of the equation, λ= ΔT/ΔQ = 1/α, follows from the definition of α from the IPCC as:
α = (ΔQ − ΔF)/ΔT
and the fact that at equilibrium, ΔF = 0, where ΔQ is the change in forcing, and ΔF is the change in ocean energy flux (as per the IPCC definition).

The IPCC defines the “climate sensitivity parameter” as “… (units: °C (W m–2)–1) refers to the equilibrium change in the annual global mean surface temperature following a unit change in radiative forcing”

From the units, you can see the definition is the same as that from Wikipedia, which writes:

“Although climate sensitivity is usually used in the context of radiative forcing by carbon dioxide (CO2), it is thought of as a general property of the climate system: the change in surface air temperature (ΔTs) following a unit change in radiative forcing (RF), and thus is expressed in units of °C/(W/m2). For this to be useful, the measure must be independent of the nature of the forcing (e.g. from greenhouse gases or solar variation); to first order this is indeed found to be so.

The climate sensitivity specifically due to CO2 is often expressed as the temperature change in °C associated with a doubling of the concentration of carbon dioxide in Earth’s atmosphere.

For coupled atmosphere-ocean global climate models (e.g. CMIP5) the climate sensitivity is an emergent property: it is not a model parameter, but rather a result of a combination of model physics and parameters. By contrast, simpler energy-balance models may have climate sensitivity as an explicit parameter.

ΔTs = λ*RF

The terms represented in the equation relate radiative forcing (RF) to linear changes in global surface temperature change (ΔTs) via the climate sensitivity λ.”

(I cite wikipedia only as a convenience rather than hunting through the IPCC and/or literature to find the equation in use 😉 )

From simple algebra, and switching to the IPCC’s symbols, we have λ = ΔT/ΔQ

Of course, λ =/= the equilibrium climate response to doubled CO2. Rather, λ*3.7 W/m^2 = the ECS, and 1/α = λ if the climate is in equilibrium. Because the definition of α is tied to the EFS, it may not equal 1/λ when ΔF =/= 0, and in current conditions is likely to be less than that (ie, EFS < ECS).

I apologize for using ΔF for radiative forcing in my earlier post, which in the context of the IPCC definition is liable to have caused confusion.
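Written out with the IPCC symbols, the chain of relations in this comment looks as follows. This is only a restatement under the comment's own assumptions (a linear response, and a system that has reached equilibrium); the symbol $F_{2\times}$ for the 3.7 W/m² forcing from doubled CO2 is introduced here for brevity and is not used in the thread.

```latex
\begin{align}
\alpha &= \frac{\Delta Q - \Delta F}{\Delta T}
  && \text{(IPCC definition)} \\
\Delta F = 0 \;\Rightarrow\; \alpha &= \frac{\Delta Q}{\Delta T} = \frac{1}{\lambda}
  && \text{(at equilibrium)} \\
\mathrm{ECS} &= \lambda \, F_{2\times} = \frac{F_{2\times}}{\alpha},
  \qquad F_{2\times} \approx 3.7\ \mathrm{W\,m^{-2}}
\end{align}
```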

75. Tom Curtis says:

People pursuing this topic might be interested to note that Yang and Berger (1998) include a catalogue of “non-informative priors”, including the original “non-informative prior” advocated by Bayes and Laplace. That is interesting because one of Lewis’s main complaints about the uniform prior is that it is not “non-informative”, showing that despite his mathematical acumen, he really does not know what he is talking about when it comes to the meta-issues on which he bases so much of his reasoning.

76. @ATTP: That’s not right is it. If EFS is 4.5 °C,

My bad, I stupidly replaced EFS with λ without thinking, sorry about that. Ignore my last comment.

I’m also not sure I get your issue with α = (ΔQ − ΔF)/ΔT.

That was the formula I gave in my second last comment, citing AR5, with which I have no issue (relevant to this discussion anyway). I do have an issue with Tom’s formula α = ΔF/ΔT at equilibrium. If by ΔF in his formula he means what the IPCC writes as ΔQ − ΔF (which can be written Δ(Q − F), the change in the difference between radiative forcing and heat flux into the ocean), then I’m fine with it, but in that case I don’t see why he needs to specify “at equilibrium”. If he means anything else then I’d like to understand why that’s a good expression for α at equilibrium.

77. @TC: the fact that at equilibrium, ΔF = 0, where ΔQ is the change in forcing, and ΔF is the change in ocean energy flux (as per the IPCC definition).

I don’t see why ΔF should suddenly vanish at equilibrium. A change in forcing should result in a change in ocean energy flux, whether at equilibrium or not.

78. VP,
At equilibrium the net heat flux into the system (ocean mainly) should be zero (i.e., there should be no energy imbalance), so I agree with Tom.

79. Anders, the reason you and Tom give for ΔF to be zero is equally a reason for ΔQ and ΔT to be zero.

In order to make sense of the expression (ΔQ − ΔF)/ΔT at equilibrium, the system must be nudged away from equilibrium by an amount ΔQ. This creates the necessary disequilibrium, however small, to result in ΔT being nonzero.

But to assume that ΔF remains zero while ΔT becomes nonzero is equivalent to claiming that there is no feedback.

What is your justification for defining the climate feedback parameter in a way that says there is no feedback?

80. VP,
I think you misunderstand the terms (Roger Pielke Sr does too, so no shame in that 🙂 ). Consider some time interval from time $= 0$ to time $= t$: the term $\Delta Q$ is the change in radiative forcing over that time interval, $\Delta T$ is the change in temperature over the same time interval, and $\Delta F$ is the change in system heat uptake rate over that time interval (i.e., the difference between the system heat uptake rate at the beginning and at the end). Consider a scenario where the system is initially in equilibrium (energy imbalance = system heat uptake rate = 0) and then something (increased atmospheric GHG concentration, for example) produces a change in radiative forcing. At some later time (time $= t$), the change in radiative forcing is simply the change due to the increased GHG concentration, the change in temperature is very obviously the difference between the temperature at time $= 0$ and time $= t$, and $\Delta F$ is the difference between the system heat uptake rate at time $= t$ and time $= 0$. However, if the system was in equilibrium at $t = 0$ then $\Delta F$ is simply the system heat uptake rate at time $= t$ (in practice, this is usually determined by averaging over some time interval, typically a decade).

Consider now a scenario where we double CO2 and let the system evolve back to equilibrium. In such a scenario $\Delta Q$ will be the change in radiative forcing due to a doubling of CO2 (3.7 Wm-2), $\Delta T$ will be the equilibrium climate sensitivity (ECS or, maybe, EFS) and $\Delta F$ will be the difference between the system heat uptake rate at the beginning and at the end. However, since the system started in equilibrium (no energy imbalance) and ended in equilibrium (no energy imbalance) $\Delta F = 0 - 0 = 0$.

81. ATTP,

You at least appear to be trying to make an argument for using Jeffrey’s prior based on our knowledge of climate sensitivity. I don’t think Nic Lewis has tried to do the same.

Not really. I don’t like Jeffreys’ prior in general in physical science applications.

Kass and Wasserman discuss priors based on formal rules in the spirit of Jeffreys’ prior. There are situations where it’s valuable that the prior is fixed by a rule that cannot be changed on a case-by-case basis. In science that’s not an important value, as correctness is more important than guaranteed limits on subjectivity.

Empirical methods may be connected to physical arguments that are a reasonable basis in the selection of the prior. In such cases Jeffreys’ prior may be justified, but without such arguments Jeffreys’ prior lacks justification in physical science applications.

82. Pekka,

Not really. I don’t like Jeffreys’ prior in general in physical science applications.

I wasn’t implying that you were trying to argue that it’s the best, but you have at least illustrated why it has some merit (from a physically motivated perspective) which is more than Nic Lewis appears to have done.

83. @ATTP: I think you misunderstand the terms

On the contrary, the terms ΔQ and ΔT in the expression (ΔQ − ΔF)/ΔT for the climate feedback parameter α mean something different from the ones you’re using here to show that ΔF = 0. Otherwise ΔF would always be zero.

What you want is that whenever a system happens to be in equilibrium, ΔQ and ΔT mean what you say above. But when not in equilibrium then they have the meaning needed for the expression for the climate feedback parameter to make sense, which is quite a different meaning.

Since the latter meaning is just as valid for a system in equilibrium as for one not in equilibrium, you have in effect defined two notions of climate feedback parameter, one appropriate for RF (radiative forcing as in the traditional formula ΔT = λΔQ) and the other for ERF (effective radiative forcing as in the new formula ΔT = (ΔQ − ΔF)/α). The latter is consistent with AR5 but the former, namely the definition α = 1/λ, is not.

Nowhere in AR5 will you find any combination of formulas from which λ = 1/α can be inferred. And for good reason: λ goes with RF while α goes with ERF.

AR5 WG1 explains all this in several places, in particular pp. 573, 576, 662, 664, and 665 (Box 8.1).

Quoting from p. 576 (in Chapter 7):

“In this report, following an emerging consensus in the literature, the traditional concept of radiative forcing (RF, defined as the instantaneous radiative forcing with stratospheric adjustment only) is de-emphasized in favour of an absolute measure of the radiative effects of all responses triggered by the forcing agent that are independent of surface temperature change (see also Section 8.1). This new measure of the forcing includes rapid adjustments and the net forcing with these adjustments included is termed the effective radiative forcing (ERF). The climate sensitivity to ERF will differ somewhat from traditional equilibrium climate sensitivity, as the latter include adjustment effects.”

(Boldface mine.)

And from p.661:

“As in previous IPCC assessments, AR5 uses the radiative forcing (RF) concept, but it also introduces effective radiative forcing (ERF). The RF concept has been used for many years and in previous IPCC assessments for evaluating and comparing the strength of the various mechanisms affecting the Earth’s radiation balance and thus causing climate change. Whereas in the RF concept all surface and tropospheric conditions are kept fixed, the ERF calculations presented here allow all physical variables to respond to perturbations except for those concerning the ocean and sea ice. The inclusion of these adjustments makes ERF a better indicator of the eventual temperature response. ERF and RF values are significantly different for anthropogenic aerosols owing to their influence on clouds and on snow cover. These changes to clouds are rapid adjustments and occur on a time scale much faster than responses of the ocean (even the upper layer) to forcing. RF and ERF are estimated over the Industrial Era from 1750 to 2011 if other periods are not explicitly stated. {8.1, Box 8.1, Figure 8.1}”

(Boldface mine again.)

84. Roger Pielke Sr [misunderstands the terms] too

Do you have a link? If it pans out I’ll add it to my collection of climate misunderstandings.

85. Tom Curtis says:

Vaughan, from
1) α = (ΔQ − ΔF)/ΔT
we have by straightforward steps
2) ΔF = ΔQ − αΔT
The units for α are (W/m^2)/K, so that the units for αΔT are W/m^2. More importantly, they tell us that the term represents a change in energy flux that changes approximately linearly with temperature. (“Approximately” because α changes value with time and with different forcing histories, although those changes are expected to be small.) The basic idea of energy balance models of the Earth’s atmosphere is that changes of temperature result in changes in OLR, and that if the energy balance is disturbed, changes in temperature will occur until the resulting change in OLR matches the initial perturbation, that is until
3) ΔQ=αΔT
from which it follows that at equilibrium, ΔF=0.

86. Tom Curtis says:

Vaughan:

“Nowhere in AR5 will you find any combination of formulas from which λ = 1/α can be inferred. And for good reason: λ goes with RF while α goes with ERF.”

No! α is related to Effective Climate Sensitivity, while λ is defined for the Equilibrium Climate Sensitivity. Both can be expressed in terms of Radiative Forcing (for convenience) or of Effective Radiative Forcing if we are distinguishing between the warming effects of different radiative forcings. However, “effective radiative forcing” and “effective climate sensitivity” are not defined with respect to each other, and are no more related than “effective climate sensitivity” is related to “radiative forcing”.

Again, check the IPCC definition, where we are given equation (1) in the previous post, and told “… Q is the global mean radiative forcing”.

87. Tom Curtis says:

Anders:

“I wasn’t implying that you were trying to argue that it’s the best, but you have at least illustrated why it has some merit (from a physically motivated perspective) which is more than Nic Lewis appears to have done.”

Nic Lewis attempts a very physical argument: that transforming the PDF of a pair of observable values {attributable warming (AW) and effective heat content (EHC)} yields a posterior PDF for the ECS and effective ocean vertical diffusivity (K(v)), which can be reduced to a posterior PDF for ECS alone; and that that PDF is identical to the one obtained using Jeffreys’ prior on ECS and K(v) in the model directly. This argument loses substantial force because, first, he used Jeffreys’ prior to obtain the initial PDF, and second, the equivalence of PDFs under such transformations is a general property of Jeffreys’ prior, and indeed the property it was designed for, so that his result is a mathematical truism given his initial use of Jeffreys’ prior. (That, at least, is how I understand the situation.)

The argument retains some force in that he claims the PDF for AW and EHC is obvious, ie, not at all contentious. If that is so, then his argument is physical and has some force. WebHubTelescope has disputed the validity of his model above, but as far as I can see, Lewis has merely used the same model as Frame et al so I am not sure that WHT is correct.

I would certainly like to see more discussion of this aspect of Lewis’s paper, to which, however, I can contribute little but (hopefully) will learn much from it.
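The "mathematical truism" point about transformations can be checked numerically. The sketch below is only a toy, not Lewis's or Frame et al's model: it assumes a single observable y = Q2X / S with Gaussian error, and the observed value, error, and support for S are all hypothetical. Under those assumptions Jeffreys' prior on S is proportional to |dy/dS|, and the posterior it gives should match the flat-prior posterior in y mapped back to S.

```python
import math

Q2X = 3.7                  # W/m^2 per CO2 doubling (value quoted earlier in the thread)
Y_OBS = 1.2                # hypothetical observed feedback, W/m^2/K
SIGMA = 0.3                # hypothetical Gaussian observational error, W/m^2/K
S_MIN, S_MAX = 0.5, 20.0   # assumed support for the sensitivity S, in K


def feedback(s):
    """Toy forward model (an assumption): the observable is y = Q2X / S."""
    return Q2X / s


def jeffreys_posterior_unnorm(s):
    """Gaussian likelihood times Jeffreys' prior, here |dy/dS| = Q2X / S^2."""
    likelihood = math.exp(-0.5 * ((Y_OBS - feedback(s)) / SIGMA) ** 2)
    return likelihood * Q2X / s**2


def trapezoid(f, a, b, n=20000):
    """Composite trapezoid rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return total * h


def normal_cdf(y):
    """CDF of the flat-prior posterior in y-space, N(Y_OBS, SIGMA^2)."""
    return 0.5 * (1.0 + math.erf((y - Y_OBS) / (SIGMA * math.sqrt(2.0))))


def prob_via_jeffreys(lo, hi):
    """P(lo <= S <= hi) by integrating the Jeffreys posterior over S."""
    return (trapezoid(jeffreys_posterior_unnorm, lo, hi)
            / trapezoid(jeffreys_posterior_unnorm, S_MIN, S_MAX))


def prob_via_transform(lo, hi):
    """Same probability from the y-space posterior; y decreases as S increases."""
    num = normal_cdf(feedback(lo)) - normal_cdf(feedback(hi))
    den = normal_cdf(feedback(S_MIN)) - normal_cdf(feedback(S_MAX))
    return num / den


print(prob_via_jeffreys(2.0, 4.5), prob_via_transform(2.0, 4.5))
```

The two numbers agree to within integration error, which is the sense in which the equivalence follows from the choice of prior rather than from anything physical. Whether the PDF of the observables is itself uncontentious is a separate question that this toy does not address.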

88. Tom, you’re quite right that I was conflating two distinct uses of “effective”, I’ll stop doing that.

changes in temperature will occur until the resulting change in OLR matches the initial perturbation, that is until
3) ΔQ=αΔT

How do you show that 3) is the criterion for equilibrium?

89. Tom Curtis says:

Vaughan, the equilibrium state we are talking about is the radiative equilibrium state. That is, it is the state where upwards(SWR+LWR) = downwards(SWR+LWR). We divide factors that affect that energy balance into two categories: those that (at a reasonable approximation for the timescale under consideration) are not affected by temperature, and those that are. The former we call forcings, and the latter we call feedbacks. Because feedbacks are governed by temperature, they can be modelled as a factor on temperature. Now, suppose from a state of equilibrium, we add a constant forcing, ΔQ. That alters the energy balance at the TOA so that net upward radiation =/= net downward radiation by definition, and equilibrium is restored when that energy balance is restored by definition. As we have divided the factors that affect the radiative balance into forcings and feedbacks, and are holding the change of forcing constant, radiative balance is only restored when the change to net radiation at the TOA due to feedbacks equals the change in forcing, ie,
ΔQ=αΔT.

You have to focus on the fact that αΔT is the change in the TOA energy balance due to a change in temperature at a given time, and all else follows.

90. @TC: You have to focus on the fact that αΔT is the change in the TOA energy balance due to a change in temperature at a given time, and all else follows.

Your explanation is the reverse of Anders’ back at August 2, 2014 at 9:01 pm. Anders argued that ΔF = 0 without mentioning α, from which one can infer ΔQ = αΔT. You’re inferring the former from the latter. I understand Anders’ direction but not the reverse. All the individual statements in your explanation are true, what I’m having difficulty with is which implies which.

I was fine with Anders’ explanation assuming one notion of ΔQ and ΔT, but complained at the time that when not in equilibrium they would need a different notion in order for the expression for the climate feedback parameter to make sense.

In the meantime I think I see a uniform interpretation of all three of ΔQ, ΔF, and ΔT so that they make sense even when the system is not in equilibrium. It requires that all three vary continuously with time. (For reasons I don’t understand people seem reluctant to do this, instead looking only at their before and after values.) Here’s my continuous version.

In Anders’ equilibrium-to-equilibrium passage, in the beginning (before time 0) all three of ΔQ, ΔF, and ΔT are zero. At time 0, ΔQ becomes positive and stays at that value thereafter, which is to say, a step function. (The step requirement can be weakened, see below.) At the same time ΔF steps up to the same value as ΔQ, but ΔT remains zero. But then ΔF starts to decline while ΔT starts to rise, maintaining the sum ΔF + αΔT = ΔQ at all times.

Eventually ΔF declines to zero, which we define as the new equilibrium. In the new equilibrium we now have ΔT = ΔQ/α, both nonzero, and ΔF = 0. Taking λ = 1/α is fine.

A weaker requirement would be that ΔQ rise to its final value over time, i.e. without having to step up to that value at time 0, but still retaining ΔF + αΔT = ΔQ throughout. ΔT would rise monotonically and continuously, but ΔF might not decline monotonically because each increase in ΔQ might push it back up. The possible failure of monotonicity of ΔF is meaningless for those explanations that consider only the beginning and the end but not the middle.

This is the first explanation I’ve seen that fills in enough details to feel like a physical explanation of what’s going on. However I worry that others will be bothered by my requirement that ΔF + αΔT = ΔQ throughout nonequilibrium, since that was not a requirement in the explanations to date. What I don’t see is how to dispense with the middle while remaining physical.
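The continuous picture described above can be sketched with a one-box model. This is only an illustration under stated assumptions: the heat capacity and the value of α are hypothetical, the box equation C d(ΔT)/dt = ΔQ − αΔT is the simplest closure that keeps ΔF + αΔT = ΔQ at every step, and forward Euler is used purely for brevity.

```python
ALPHA = 1.2      # hypothetical climate feedback parameter, W/m^2/K
DELTA_Q = 3.7    # step forcing for doubled CO2, W/m^2 (value quoted in the thread)
HEAT_CAP = 8.0   # hypothetical effective heat capacity, W yr/m^2/K
STEP = 0.01      # integration time step, years


def simulate(years=100.0):
    """Integrate C d(warming)/dt = DELTA_Q - ALPHA * warming with forward Euler.

    Returns a list of (time, warming, uptake) triples, where uptake plays the
    role of Delta F and warming the role of Delta T.
    """
    n_steps = int(round(years / STEP))
    warming = 0.0
    history = []
    for i in range(n_steps + 1):
        uptake = DELTA_Q - ALPHA * warming   # the budget holds at every step
        history.append((i * STEP, warming, uptake))
        warming += STEP * uptake / HEAT_CAP
    return history


history = simulate()
for t, warming, uptake in history[::2000]:
    print(f"t={t:6.1f} yr  dT={warming:.3f} K  dF={uptake:.3f} W/m^2")
```

At time 0 the uptake equals the full forcing and the warming is zero; the uptake then decays towards zero while the warming approaches ΔQ/α, which is the sequence of events described in the comment.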

91. The rationale for why I dispute the model is nothing out of the ordinary.

The main premise behind the over-simplified model is that there exists a first-order kinetics in the heat uptake of the ocean. First-order kinetics is essentially a damped exponential time constant describing the uptake of heat. The problem with this approach is that the physics is not first-order but is second-order and better described by a thermal diffusion model. The solution then is to use a formulation derived from a heat equation model whereby the second-order dynamics are retained.

The salient characteristic of a second-order model is that it will show a quick uptake of heat and then a gradual tail, which is where the fat-tail description comes in. The problem with trying to describe a fat-tail phenomenon with a single first-order time constant is: which part of the curve do you apply it to?

If you look at what Isaac Held is doing, which is described in a fresh post from yesterday
http://www.gfdl.noaa.gov/blog/isaac-held/2014/08/02/49-volcanoes-and-the-transient-climate-response-part-i/
he is trying to fit a response curve with two exponentials, one to get the fast response and another to do the slow response.

This really does not work that well because the quick uptake slope is tough to quantify with an exponential time constant. It is more hyperbolic than exponential is the way I would describe it.

In such a case the interpretation of the response curve is what adds all the uncertainty. So if one chooses to pick a certain part of the hyperbolic profile you can get whatever time constant you want and therefore extract any value of TCR from the dynamics that you want.

My problem may be that I come from the semiconductor physics world, where, when you apply the physics to materials processing, you have to characterize the behaviors precisely or you don’t have a repeatable process. For example, it makes no sense to use first-order kinetics when trying to describe dopant diffusion into a material or diffusional oxide growth. Even plain old Fick’s law is better than first-order kinetics.

Maybe I am just tainted and don’t see the broad simplifications going on here. But I do see players such as Nic Lewis trying to spin the models for all they are worth, while it is obvious that the land is warming much faster than a 1 C per doubling of CO2 transient rate.
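The contrast between first-order kinetics and diffusive uptake can be illustrated with two closed-form step responses. This is a sketch under assumptions that are not in the comment above: the diffusive case is taken to be the textbook solution for a semi-infinite column with a linear-feedback boundary condition, and both curves are written in terms of a dimensionless time x = t/τ, where τ is a different physical timescale in each model.

```python
import math


def first_order_response(x):
    """Fraction of equilibrium warming realised by a single-time-constant box."""
    return 1.0 - math.exp(-x)


def diffusive_response(x):
    """Surface response of a semi-infinite diffusive column at x = t/tau.

    Textbook half-space result: 1 - exp(x) * erfc(sqrt(x)). Only evaluated for
    moderate x here, since exp(x) overflows for very large arguments.
    """
    return 1.0 - math.exp(x) * math.erfc(math.sqrt(x))


for x in (0.01, 0.1, 1.0, 10.0, 25.0):
    print(f"t/tau={x:6.2f}  first-order={first_order_response(x):.4f}  "
          f"diffusive={diffusive_response(x):.4f}")
```

Under these assumptions the diffusive curve rises much faster at early times and then approaches equilibrium only as an inverse square root of time, so a single exponential fitted to one part of the curve will misrepresent the other part.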

92. VP,

On the contrary, the terms ΔQ and ΔT in the expression (ΔQ − ΔF)/ΔT for the climate feedback parameter α mean something different from the ones you’re using here to show that ΔF = 0. Otherwise ΔF would always be zero.

I don’t quite follow you here.

I’m a little worried that I’m missing something obvious here, but there is possibly an even more physical way to express this. The term $\Delta Q$ is the change in anthropogenic radiative forcing. Consider a system that starts in equilibrium, and then experiences an increase in anthropogenic radiative forcing of magnitude $\Delta Q$ and an increase in temperature of $\Delta T$. We’d also expect some kind of feedback response which we can call $\Delta Q_{feed} = W_{feed} \Delta T$.

The temperature change will produce an increase in outgoing flux of

$4 \epsilon \sigma T^3 \Delta T$,

where $\epsilon$ takes into account that we’re using surface temperature and only a certain fraction escapes (about 60 %). The energy balance equation that we can now write is

$\Delta F = \Delta Q - 4 \epsilon \sigma T^3 \Delta T + W_{feed} \Delta T$

which we can simplify to

$\Delta F = \Delta Q - (4 \epsilon \sigma T^3 - W_{feed}) \Delta T = \Delta Q - \alpha \Delta T$

In equilibrium the two terms on the right hand side should balance (the temperature should have risen to balance the change in anthropogenic forcing and the resulting feedbacks). Out of equilibrium the difference between the two terms on the right hand side represents the system heat uptake rate at whatever time is being considered.
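To put rough numbers on the expression for α, here is a minimal sketch. The surface temperature of 288 K and the value of the non-Planck feedback term are assumptions chosen purely for illustration; only the emissivity of about 0.6 and the 3.7 W/m² forcing come from the discussion.

```python
SIGMA_SB = 5.670374e-8   # Stefan-Boltzmann constant, W/m^2/K^4
EPSILON = 0.6            # effective emissivity (the "about 60 %" above)
T_SURF = 288.0           # assumed global mean surface temperature, K
W_FEED = 2.0             # hypothetical net non-Planck feedback, W/m^2/K
DELTA_Q = 3.7            # forcing for doubled CO2, W/m^2


def planck_term(temp=T_SURF, eps=EPSILON):
    """Increase in outgoing flux per kelvin of warming: 4 * eps * sigma * T^3."""
    return 4.0 * eps * SIGMA_SB * temp**3


def feedback_parameter(w_feed=W_FEED):
    """alpha = Planck term minus the other feedbacks, in W/m^2/K."""
    return planck_term() - w_feed


def heat_uptake(delta_q, delta_t, alpha):
    """System heat uptake rate: Delta F = Delta Q - alpha * Delta T."""
    return delta_q - alpha * delta_t


alpha = feedback_parameter()
print(f"Planck term = {planck_term():.2f} W/m^2/K, alpha = {alpha:.2f} W/m^2/K")
print(f"uptake just after the step: {heat_uptake(DELTA_Q, 0.0, alpha):.2f} W/m^2")
print(f"uptake at dT = dQ/alpha:    {heat_uptake(DELTA_Q, DELTA_Q / alpha, alpha):.2f} W/m^2")
```

With these illustrative inputs the Planck term is a little over 3 W/m²/K, and the heat uptake falls from the full forcing to zero as the warming approaches ΔQ/α.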

As far as Roger Pielke Sr goes, in this Climate Etc post he assumes that $\Delta F$ is the average rate at which the system has accrued energy, rather than the current rate at which it is accruing energy.

93. Tom Curtis says:

Vaughan, I am unsure from where the confusion arises. We have only two equations essential to resolving this issue:

a) ΔF=ΔQ-αΔT
which is derived straightforwardly from the IPCC definition, and
b) ΔQ=αΔT
which holds at equilibrium by definition of radiative equilibrium. Specifically, at equilibrium the perturbations in net downward radiative flux at the TOA due to factors not controlled by temperature (ΔQ) exactly balance the perturbations to net downward radiative flux at the TOA due to reaching the equilibrium temperature (αΔT). Equation (b) can obtain temporarily in non-equilibrium conditions due to the different relaxation rates of different feedbacks, but must obtain at radiative equilibrium by definition of radiative equilibrium. From (a) and (b) it follows that at radiative equilibrium, ΔF=0. From that and the definition of the climate sensitivity parameter λ;
c) λ= ΔT/ΔQ
we then have that in radiative equilibrium λ=1/α

If that is not enough, I can only refer you to Gregory et al (2002) who expound the maths albeit using λ for α, and using ocean heat flux rather than total system heat flux as I do.
The difference can approximately be corrected by scaling α. In quoting Gregory et al, I have substituted α for λ for consistency in symbol use with the above discussion.

“GCMs indicate that the increase in global-average outgoing radiative flux when the climate is perturbed from a steady state is proportional to the global-average surface temperature change ΔT. During time-dependent climate change, the imbalance between the imposed radiative forcing Q and the radiative response αΔT, α being a constant, is absorbed by the heat capacity of the system, which resides overwhelmingly in the ocean (Levitus et al., 2001). Hence
F(t) = Q(t) – αΔT(t) (1)
where t is time and F is the heat flux into the ocean. Equation 1 has often been employed as the basis for energy-balance climate models.

In the unperturbed steady-state climate, Q = F = 0 and ΔT = 0. If Q is raised from zero to some positive value, F becomes positive, additional heat is stored in the ocean, and ΔT rises. If Q then remains constant, F returns to zero over time, as the climate approaches a new steady state in which ΔT = Q/α. From its definition, the equilibrium climate sensitivity ΔT2x = Q2x/α, where Q2x is the forcing that results from a doubling of the CO2 concentration.

Although it is defined in terms of a steady-state climate, the climate sensitivity can be estimated from any climate state. Provided we know F, Q and T, we can calculate α from Equation 1 and hence ΔT2x (e.g. Cubasch et al., 2001).”

Cubasch et al is IPCC TAR Chapter 9, the relevant section being Section 9.2.1.
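The last step of the quoted passage, calculating α from Equation 1 and hence ΔT2x, amounts to two lines of arithmetic. The sketch below uses hypothetical values for Q, F and ΔT, chosen only to show the bookkeeping; they are not taken from Gregory et al or any other paper.

```python
Q2X = 3.7   # forcing for doubled CO2, W/m^2 (value quoted earlier in the thread)


def feedback_from_budget(forcing, heat_flux, warming):
    """Solve F = Q - alpha * dT for alpha, in W/m^2/K."""
    if warming == 0:
        raise ValueError("warming must be nonzero to estimate alpha")
    return (forcing - heat_flux) / warming


def sensitivity_from_feedback(alpha, q2x=Q2X):
    """Sensitivity implied by alpha: dT2x = Q2x / alpha, in K."""
    return q2x / alpha


# Hypothetical inputs: Q = 2.0 W/m^2, F = 0.6 W/m^2, dT = 0.8 K.
alpha = feedback_from_budget(forcing=2.0, heat_flux=0.6, warming=0.8)
print(f"alpha = {alpha:.2f} W/m^2/K, dT2x = {sensitivity_from_feedback(alpha):.2f} K")
```

Because the estimate divides by α, a small uncertainty in Q − F when that difference is close to zero turns into a large uncertainty in the sensitivity.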

94. Tom Curtis says:

WHT, thanks, that is clearer and certainly makes sense. Against that argument Lewis can always argue that he was just demonstrating the difference made by a Jeffreys’ prior in somebody else’s model. Instead, I suspect, he will want to take the climate sensitivity range he finds very seriously and probably simply ignore your point.

95. Tom,

Instead, I suspect, he will want to take the climate sensitivity range he finds very seriously and probably simply ignore your point.

96. Tom, thanks very much for the pointer to Gregory et al 2002, which adds to your account precisely what I was asking for. I’d downloaded it back in 2010 (because it had “observational” in the title when I was first getting interested in observationally based climate sensitivity) but in the meantime had forgotten all about it.

The part

Hence
F(t) = Q(t) – αΔT(t) (1)
where t is time and F is the heat flux into the ocean. Equation 1 has often been employed as the basis for energy-balance climate models.
In the unperturbed steady-state climate, Q = F = 0 and ΔT = 0. If Q is raised from zero to some positive value, F becomes positive, additional heat is stored in the ocean, and ΔT rises. If Q then remains constant, F returns to zero over time, as the climate approaches a new steady state in which ΔT = Q/α. From its definition, the equilibrium climate sensitivity ΔT2x = Q2x/α, where Q2x is the forcing that results from a doubling of the CO2 concentration.

is practically word for word what I said was missing from both Anders’ explanation and yours. Their notation F(t), Q(t), αΔT(t) is what I meant by “all three vary continuously with time”, and their description of how the three evolve to the next equilibrium state agrees with what I said I wanted in all details. (I’d also been wondering how models could use ΔF = ΔQ – αΔT without having it hold throughout the modelling so it’s a relief to know that that’s what they do.)

One difference from the IPCC’s version of the equation is that Gregory et al omit Δ on F and Q. However if one takes ΔF(t) to mean F(t) − F(0) and similarly for Q and T, their assumption of F(0) = Q(0) = 0 makes ΔF(t) = F(t) and ΔQ(t) = Q(t), so this is no difference at all. I’d considered dropping some Δ’s myself in what I wrote above but decided that Q(0) = 0 was unphysical (what if one wanted to compose two such warming scenarios one after the other?) and that it was preferable to leave Δ on Q.

F(0) = 0 on the other hand is fine since we’re assuming equilibrium at the start and end of any such warming scenario, so leaving Δ off F is ok. (You were claiming that F = 0 at equilibrium could be derived from the formula but I couldn’t see how given that F = 0 seemed to be part of the definition of equilibrium; moreover Anders didn’t make that claim.)

Leaving Δ on F does however have the advantage of allowing the equation to be written as

Δ(F + αT – Q) = 0

meaning that F + αT – Q is invariant, i.e. has the same fixed value F(t0) + αT(t0) – Q(t0) throughout where the fixed reference time t0 can be taken to be 0 or any other time including during disequilibrium. But while mathematically correct this is physically objectionable because αT(t0) is unphysical: the assumption of linearity of the feedback is only sound for small perturbations of T. So from the standpoint of physical intuition it may be preferable not to mess with ΔT like that.

Hopefully this puts us on the same page regarding how to interpret the definition α = (ΔQ − ΔF)/ΔT of the climate feedback parameter. My apologies for taking longer than reasonable to formulate my concerns precisely.

97. VP,

is practically word for word what I said was missing from both Anders’ explanation and yours.

Other than not being precisely the same words, I’m not sure how this differs from what I was saying. I think it’s pretty much exactly what I was saying.

In an earlier comment, you said,

I don’t see why ΔF should suddenly vanish at equilibrium. A change in forcing should result in a change in ocean energy flux, whether at equilibrium or not.

Do you agree that this is wrong or, if not wrong, rather confused? I guess $\Delta F$ could refer to the difference between the system heat uptake rate when the system is in equilibrium and the rate at some earlier time when it wasn’t, but I’d assumed that we were always working from an initial time when the system was in equilibrium.

Also,

(I’d also been wondering how models could use ΔF = ΔQ – αΔT without having it hold throughout the modelling so it’s a relief to know that that’s what they do.)

Yes, I thought I’d pointed this out many comments ago and was surprised by your suggestion that this wasn’t the case.

98. @ATTP: I don’t quite follow you here.

The two meanings for ΔX(t) that I’d been considering were X(t) − X(0) and X(t) − X(t − 1) (t in units of years say), for X = Q, T, and F. The second meaning has the advantage of not requiring starting in equilibrium, but implies that ΔQ and ΔT both tend to zero in the approach to equilibrium, without however compromising the definition of α AFAICT. However no one else interprets Δ that way in this context, so I’m happy to forget about the second meaning and stick to the first.

Your notion of W_feed is what I’d call all the non-Planck climate feedbacks, since you added (or subtracted) the Planck feedback to obtain what you identified with the climate feedback parameter α. I don’t have any problem with that.

As far as Roger Pielke Sr goes, in this Climate Etc post

Thanks for the pointer. I haven’t been keeping up with CE lately.

he assumes that ΔF is the average rate at which the system has accrued energy, rather than the current rate at which it is accruing energy.

I don’t see the problem. If I were tracking the current rate at which “the system” (which everyone seems to agree means in practice “the ocean” at least for recent climate) is accruing energy I’d average it annually too. How could that introduce a significant error?

99. VP,

I don’t see the problem. If I were tracking the current rate at which “the system” (which everyone seems to agree means in practice “the ocean” at least for recent climate) is accruing energy I’d average it annually too.

No, he didn’t determine an annual average – which would be fine – he determined the average rate over the entire time interval (i.e., the average from t = 0, till t = t). In truth, this doesn’t introduce a huge error since – for the time interval he considered – the average rate over the entire time interval was only about a factor of 2 smaller than the current rate, but it’s still not technically correct.

100. @ATTP: he determined the average rate over the entire time interval

Aha, you’re right. They say “the difference in ocean heat content at two different time periods largely accounts for the global average radiative imbalance over that time.” Yes, using the endpoints will indeed give the average flux over the whole period. I presume their interest in that method is that heat increase over the period in question is large enough to be estimated to better precision than the methods they’re complaining about.

Although they talk about several time periods it looks like the one they’re primarily interested in is 1955 to 2010, is that your understanding?

the average rate over the entire time interval was only about a factor of 2 smaller than the current rate

That would make sense if the flux ramped up linearly from zero at the start of the interval of interest to its current value. Did it?

Given a rough idea of the shape of the ramp in F over that period, could one infer F(t) from its average? If so perhaps there is an appropriate correction to their result yielding a more accurate answer.

101. VP,

Although they talk about several time periods it looks like the one they’re primarily interested in is 1955 to 2010, is that your understanding?

Well, I think that is because the OHC data only really exists from about 1955.

That would make sense if the flux ramped up linearly from zero at the start of the interval of interest to its current value. Did it?

I just meant that it happened to differ by about a factor of 2 (Pielke Sr calculated 0.27 W/m^2 when it’s probably more like 0.5 – 0.6 W/m^2). It’s just coincidental, I think.

Given a rough idea of the shape of the ramp in F over that period, could one infer F(t) from its average? If so perhaps there is an appropriate correction to their result yielding a more accurate answer.

I don’t think you can infer the current system heat uptake rate from some average over a much longer time interval. Initially the average might be lower than the current value, but as the system approaches equilibrium, the average would have to exceed the current value (i.e., the current value would tend to zero, while the average would not). The correction to their result would seem to be simply doing it properly.
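The difference between the average uptake rate over an interval and the current rate can be made concrete with two hypothetical flux histories. Neither is meant to represent the actual ocean heat content record: the linear ramp and the exponential relaxation, and all the numbers in them, are assumptions for illustration only.

```python
import math


def average_and_current(flux, t_end, n=10000):
    """Return (time-average of flux over [0, t_end], flux at t_end).

    The average is computed with the midpoint rule on n equal sub-intervals.
    """
    h = t_end / n
    mean = sum(flux((i + 0.5) * h) for i in range(n)) * h / t_end
    return mean, flux(t_end)


def ramp(t):
    """Hypothetical uptake rising linearly from 0 to 0.6 W/m^2 over 55 years."""
    return 0.6 * t / 55.0


def relaxation(t):
    """Hypothetical uptake decaying from 3.7 W/m^2 with a 20-year time constant."""
    return 3.7 * math.exp(-t / 20.0)


ramp_mean, ramp_now = average_and_current(ramp, 55.0)
relax_mean, relax_now = average_and_current(relaxation, 200.0)
print(f"ramp:       mean={ramp_mean:.3f}  current={ramp_now:.3f} W/m^2")
print(f"relaxation: mean={relax_mean:.4f}  current={relax_now:.6f} W/m^2")
```

For the ramp the average is exactly half the current value, while for the relaxation towards equilibrium the average is far larger than the current value, so no single correction factor recovers the current rate without knowing the shape of the history.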

102. The correction to their result would seem to be simply doing it properly.

I thought the point of their post was that there wasn’t a “proper” method. Are you saying their post is a straw man and that the existing approach(es) already achieve the accuracy they’re aiming for?

103. Tom Curtis says:

Vaughan, you say that F + αT – Q is invariant. That is correct by definition, but “α” is defined for effective climate sensitivity rather than equilibrium climate sensitivity only because α is not actually a constant. Specifically, if you determine α for T1 to T3 and for T2 to T3 where T1=/=T2, you may well get different values for α. If your measurements were precise enough (which they currently aren’t), the difference in values could well be statistically significant. Further, even in the case where you have equilibrium at T1, T2, and T3, so that α=1/λ, they may still vary if either the time intervals are very large or the difference in change in forcing between T1 and T3 and T2 and T3 is very large. In the first case that is because λ varies slightly with different continental and ice sheet configurations, and in the second case λ varies because the change in TOA flux is linear in temperature only approximately for small changes.

That is, we can treat α and λ as constants only to a reasonable approximation, and in that approximation α is known to equal 1/λ only in the case where we compare transitions between equilibrium states (see the comment by Fred Moolten above). Currently, estimates of α between time periods in the 19th-21st centuries are likely to underestimate λ by some small amount, although they may exceed it for some particular periods. This is a nuance that Gregory et al appear not to notice, although Otto et al (2013), which updated Gregory et al, did note the potential of this effect without quantifying it.

104. @ATTP: Other than not being precisely the same words, I’m not sure how this differs from what I was saying. I think it’s pretty much exactly what I was saying.

I can certainly imagine it was what you were thinking, and perhaps didn’t say only because it went without saying, or because Gregory et al said it. If you actually said something that implied Gregory et al’s “If Q is raised from zero to some positive value, F becomes positive, additional heat is stored in the ocean, and ΔT rises. If Q then remains constant, F returns to zero over time, as the climate approaches a new steady state” then my apologies for overlooking or forgetting it. All I can recall seeing from either you or Tom were statements about what happened at the endpoints 0 and t.

Do you agree that [ΔF could be nonzero in equilibrium] is wrong

Yes. With the definition ΔX(t) = X(t) – X(t-1) the answer would have been no, but I’m happy to sign on to the ΔX(t) = X(t) – X(0) point of view to avoid talking at cross purposes.

105. @TC: Specifically, if you determine α for T1 to T3 and for T2 to T3 where T1=/=T2, you may well get different values for α.

Yes, that’s perhaps a clearer way of making my point, “But while mathematically correct this is physically objectionable because αT(t0) is unphysical: the assumption of linearity of the feedback is only sound for small perturbations of T.”

As Anders pointed out, the Planck feedback is part of α and varies as the cube of T, in which case you will definitely get different α’s for T1 to T3 vs. T2 to T3 (unless the non-Planck component miraculously compensates).

This objection should hopefully not arise for the form Δ(Q − F) = αΔT.

106. David Young says:

What I find interesting is the dancing around the elephant in the room, rather like arguing about which critique of Vertebraeplasty is correct. Whether you believe Lewis or Annan and Hargreaves, who were saying this for a long time and seemed to have some trouble getting any traction, or the best statisticians, uniform priors are usually a bad idea. And this showed up it seems surprisingly frequently in climate science. I believe AR4 even recalculated an ECS estimate of Forster and Gregory using a uniform prior and the result was a higher ECS, more “in line” with other lines of evidence.

Further, it seems that a lot of top statisticians knew this all along. So why weren’t they involved? I know, lots of scientists are good “amateur” statisticians. Well, if you have enough confidence in your skills to publish results based on them, you should be confident enough to humbly ask some statisticians whether your methods might be controversial, and maybe mention that in your paper.

107. Tom Curtis says:

Vaughan, with regard to Pielke Snr’s dog’s breakfast at Climate Etc, they are certainly not checking the accuracy of the method (despite their pretensions) because they do not bother comparing the mean value of ΔGAARI with the uncertainty interval for their estimate from ΔGAARF and ΔGAARFB. That is aside from the issue of their using the GAARI as determined from the ocean heat content over the full interval rather than taking the difference between the two end values (or decadal averages as a reasonable approximation).

108. Tom Curtis says:

Vaughan:

“Yes, that’s perhaps a clearer way of making my point, “But while mathematically correct this is physically objectionable because αT(t0) is unphysical: the assumption of linearity of the feedback is only sound for small perturbations of T.”

Actually I was making two points. The second one was indeed the point you made. The first one follows from the fact that not all feedbacks respond at the same rate, and means that depending on the forcing and temperature history, the Effective Climate Sensitivity may not equal the Equilibrium Climate Sensitivity. This is mentioned in the IPCC definition, and also in Otto et al (2013).

109. Tom Curtis says:

David Young:
1) If we want a minimally informative prior, and wish to interpret the result of our calculation as a Probability Density Function, the uniform prior between set intervals is the correct one. In the case where our only prior information is that temperatures rise with increased forcing and the Earth has not suffered a warm temperature feedback catastrophe, the prior should be uniform across all values from 0 to some large number such that larger numbers imply a very high risk of a warm feedback catastrophe. Only such a prior expresses ignorance about the value of the ECS within the range.

2) The use of Jeffreys’ prior does not express prior probability in the absence of information. If it did, it would express a belief that, absent all other information, the most probable ECS = 0, i.e., that knowing nothing else the most reasonable estimate is that increasing the flow (or retention) of energy in a system will not affect its temperature. Rather, Jeffreys’ prior represents a function of how much information can be gained from possible outcomes of experiments. Because Jeffreys’ prior does not express a prior probability, the use of Jeffreys’ prior in Bayes’ theorem does not produce a probability density function. Believing that it does so represents a fallacy of equivocation. In interpreting the outcome of using Jeffreys’ prior in Bayes’ theorem, we should interpret low values as indicating either low probability, or low information content of data indicating that result, or some combination of the two.

In situations where the information content of different data outcomes is regular and approximately equal across all reasonable outcomes, Jeffreys’ prior is sufficiently well behaved that the outcome can be interpreted as a PDF without serious error, and the outcome of applying Jeffreys’ prior to a large amount of data will be a very close approximation of the outcome of a true PDF. In situations where the information content of particular outcomes varies greatly, as in Lewis’s C14 dating example, and estimates of climate sensitivity, that is not guaranteed and the outcome should not be interpreted as a PDF.

3) I agree that expert priors as suggested by Annan and Hargreaves are far superior to the minimally informative prior used by the IPCC. However, the uniform prior is a more conservative approach. It avoids overstating certainty, and also avoids any issues with confirmation bias (a potential problem in the use of expert priors). Further, as the IPCC is supposed to reflect the consensus view, if the majority of the climate sensitivity experts are not convinced about the suitability of expert priors, the use of a uniform prior (as the more conservative approach) is justified providing we are clear the resulting estimates of climate sensitivity are conservative estimates, and may differ from best estimates.

4) Whatever prior the IPCC uses, it should recalculate all observational estimates using the same prior so that we do not attribute to differences in data differences in climate sensitivity estimates that arise from choices of prior. There is, however, something to be said for recalculating all estimates using both the conservative (uniform) prior and an expert prior so that the difference in choice of prior is more visible, and so that policy makers can decide whether to use conservative or “best” estimates for themselves.
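To make the dependence on the prior concrete, here is a toy sketch and not the calculation from Frame et al. (2005) or from Lewis's paper. It assumes a single Gaussian "observation" of the feedback parameter λ = F2x/S, with made-up values for the central estimate and its uncertainty. For a Gaussian likelihood with fixed width, Jeffreys' prior is uniform in λ, which corresponds to a prior proportional to 1/S² in S. All names and numbers are assumptions.

```python
import numpy as np

F2X = 3.7        # forcing for doubled CO2, W m^-2 (commonly quoted value)
LAM_OBS = 1.2    # hypothetical observed feedback parameter, W m^-2 K^-1
LAM_SD = 0.4     # hypothetical 1-sigma uncertainty on that observation

# Uniform grid in sensitivity S (K), capped at 20 K as in the uniform prior.
S = np.linspace(0.05, 20.0, 4000)

# Likelihood of the observation as a function of S (Gaussian in lambda).
likelihood = np.exp(-0.5 * ((F2X / S - LAM_OBS) / LAM_SD) ** 2)

# Unnormalised posteriors under the two priors.
post_uniform_S = likelihood * 1.0             # prior uniform in S
post_uniform_lam = likelihood * F2X / S**2    # prior uniform in lambda


def credible_interval(grid, density, lo=0.05, hi=0.95):
    """5-95% interval from a density tabulated on a uniform grid."""
    cdf = np.cumsum(density)
    cdf = cdf / cdf[-1]
    return np.interp([lo, hi], cdf, grid)


u_lo, u_hi = credible_interval(S, post_uniform_S)
j_lo, j_hi = credible_interval(S, post_uniform_lam)

print(f"uniform in S:      {u_lo:.1f} to {u_hi:.1f} K")
print(f"uniform in lambda: {j_lo:.1f} to {j_hi:.1f} K")
```

The likelihood is identical in both cases; only the prior differs, and the upper bound of the range moves substantially. This is why recalculating all estimates with the same prior, as suggested in point 4, makes differences between data sets easier to see.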

110. Perhaps, Tom, but let’s not forget the elephant David wants to peddle into the room.

111. Tom Curtis says:

willard, what I am not forgetting is that David wants to peddle a trunk, and call it an elephant. Me, I’d rather keep an eye on the whole beast.

112. Indeed we can, Tom. To that effect, it’s always wise to present David with a guy who has a good understanding of stats:

Statistical inference procedures are ultimately justified as mathematical and computational formalizations of common sense reasoning. We use them because unaided common sense tends to make errors, or have difficulty in processing large amounts of information, just as we use formal methods for doing arithmetic because guessing numbers by eye or counting on our fingers is error prone, and is anyway infeasible for large numbers. So the ultimate way of judging the validity of statistical methods is to apply them in relatively simple contexts (such as this) and check whether the results stand up to well-considered common sense scrutiny. In this example, Jeffreys’ prior fails this test spectacularly.

My emphases. Radford Neal is this guy:

Nic should have paid due diligence to his posterior before Neal’s kick.

113. Fred Moolten says:

As noted in some of the preceding discussion, an important issue in climate sensitivity estimates is the time-varying nature of climate responses to an imposed perturbation (e.g., greenhouse gas forcing). This bedevils attempts to estimate “equilibrium” sensitivity (ECS) from estimates of the feedback parameter α calculated from data obtained under non-equilibrium conditions using energy balance models such as the one discussed in Nic Lewis’s paper. Most GCMs that attempt to evaluate this time variation limit their analyses to what are generally considered short term (“Charney”) feedbacks and their secondary effects on ocean and atmospheric circulation patterns. It is generally assumed that additional very long term feedbacks involving ice sheets, dust, vegetation, and the carbon cycle – all largely excluded from these standard ECS estimates – also contribute to equilibrium temperature responses, and so the actual responses would be substantially greater than what is conventionally subsumed within the definition of ECS. When these additional responses are added in, typical estimates of equilibrium temperature change tend to range around a value of about 6 C for doubled CO2. These estimates of so-called “earth system sensitivity” are judged irrelevant to contemporary concerns because they involve changes on millennial rather than decadal or centennial scales of more practical interest.

Recently, however, a number of authors have suggested that the dichotomy between short term and long term responses is artificial, and that the time scale of climate responses should be considered a continuum. In this regard, a recent paper in J. Climate is one I found quite interesting. I cite it here for its relevance to the discussion, but it might even warrant a new post of its own. The link is Rypdal and Rypdal, July 2014. The main point of the paper is that the existence of a long term memory of previous forcing, embodied in the dynamics of ocean heat exchange between the upper and deeper oceans, strongly influences the temperature trajectory any new forcing will impose. Thus the residual effects of forcing from recent centuries are predicted to enhance any new responses to CO2 increases of the kind evaluated in TCR estimates. An example is a TCR estimate based on forcing history up to about 1880 that yields values in the range of 1.0 to 1.5 C, whereas the same TCR starting in 2010 is estimated at 2.1 C. The paper should be consulted for the mathematical details and input data.

114. VP,

I thought the point of their post was that there wasn’t a “proper” method. Are you saying their post is a straw man and that the existing approach(es) already achieve the accuracy they’re aiming for?

No, I just meant that if you are going to follow an energy balance model approach, then do it properly. Pielke Sr. could quite easily have calculated the change in system heat uptake rate, rather than the average over the entire time interval. In truth, it may not have been that easy as he would have had to estimate the rate in 1955, but he could probably have come up with a sensible number.

Yes. With the definition ΔX(t) = X(t) – X(t-1) the answer would have been no, but I’m happy to sign on to the ΔX(t) = X(t) – X(0) point of view to avoid talking at cross purposes.

Okay, I’ll acknowledge that I hadn’t realised that you were using $\Delta X(t) = X(t) - X(t-1)$, rather than $\Delta X(t) = X(t) - X(0)$. Given this, I now understand more what you were suggesting.

115. Tom Curtis says:

Vaughan, further to my post @11:52PM, you may want to look up Armour et al (2013). They start by defining λ(eq) (= -λ in the notation we have used above), and λ(eff) (= -1/α in the notation above). In doing so they elucidate the relative relationships far more clearly than any other single source I have come across. (I am beginning to suspect, however, that if you get three climatologists in a bar, they will have four distinct notations for treating climate sensitivity.)

More importantly, they then go on to discuss the time variance of λ(eff) based on a particular model run, i.e., the issue I discuss in my post at 11:52PM. Figure 1 is of particular interest, showing that the effective climate sensitivity to a doubled CO2 varies from 6-16% below the equilibrium climate sensitivity, and does not rise above 6% below even 300 years after the initial doubling of CO2. These results are for a single model run of a single model, and so we cannot generalize that 6% figure; but it does show the effect to be a relevant factor, and that simple energy balance calculations of the EFS underestimate ECS by enough that this is likely to affect the reported results.

116. Richard S.J. Tol says:

@Wotts
Degeneracy means that, at first sight, you seem to have N-dimensional data whereas, on closer inspection, you have only N-M dimensions (because there are relationships between the data).

If an N-dimensional Jeffreys’ Prior is applied to N-M-dimensional data, things go horribly wrong. You can’t hold that against Jeffreys though.

117. Richard,
I know what degeneracy means in general, I was meaning in this specific example. Also, to be clear, I’m not holding anything against Jeffreys’ prior specifically. I’m sure it’s a perfectly nice prior, it just doesn’t appear to always be appropriate or optimal.

118. Eli Rabett says:

Wrt the two exponential model that WEB mentions above, there is a superior model called the stretched exponential which is a much more concise and physically understandable description of long tailed distributions such as relaxation in glasses.

119. Given a choice between priors, my choice would go to Richard. The only problem is that this prior is spelled with a y. If I keep to i’s, perhaps I’d pick Arthur, someone Vaughan might have a chance to know:

http://en.m.wikipedia.org/wiki/Arthur_Prior

Covariance is a nice property to have, Richard. It’s just not worth sacrificing common sense. It’s the posterior that matters anyway.

120. David Young says:

Tom, I do agree with you that the IPCC should calculate these things with at least a couple of priors to show the sensitivity of the results to these “preconceptions.” That’s a sign I look for in a paper that gives it credibility, viz., does it look at how sensitive the result is to methodological changes. Sometimes, fluid dynamics simulations are quite sensitive to these details and that tells you the result is not very reliable.

121. Marco says:

Am I far off if I summarise this thread as:
“Nic Lewis publishes paper that criticises an old paper (and AR4) for doing something he then does himself – cherry-picking a Bayesian prior without showing it is appropriate” ?

122. Marco,
Seems reasonably fair 🙂

I will add that Jeffreys’ prior doesn’t produce unreasonable results, there just seems to be no real argument as to why it is a prior that should be used in preference to more expert priors.

123. My preference is to leave out the PDF totally and present the results stating very clearly that they represent likelihoods, i.e. they tell, in relative terms, how strongly the empirical data favors certain values over others. The curves of the summarizing graphs of IPCC do exactly that (the bars cannot be interpreted in this way).

When the results are presented against another variable on the x-axis, the likelihoods are not changed. Thus the visual impression can be changed by a non-linear transformation of the x-axis. When the graphic is interpreted as a PDF, each choice of x-axis corresponds to some prior. The IPCC figure corresponds to a uniform prior in S; an axis on which the spacing of successive values of S gets smaller as S grows corresponds to a cutoff at high S.

What the empirical data tells is strictly the likelihoods, going beyond brings in other arguments.
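The distinction between likelihoods and PDFs drawn here can be written out explicitly. What follows is a standard change-of-variables sketch rather than anything taken from the papers discussed; the symbols $L$, $\pi$ and the transformed axis variable $y = g(S)$ are notation introduced only for this illustration.

$$L(S) = p(D \mid S), \qquad \tilde{L}(y) = L\big(g^{-1}(y)\big)$$

$$p(S \mid D) \propto L(S)\,\pi_S(S), \qquad p(y \mid D) = p(S \mid D)\,\left|\frac{dS}{dy}\right|$$

$$\pi_y(y) = \text{const} \quad\Longleftrightarrow\quad \pi_S(S) \propto \left|g'(S)\right|$$

The first line says that relabelling the axis leaves the likelihood values unchanged. The second says that a density picks up a Jacobian factor. The third says that reading a curve drawn against $y$ as a PDF amounts to assuming a prior that is uniform in $y$. As an example, plotting against $y = 1/S$ (proportional to the feedback parameter) and reading the curve as a PDF corresponds to $\pi_S(S) \propto 1/S^2$, which suppresses high values of $S$.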

124. Pekka,

125. Some more points about the curves.

When a group of curves is based on the same empirical data, they should be interpreted as alternative analyses. If one of the analyses is superior in quality, the others can be dismissed, in other cases the best estimate might be some kind of average.

When curves are based on totally independent data, the values can be multiplied and the resulting curve is likely to give more restrictive evidence on the value of S. This is exactly what Bayes tells us about correctly combining independent evidence.
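A small numerical sketch of multiplying independent likelihood curves, with two entirely made-up curves: one with a long upper tail (Gaussian in the feedback parameter, loosely mimicking an instrumental estimate) and one roughly lognormal in S. The shapes and parameters are assumptions chosen only to illustrate the multiplication.

```python
import numpy as np

# Uniform grid of climate sensitivity values S (K).
S = np.linspace(0.1, 20.0, 2000)

# Hypothetical likelihood A: Gaussian in lambda = 3.7 / S, long tail in S.
like_a = np.exp(-0.5 * ((3.7 / S - 1.2) / 0.4) ** 2)

# Hypothetical likelihood B: lognormal-shaped in S, from independent data.
like_b = np.exp(-0.5 * (np.log(S / 3.0) / 0.5) ** 2)

# Independent evidence combines by pointwise multiplication.
combined = like_a * like_b


def relative(curve):
    """Scale a likelihood curve so that its maximum is 1."""
    return curve / curve.max()


# Compare how strongly each curve disfavours S = 10 K relative to its peak.
idx = np.argmin(np.abs(S - 10.0))
rel_a = relative(like_a)[idx]
rel_b = relative(like_b)[idx]
rel_c = relative(combined)[idx]

print(f"relative likelihood at S = 10 K: A={rel_a:.3f}, B={rel_b:.3f}, "
      f"combined={rel_c:.4f}")
```

Neither curve alone rules out S = 10 K very strongly in this toy setup, but their product does so much more firmly, which is the sense in which independent evidence gives a more restrictive curve.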

Curves with a very long upper tail indicate that the particular method cannot restrict the range of S from above. It does not tell that high values are likely, it just cannot exclude them. This is the case for part of the instrumental estimates of ECS.

What I do not like at all is that the y-axis is labeled “Probability / Relative Frequency” as none of the empirical methods can tell that.

126. Richard S.J. Tol says:

@Wotts
Jeffreys’ Prior is not appropriate or optimal. It is not intended to be. It is non-informative, in a precisely defined way.

127. It’s non-informative in a precisely defined way, when the measure of the space is given. For a different measure a different Jeffreys’ prior is obtained. The dependence on the measure is totally essential, much of the disagreement on the prior can be translated to a disagreement on the measure. Therefore no single Jeffreys’ prior is of significance without good arguments to tell that the measure used is appropriate for the application.

128. Richard,

Jeffreys’ Prior is not appropriate or optimal.

Maybe you should point that out to Nic Lewis. He seems to think that it is.