** Update (02/04/2019):** *I’ve updated this in a new post. The updated result suggests a slightly lower climate sensitivity and a narrower range. The main difference is – I think – how I was handling the forcing uncertainty. In this post, I was simply using some fraction of the total forcing, while a more appropriate thing to do is to use the aerosol forcing, which is what I’ve done in the updated analysis.*

I’ve been spending some time working on a Bayesian estimate for climate sensitivity. This is somewhat preliminary and a bit simplistic, but I thought I would post what I’ve done. Essentially, I’ve used the Markov Chain Monte Carlo method to fit a simple climate model to both the surface temperature data and to the ocean heat content data.

Specifically, I’m using a simple two-box model which can be written as

$$C_1 \frac{dT_1}{dt} = F - \lambda T_1 - \gamma \left( T_1 - T_2 \right) + \epsilon,$$

$$C_2 \frac{dT_2}{dt} = \gamma \left( T_1 - T_2 \right).$$

In the above, $C_1$ is the heat capacity of the upper box (ocean mixed layer and atmosphere), $T_1$ is the temperature of this box, $C_2$ is the heat capacity of the lower box (deep ocean), and $T_2$ is this box’s temperature. The $\lambda$ term is essentially the climate sensitivity, $\gamma$ determines the exchange of energy between the two boxes, and $\epsilon$ is a noise term that I’ve added.
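For concreteness, a minimal forward-Euler integration of this two-box model might look like the sketch below. The parameter values are only roughly consistent with the fit described in the post, and the linear forcing ramp is made up; neither is the actual data or fitted result.

```python
import numpy as np

def two_box(forcing, dt=1.0, C1=11.0, C2=170.0, lam=1.4, gamma=0.52,
            T2_init=0.0, noise_sd=0.0, rng=None):
    """Forward-Euler integration of the two-box energy-balance model:

        C1 dT1/dt = F - lam*T1 - gamma*(T1 - T2) + eps
        C2 dT2/dt = gamma*(T1 - T2)

    Heat capacities in W yr m^-2 K^-1, forcing in W m^-2,
    temperatures in K, dt in years. noise_sd=0 gives the
    deterministic (purely forced) response.
    """
    rng = rng or np.random.default_rng(0)
    n = len(forcing)
    T1 = np.zeros(n)
    T2 = np.zeros(n)
    T2[0] = T2_init
    for i in range(1, n):
        eps = noise_sd * rng.standard_normal()
        dT1 = (forcing[i-1] - lam*T1[i-1]
               - gamma*(T1[i-1] - T2[i-1]) + eps) / C1
        dT2 = gamma * (T1[i-1] - T2[i-1]) / C2
        T1[i] = T1[i-1] + dt*dT1
        T2[i] = T2[i-1] + dt*dT2
    return T1, T2

# A made-up linear forcing ramp: 150 years, reaching 2.5 W m^-2
F = np.linspace(0.0, 2.5, 150)
T1, T2 = two_box(F)
```

The upper box warms quickly toward quasi-equilibrium while the deep ocean lags, so at the end of the run the surface temperature still sits below the equilibrium value $F/\lambda$.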

In the above equations, $F$ is the radiative forcing. Unfortunately, I can’t seem to work out where I got this data from, but I will update this when I remember. Any forcing dataset would work, though. The $T_1$ term in the top equation is the global surface temperature anomaly. I used the Cowtan and Way data, which you can access here. To complete this, I also needed an ocean heat content dataset. Laure Zanna very kindly sent the data from her recent paper, which can also be downloaded from here.

A couple of other things. I couldn’t find a forcing dataset that included uncertainties, so I assumed a 1$\sigma$ uncertainty of 25%. I also initially had trouble getting a decent fit between the model and the temperature and ocean heat content data, so have increased these uncertainties a little.

To actually carry out the fit, I used a python package called emcee. It’s well tested, quite commonly used in astronomy, and is what I used for the paper I discussed in this post. The model has 5 parameters: $\lambda$, $\gamma$, $C_1$, $\theta_1$, and the amplitude, $A$, of the noise term. The priors for $\lambda$ and $\gamma$ were uniform in log space, while all the others were simply uniform.
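A rough sketch of how such a fit is typically structured with emcee is below. To be clear, this is not the actual code behind the post: the prior bounds are invented placeholders, and `model_fn` stands in for the two-box integration. The emcee call itself is shown only as a comment.

```python
import numpy as np

# Illustrative prior bounds -- invented placeholders, not the post's values:
BOUNDS = [(0.4, 5.0),   # lambda
          (0.25, 1.5),  # gamma
          (0.1, 10.0),  # C1
          (-1.0, 1.0),  # theta_1
          (0.01, 1.0)]  # A (noise amplitude)

def log_prior(params):
    """Uniform in log space for lambda and gamma, uniform for the rest."""
    for val, (lo, hi) in zip(params, BOUNDS):
        if not lo < val < hi:
            return -np.inf
    lam, gamma = params[0], params[1]
    return -np.log(lam) - np.log(gamma)

def log_likelihood(params, obs, err, model_fn):
    """Independent-Gaussian log-likelihood of the observations."""
    resid = (obs - model_fn(params)) / err
    return -0.5 * np.sum(resid**2 + np.log(2 * np.pi * err**2))

def log_posterior(params, obs, err, model_fn):
    lp = log_prior(params)
    if not np.isfinite(lp):
        return -np.inf
    return lp + log_likelihood(params, obs, err, model_fn)

# The sampler itself would then be something like:
# sampler = emcee.EnsembleSampler(nwalkers, 5, log_posterior,
#                                 args=(obs, err, model_fn))
# sampler.run_mcmc(initial_positions, nsteps)
```

In practice one would combine likelihoods for the surface temperature and ocean heat content series, but the structure is the same.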

The $\theta_1$ term is essentially an initial value for the deep ocean temperature, relative to the global surface temperature anomaly. I also adjust $C_2$ so that $C_1 + C_2$ is the total heat capacity of the ocean down to 2000m, and the fit is based on the 0-2000m ocean heat content matching the combined heat content of the upper and lower boxes.

The figure on the top right shows the resulting fit to the global surface temperature anomaly. The orange line is the median result, while the lighter gray lines sample the range. The figure on the bottom right is the resulting fit to the 0-2000m ocean heat content data. The orange and gray lines are also the median result and a sampling of the range.

The figure below shows the resulting distributions for the 5 parameters. As will be discussed below, these can then be used to determine the equilibrium climate sensitivity (ECS) and the transient climate response (TCR).

The equilibrium climate sensitivity is simply given by the change in forcing due to a doubling of atmospheric CO$_2$ divided by $\lambda$ (i.e., $\mathrm{ECS} = F_{2\times}/\lambda$), while the transient climate response (TCR) can be determined using that the TCR-to-ECS ratio is

$$\frac{\mathrm{TCR}}{\mathrm{ECS}} = \frac{\lambda}{\lambda + \gamma}.$$
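Given posterior samples of $\lambda$ and $\gamma$, both quantities follow directly from these two relations. A minimal sketch, using the commonly adopted $F_{2\times} \approx 3.7$ W m$^{-2}$; the $\lambda$ and $\gamma$ values below are illustrative, chosen to roughly reproduce the medians in the table further down, not actual posterior draws:

```python
F_2X = 3.7  # W m^-2, a commonly used value for the forcing from doubled CO2

def ecs_and_tcr(lam, gamma):
    """ECS = F_2x / lambda, and TCR = ECS * lambda / (lambda + gamma)."""
    ecs = F_2X / lam
    tcr = ecs * lam / (lam + gamma)
    return ecs, tcr

# Illustrative values, roughly the medians implied by the table below:
ecs, tcr = ecs_and_tcr(lam=1.4, gamma=0.52)
```

Applied element-wise to the whole posterior chain, this turns the $(\lambda, \gamma)$ samples into the ECS and TCR distributions shown below.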

The resulting ECS distribution is shown in the top figure on the right, while the resulting TCR distribution is shown in the lower figure.

The table below shows the 15th percentile, median, and 84th percentile for the ECS, TCR, TCR-to-ECS ratio, and $C_1$ distributions. The results for the ECS and TCR are reasonably similar to what’s presented by the IPCC (although the lower limit for the TCR is a bit higher: ~1.5K, rather than ~1K). The only term that may not be clear is $C_1$, the heat capacity of the upper box. A value of $\sim 2.9 \times 10^{8}$ J m$^{-2}$ K$^{-1}$ (averaged over the Earth’s surface) is equivalent to an ocean depth of 100m. The values I get seem a little high but may not be unreasonable (I was expecting this box to have a heat capacity equivalent to an ocean depth of about 75m).

Parameter | 15th percentile | median | 84th percentile |
---|---|---|---|
ECS (K) | 1.99 | 2.65 | 3.63 |
TCR (K) | 1.55 | 1.94 | 2.44 |
TCR-to-ECS ratio | 0.65 | 0.73 | 0.80 |
$C_1$ ($10^8$ J m$^{-2}$ K$^{-1}$) | 2.77 | 3.57 | 4.47 |

Anyway, that’s what I’ve been working on. There may be more that I could do, but I’ve probably spent enough time on this, so I’ll leave it at this. I did find it interesting that a relatively basic analysis using a very simple model produces results that seem entirely consistent with much more complex analyses, and that are also consistent with various other lines of evidence.

I did try various ways of carrying out this analysis, and the results were all consistent with what I’ve presented here. In some cases the median climate sensitivity estimates were actually higher, but in those cases the fits between the model and the data seemed poorer. In none of my analyses, however, did I recover climate sensitivities substantively lower than what I’ve presented here.

This would make a wonderful project for an introductory graduate class on climate. Would you be willing to share the ingredients?

Scott,

Yes, of course. I’ll put something together and send them to you in an email.

This appears to fit well with the work of Mauritsen & Pincus 2017 https://www.nature.com/articles/nclimate3357

However, they incorporate the uncertainty of total aerosol forcing to produce an effective range of TCR.

Question: How does the range of potential aerosol forcing values influence your derived ECS/TCR? Alternatively, does the use of ocean heat content trends as a proxy for top-of-atmosphere radiative forcing obviate the need for aerosol forcing in your methodology?

jai,

An increased forcing uncertainty would probably not change the median, but would increase the range (i.e., the 15th percentile would be smaller and the 84th would be bigger).

This is, of course, way over my head. But I’m interested to see who’s the first ‘skeptic’ to come along and point out where you’ve gone wrong, aTTP.

Nice work

“This would make a wonderful project for an introductory graduate class on climate. Would you be willing to share the ingredients?”

It is also a good exercise to illustrate the modelling approach.

Finally, it would be cool to show people what uncertainties are most important (if any), and why it’s so hard to come up with a tighter final answer.

Steven,

I hadn’t thought about seeing which uncertainties play the biggest role in setting the uncertainties in climate sensitivity. It’s an interesting idea. Something I should probably have made clearer in the post, is that the ECS is essentially the effective climate sensitivity since I’m assuming $\lambda$ is constant. That’s an additional uncertainty that this approach can’t address.

john,

“Skeptics” have seemed to be a bit quieter recently. Maybe it’s getting more difficult to promote that narrative?

Agreed. I guess the evidence is piling up.

“Steven,

I hadn’t thought about seeing which uncertainties play the biggest role in setting the uncertainties in climate sensitivity. It’s an interesting idea. ”

usually when I would do these kinds of models it would be with a view toward “which unknown do we have to work on” so if you had a limited budget and 1 or 2 things you could afford to investigate deeper, you’d work that way.

just a thot

Interesting that linear EBMs, like L&C 2018, tend to undershoot other methods of fitting observations and forcing. My swag: aerosols and the deep oceans both slow warming and distort feedbacks, so a linear fit that gives older observations more weight underestimates sensitivity.

On a different note, per the 2018 paper below, aerosol impacts depend strongly on where the emissions originate. So aerosols are going to confound any simple fitting method that uses global average forcing.

https://www.nature.com/articles/s41467-018-05838-6

One thing to think about: my favorite test for this kind of approach is to use the method on an example where we know the answer – e.g., a climate model. Given 20th century forcing & temperatures from a number of models, do the distributions of estimated sensitivities match the expected pdf from your models? E.g., as many above the median as below, 2/3rds of the models within the 1 std dev range, 1/3rd outside that, etc. But I don’t know how easy it is to get that data…
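The calibration check climatemusings describes could be bookkept along these lines. The data here is synthetic, purely to illustrate the mechanics: a posterior centred on a noisy observation of each "true" value should cover the truth about 68% of the time at the 68% level.

```python
import numpy as np

def interval_coverage(truths, posterior_samples, level=0.68):
    """Fraction of true values inside their central `level` credible interval."""
    tail = (1.0 - level) / 2.0
    hits = 0
    for truth, samples in zip(truths, posterior_samples):
        lo, hi = np.quantile(samples, [tail, 1.0 - tail])
        hits += lo <= truth <= hi
    return hits / len(truths)

# Synthetic, well-calibrated case: each posterior is centred on a
# noisy observation of the truth, with matching spread.
rng = np.random.default_rng(1)
truths = rng.normal(size=200)
posteriors = [rng.normal(loc=t + rng.normal(), scale=1.0, size=2000)
              for t in truths]
cov = interval_coverage(truths, posteriors)  # should land near 0.68
```

In the perfect-model test, `truths` would be each climate model's known sensitivity and `posteriors` the samples this fitting method produces from that model's simulated 20th-century output.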

climatemusings,

I suspect one issue there would be that the range from this analysis is probably due to different factors than the range across climate models. This is mostly uncertainty in the forcings and in the temperature and OHC datasets. Across climate models it’s probably more to do with how they parametrise things like clouds. It’s not clear that it would actually be an apples-to-apples type of comparison.

“My swag: aerosols and the deep oceans both slow warming and distort feedbacks, so a linear fit, that gives older observations more weight, underestimates sensitivity.”

Low climate sensitivity is politics. It is advanced to influence policy by lobbyists.

Reminds me of a discussion about Durack et al. 2014 that found severe underestimates of southern hemisphere forcing (ocean heat content increase). https://forum.arctic-sea-ice.net/index.php/topic,1011.msg38424.html#msg38424

The issue here is that if the aerosol forcing is (or was) underestimated, and this underestimate is gleaned from the observation of much greater southern hemisphere forcing, and we are using total ocean heat content as a proxy for top-of-atmosphere forcing, then any observational method to discern TCR will be underestimated: the increased forcing in the southern hemisphere from a lack of aerosol loading will provide a falsely high total global forcing value, while the temperature response of the land masses of the northern hemisphere will be suppressed due to the higher aerosol loading.

Note that in the post at that time it was not clear how much more effective aerosol loading is in the Arctic (now determined to be near the theoretical maximum; see: https://agupubs.onlinelibrary.wiley.com/doi/full/10.1002/2017GL075795)

At that time the implications for the Durack reanalysis showed a likely ECS between 3C and 5.5C with a median estimate of 4C for 2XCO2.

As an aside to your response, as you probably know the Mauritsen & Pincus estimates had quite a fat tail due to the aerosol uncertainty, with higher (stronger negative) values of aerosol forcing leading to even fatter tails in the distribution.

They also show that the median value absolutely drifts toward higher values with stronger aerosol forcing.

With recent (excellent) work done by Rosenfeld using satellite data to determine aerosol cloud properties, and the total cloud aerosol forcing returning values TWICE AS STRONG as those used in CMIP5, the TCR based on Mauritsen & Pincus moves upwards of 3C (median).

see: https://forum.arctic-sea-ice.net/index.php/topic,1384.msg187032/topicseen.html#msg187032

John,

Except a TCR of 3C (median) seems implausibly high. Some are criticising the CMIP6 models since they seem to have higher ECS estimates than the CMIP5 models, and the suggestion is that this is because the aerosol forcing is too high.

Allow me to display my florid ignorance of Bayesian stats.

Does the use of uniform priors not essentially turn this into a frequentist analysis?

Would the use of a prior based on some orthogonal evidence (paleo, for instance) not be a more meaningful methodology?

I look forward to my coming humbling.

vtg,

No, I don’t think that using uniform priors turns a Bayesian analysis into a frequentist analysis. Ideally someone like Dikran will chip in, but even using uniform priors, a Bayesian analysis can still give you the probability distribution for the model parameters. On the other hand, if you were doing some kind of frequentist analysis (say chi-squared) you could determine some kind of best-fit model, but I don’t think you could produce some probability distribution of the model parameters. At least, I think that’s right. Maybe someone who can explain, and who understands, statistics better than me can clarify.

If you use a uniform prior, then the maximum a posteriori (MAP) estimates of the model parameters are identical to the usual frequentist maximum likelihood (ML) estimates. However, to be properly Bayesian, one would use the whole posterior distribution over the model parameters, rather than just the maximum, and then when you want to predict something with the model, the parameters are integrated out (marginalised), or propagated to give a posterior distribution on the thing you want to predict. The main advantage of a Bayesian approach is that it provides an elegant method to incorporate these sources of uncertainty into the output.
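Dikran's first point (uniform prior implies MAP = ML) is easy to verify numerically, since a uniform prior only adds a constant to the log-posterior. A toy grid-search example with a Gaussian mean, nothing to do with the climate model itself:

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=2.0, scale=1.0, size=50)

grid = np.linspace(0.0, 4.0, 4001)  # candidate values for the mean
log_lik = np.array([-0.5 * np.sum((data - m) ** 2) for m in grid])

# Uniform prior on [0, 4]: the log-prior is a constant, so the
# log-posterior is just a vertical shift of the log-likelihood.
log_post = log_lik + np.log(1.0 / 4.0)

mle = grid[np.argmax(log_lik)]
map_est = grid[np.argmax(log_post)]
```

A constant shift cannot move the argmax, so the two estimates coincide exactly; the difference only appears once you use the full posterior rather than its peak.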

The other advantage being that it allows you to answer the sort of questions scientists want to ask directly, rather than a question about some fictitious population of experiments you didn’t actually perform… ;o)

The main difference between frequentist and Bayesian statistics lies in the definition of a probability. For a Bayesian it is a numerical indication of the relative plausibility of some proposition. For a frequentist, it is the long run frequency with which things happen. This means that even if the probabilities are numerically equal, it doesn’t mean they are equivalent. For example, the truth of some particular proposition has no long run frequency; it’s either true or it isn’t. Thus a frequentist can’t attach a probability to it, so if it looks like they have (e.g. the p-value from a null hypothesis statistical test) then it is important to remember that can’t be the case, so it probably means something more complicated.

Essentially, frequentist statistics are easy to perform, but conceptually rather subtle and widely misunderstood (e.g. there isn’t a 95% probability that the true value lies within a particular confidence interval). Bayesian statistics are conceptually straightforward, but hard to do (because it involves integrating things), but fortunately software is helping a lot with that these days – it is much easier to do now than it was 15 years ago.

…and Then There’s Physics says:

“Skeptics” have seemed to be a bit quieter recently

Chuckle.

The OHC and the surface temp anomaly medians match very well even if the latter seems a little smoothed, more variability in the air temps I guess.

A question is since the OHC is so much bigger in absolute energy amount and the amount of change is so small and slow in terms of energy input can this possibly link so easily to the recorded or modelled surface air temperatures.

The concordance between the two suggests it is models all the way down.

angech,

I have no idea what you mean. It’s almost as if you’re arguing from incredulity.

Hi ATTP, the normal thing for forcing is to use a scaling on the aerosol forcing alone, if you have components for that. However, since it increases more or less in proportion to the total, I suspect your approach isn’t going to be far wrong.

I’m not clear what you have done with the noise term. As written (and judging from your outputs) you may have just added a small trend which probably isn’t what you meant.

James,

When you say scaling on the aerosol forcing, do you mean for the forcing uncertainty?

For the noise term, I should probably have written $A \epsilon$, where $A$ was the parameter I was marginalising over and $\epsilon$ was a random number chosen – at every step – from a Gaussian with a standard deviation of 0.2.

I did some without this term (corner plot below), and the results are similar, but the median TCR value was higher (2.4K). So, I just wondered if the assumption that it was entirely forced might be an issue. If you can think of a better way of doing this, I’d be happy to try.

James,

I should probably have made clear that I didn’t include the noise term when I plotted the temperature and OHC figures.

Yes I did mean for the forcing uncertainty. I’m still not clear how you have done the noise term – this is a stochastic model, right?

James,

Thanks.

When I create the model time series, I simply do

$$T_1(t) \rightarrow T_1(t) + \epsilon,$$

where $\epsilon$ is simply a random number drawn from a Gaussian centered on 0 and with a standard deviation of 0.2. So, I’m essentially adding a small random number to each element of my timeseries. You’ve made me wonder if this is a sensible way to do this.

I believe it is incorrect, but I’m not sure how much it matters. MCMC does not work automatically with a stochastic model, it needs a likelihood P(obs|params). When you run a stochastic model and compare to obs, you get a different answer each time. I believe the “correct” way to do this is to run the model without the stochastic term (ie calculate the pure forced response) and account for the internal variability via the obs error covariance matrix. It happens to be exactly what I am doing right now with almost the same model 🙂
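For what it's worth, the likelihood James describes – a deterministic forced-only model, with internal variability absorbed into the error covariance – is a correlated-Gaussian likelihood. A minimal sketch, where the AR(1)-shaped covariance and its parameter values are illustrative assumptions rather than James's actual setup:

```python
import numpy as np

def log_likelihood_cov(obs, model_forced, cov):
    """Gaussian log-likelihood with a full error covariance matrix.

    model_forced: deterministic (noise-free) model output;
    cov: observational error plus internal-variability covariance.
    """
    resid = obs - model_forced
    _, logdet = np.linalg.slogdet(cov)
    n = len(obs)
    return -0.5 * (resid @ np.linalg.solve(cov, resid)
                   + logdet + n * np.log(2.0 * np.pi))

# Illustrative covariance: AR(1)-correlated internal variability
# plus independent observational error (parameter values made up).
n = 100
sigma_int, rho, sigma_obs = 0.15, 0.6, 0.05
idx = np.arange(n)
cov = (sigma_int ** 2) * rho ** np.abs(idx[:, None] - idx[None, :]) \
      + (sigma_obs ** 2) * np.eye(n)
```

The stochastic term then never enters the forward model; its statistics live entirely in `cov`, so the likelihood is deterministic given the parameters, which is what MCMC needs.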

James,

Okay, that makes sense. I look forward to seeing your results 🙂

Thanks Dikran and AT

FWIW, the average temperature of the top 100m of the ocean traces out a somewhat different pattern than surface temperature. More of a staircase look, with an unusually large riser this decade.

Chubb – during this decade the surface of the Eastern Pacific has been blistered with additional SW radiation.

Interesting ATTP. As an undergraduate I was taught to determine water depth from sedimentary facies based on the concept of storm wave-base vs. fairweather wave-base, which split the ocean into three depth ranges. For laminated shales, storm wave-base was the relevant one, because they’re deposited very slowly and a storm big enough to penetrate that deep will stir up decades or centuries worth of mud. We used the 100-year wave as a guide, just because it’s convenient: metocean providers report 100-yr storms. A rule of thumb is 100m, but 50m in small or enclosed seas and up to 200m in huge oceans with a large fetch, especially in tropical cyclone latitudes.

I see from a recent paper analysing buoys that in reality it’s more of a continuum. Storm and fair-weather wave base: A relevant distinction? For the western Atlantic the P50 is about 50m, 70m is about the P20 and 100m the P15 (in the Caribbean they’re much smaller, so that part still applies). It’s a geology paper so the authors don’t care about the deep ocean (everything is below storm wave base) and the locations are all nearshore, but in hurricane paths. These are time-averaged numbers over about 13 years of observations, so if it’s big hurricanes that are the key mixer of seabed mud, I as a geologist should perhaps multiply by seven to infill the 100-year storm (i.e. only 13% of stations have experienced their hundred-year storm during their recording time). Eyeballing the P7 gives me about 110-120m! However, 100 years is perhaps a bit on the long side for ECS and infilling to a 30 or 50 year storm would bracket 100m.

If you’d asked me 40 years ago what you should use as the depth of the well mixed layer I’d have said storm wave-base, about 100m. Today, with a bunch more science, I’d still say 100m. Funny, isn’t ECS exactly like that too?

Fig. 3 of this paper is also interesting. Seasonal Meandering of the Polar Front Upstream of the Kerguelen Plateau. It’s an early outcome of a project which fits miniature ARGOs to elephant seal heads, and collects the data when they fall off annually as the animals moult (I presume they have some sort of transponder, or perhaps they just search the beaches as they presumably moult on land). The centre panels show what I presume is summer warming to a depth of about 150m. That’s due to southward migration of the Polar Front, so rather than indicating vertical mixing to that depth in months, it’s an indicator of the depth of the long-term mixed layer in the southern Indian Ocean, which is where the warm water is intruding south from. That’s probably an over-estimate of the average global well-mixed layer depth for ECS purposes, (a) because that depth is effectively steady-state, longer than 100 years and (b) because the Southern Ocean has the longest fetch and some of the biggest storms in the world.

angech means that if there is consilience between two or more empirically determined lines of evidence (a.k.a. observed datasets), then something is fishy. The fix must be in.

He may not be the most obtuse, off-topic crank, but he’s our obtuse, off-topic crank.

Good work. Nice bayesian analysis.

Lets compare your results with some similar analysis:

Your work:

TCR median is 1.9, 70% CI is [1.6,2.4]

ECS median is 2.7, 70% CI is [2.0,3.6]

Skeie et al 2018

This paper has a nice solid bayesian approach

https://www.earth-syst-dynam.net/9/879/2018/

TCR median is 1.4, 90% CI is [0.9,2.0]

EfCS median is 2.0, 90% CI is [1.2,3.1]

The paper claims that an EfCS estimate is equivalent to an ECS of 2.9.

Bruns et al 2018

This paper does an interesting multicointegration analysis. It is similar to your 2 box model.

TCR median is 1.9

ECS median is 2.8, 66% CI is [2.3,3.7]

Hebert 2017

This is a cool paper that uses the GCM distribution as a prior and treats the ocean heat capacity issue using hurst noise.

http://digitool.library.mcgill.ca/webclient/StreamGate?folder_id=0&dvs=1553731991844~534

TCR median is 1.6, 90% CI is [1.4,1.9]

ECS median is 2.4, 90% CI is [1.8,3.7]

Your results seem reasonable.

I was under the impression that the Bayesian approach gives the same results as the Frequentist approach if you use uniform priors that span the entire domain. However, not if the uniform priors have a restricted domain. Am I incorrect?

“angech, I have no idea what you mean.”

I think you seriously need to consider the possibility that angech doesn’t, either.

-1,

Yes, the priors have a restricted domain. The ECS prior goes from 0.1 to 7.6, the $\gamma$ prior goes from 0.25 to 1.5, and the $C_1$ prior goes from 0.1 to 10.

One suggestion I would make is to use CMIP5 RCP forcing data back to 1765, or the AR5 forcing history back to 1750 if you can find that. For a couple of reasons:

1) Forcing changes prior to 1850 (both anthropogenic and natural) likely had a substantial influence on ocean heat and surface temperature warming up to the mid-20th Century.

2) At the beginning your model is effectively uninitialised for the existence of major volcanic eruptions so you get a sort of system shock effect whereby there is a permanent step-down in ocean heat content when eruption events are encountered. This doesn’t happen in reality because the system has long been equilibrated to a situation which involves occasional large blips in insolation due to volcanic eruptions. Going back to 1765 will introduce your model to some very large eruptions and provide a more realistic equilibration before you get to your test period from 1870.

Paul,

Thanks. I have actually just downloaded the CMIP RCP forcings. In a sense, the $\theta_1$ term is allowing the model to not start in equilibrium. I’ll have to think about how I could implement a version where the model starts in 1765, but the MCMC only starts in 1870. It’s possible, but will take a bit of tweaking.

ATTP,

“In a sense, the $\theta_1$ term is allowing the model to not start in equilibrium.”

Yeah, that’s interesting. I have a 2-box type setup with more explicit fast and slow responses to instantaneous forcing change and TOA imbalance. Your approach seems freer, which is more compelling, though I guess the algebra ultimately works out similarly in the end.

I wasn’t sure how to interpret $\theta_1$. Based on the plots above, is it literally saying the model deep ocean finds an initial (at 1870) best fit at 0.25K cooler than what would be in equilibrium with the 1870 surface temperature? That seems like quite a big difference for the deep ocean. Doesn’t that mean the model would produce a fairly substantial increase in ocean heat content for centuries even with zero forcing?

Paul,

You also have to factor in the $\gamma$ term (which I realise I initially called by a different symbol in the equations – fixed now). That comes out at around 0.5, so it implies that it’s out of balance by about 0.5 x 0.25 ~ 0.125 W/m^2.

ATTP,

a TCR of 3C is absolutely plausible if aerosol forcing (including total cloud effects) is higher. This is especially true if aerosol forcing in the Arctic is much stronger than at lower latitudes and total Summer sea ice loss would be incurred under zero anthropogenic aerosol emissions, leading to rapid albedo feedback increased warming locally.

I provided very recent studies above that both indicate that aerosol cloud forcing in the tropics is twice what was used in CMIP5 and that the effect of aerosols on Arctic clouds is close to the theoretical maximum of 2 to 8 times the effect at lower latitudes.

Simply stating that a high TCR/stronger aerosol forcing is implausible in the face of these supportive facts is not reasonable. This is especially true as this reality explains very well the observational shifts in global circulation patterns post-1978 (aerosol reductions), the negative PDO bias after rapid Chinese industrialization (mid-1990s to 2013), as well as the economic-cycle-driven warming pulses (early 1920s and early 1930s), all of which had been previously attributed to ‘climate variability.’

John,

Except a TCR of 3K would imply an ECS of about 5K. Although not impossible, paleo-climate estimates would seem to suggest that this is very unlikely. Of course, the ESS could be as high as this, but that would only manifest on long timescales.

ATTP,

To do a rigorous comparison with the temperature projections we are facing, you would have to go back in the paleoclimate to the Pliocene. And yes, you would only be looking at ESS on those timescales. However, we do not have a paleoclimate analog for the rapidity of the temperature changes we are facing, nor for the accumulated land carbon pools that the last 2.8 million years have built up (much less the short-term transitory effects of centennial-scale warming). Check the work on emergent constraints and the excellent model work being done with E3SM and CMIP6.

I have very much appreciated your thorough work over the years and I would especially like to see your perspective on https://iopscience.iop.org/article/10.1088/1748-9326/ab08a9

Especially in regard to

If ECS is no greater than 3C, if rapid fossil fuel reduction aerosols only produce 0.4 W/m^2 of forcing, if Arctic aerosol forcing doesn’t lead to Summer sea ice loss, if we do not consider thermokarst CH4 and frozen soil CO2 emissions, if tropical peat and temperate forest carbon losses are excluded, and if we reach negative global carbon emissions by 2035.

Pingback: An updated Bayesian climate sensitivity estimate | …and Then There's Physics

This is very nice, ATTP. It’s been almost exactly five years since your previous post about a two-box model:

https://andthentheresphysics.wordpress.com/2014/03/21/a-two-box-model/

which found TCR and ECS of 1.92 and 2.6 C respectively … almost identical to this one. And since James A. has posted here, didn’t Hargreaves et al. 2012 find an ECS of 2.5, based on the LGM-to-present temperature change?

I am pretty solidly convinced that ECS is somewhere in the range 2.4 to 3.0, and nothing much seems to be happening in the world of climate science that would radically shift this one way or the other. As for ESS, who knows.

Ned,

Yes, I think an ECS between 2.4 and 3 is quite likely. Of course, one should bear in mind that estimates like this one assume a constant climate sensitivity (i.e., constant $\lambda$), and there are indications that this may not be the case and that we might warm slightly more as we approach equilibrium than we did earlier on. There are also other Earth System responses (permafrost release, for example) that could further amplify the warming on longer timescales.