Gavin Schmidt has a Realclimate post about a recent talk given by Steve Koonin. Somewhat bizarrely, Koonin has a response to Gavin’s post that he has posted on WUWT. Given that Koonin has no climate expertise, presumably he thinks that his status as a physics Professor gives him the credibility to speak about the topic. Bit odd that he would then post his response on a site that has none.
Like Gavin, I also watched Koonin’s talk. Gavin’s post covers most of the issues, but there were a couple of things that bugged me that I thought I would highlight here. At about 11 minutes he highlights that the increase in atmospheric CO2 has produced a change in radiative forcing of about 2.5 Wm-2 (correct). He then goes on to say that it’s a relatively small perturbation, less than 1%.
He’s made this claim before and there is already a beautiful response from Andy Lacis. The only way the perturbation can be less than 1% is if you compare it to some of the large surface fluxes. This may seem reasonable, but it’s not really. What’s more relevant is how much we’ve perturbed the natural greenhouse effect. Relative to having no atmosphere and with the same albedo, the natural greenhouse effect enhances surface temperatures by about 33K. We’ve already warmed by about 1K, most of which is probably due to our emissions. Hence, we’ve already perturbed the natural greenhouse effect by 3% and are heading towards perturbing it by 10%, or more. As any physicist should realise, perturbing a complex, non-linear system by more than 10% is unlikely to be insignificant.
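To put rough numbers on the above, here is a minimal back-of-the-envelope sketch (the 390 W/m^2 figure is just an illustrative round number for one of the large surface fluxes, i.e. an assumption for the comparison):

```python
# A minimal sketch of the arithmetic above; the temperature values are the rough
# numbers quoted in the post, and the 390 W/m^2 surface flux is an illustrative
# round figure (an assumption), not a measurement.
ghe_warming = 33.0      # K, warming due to the natural greenhouse effect
warming_so_far = 1.0    # K, observed warming to date
possible_warming = 3.0  # K, roughly where continued emissions could take us

print(f"Perturbation so far:    {warming_so_far / ghe_warming:.0%}")    # ~3%
print(f"Potential perturbation: {possible_warming / ghe_warming:.0%}")  # ~9%

forcing = 2.5               # W/m^2, change in radiative forcing from CO2 so far
large_surface_flux = 390.0  # W/m^2, e.g. surface upwelling longwave (assumed)
print(f"Forcing vs a large surface flux: {forcing / large_surface_flux:.1%}")  # <1%
```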
At about 19 minutes he goes on about the absolute surface temperature from the models differing by as much as 3K, and suggests that – given that this is larger than the change in surface temperature – it should give us pause as to the model responses being right. This is a somewhat more complex/nuanced issue, but it’s not really a very good reason to question how the models respond to radiative perturbations. We don’t actually have a very accurate estimate for the absolute global surface temperature. As Koonin himself showed earlier, there are a number of quite large surface fluxes, the balance of which will set the equilibrium surface temperature. Since we can’t easily measure these surface fluxes, we can’t produce an accurate estimate for the absolute global surface temperature. A relatively small change in one of these surface fluxes can change the resulting surface temperature by a few K.
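To illustrate that last point, here is a rough sketch using the Stefan-Boltzmann law: if the surface radiates roughly like a blackbody, T = (F/σ)^(1/4), so a small change dF in the balancing flux shifts the temperature by roughly dT ≈ T dF/(4F). The flux value is again an illustrative assumption.

```python
# Rough sketch: sensitivity of an equilibrium blackbody temperature to a
# change in the balancing flux, dT ≈ T * dF / (4 * F). Numbers are illustrative.
sigma = 5.67e-8          # W m^-2 K^-4, Stefan-Boltzmann constant
F = 390.0                # W/m^2, a representative surface flux (assumption)
T = (F / sigma) ** 0.25  # ~288 K

for dF in (5.0, 10.0, 20.0):   # modest changes in one of the large surface fluxes
    dT = T * dF / (4.0 * F)
    print(f"dF = {dF:4.1f} W/m^2  ->  dT ≈ {dT:.1f} K")
```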
What’s more relevant is whether or not we expect the response to radiative perturbations to depend very strongly on the base state. The answer is not really, and this is also largely what the models indicate. What we’re interested in is how we expect the system to change in response to our emission of greenhouse gases into the atmosphere. Given that we don’t expect this to depend very strongly on the base state, that the models don’t produce exactly the same base state doesn’t necessarily mean we should question the resulting responses. This is not to say that we should ignore this issue; it may be that the reason for the difference in the base state is that some of the models are incorrectly representing some physical processes. However, simply highlighting this difference and suggesting that it brings into question how the models respond to perturbations is a pretty weak argument, and you might expect a physicist to get the nuance.
Anyway, that’s my quick addition to the comments about Steve Koonin’s recent talk. In case it’s not clear why this is potentially interesting, Steve Koonin is one of the originators of the idea of some kind of red team exercise, in which a supposedly independent group challenge the basics of climate science. Given this presentation, I’m not convinced it would be worthwhile, other than in demonstrating that most of those associated with this red team idea don’t really understand the topic very well.
Raymond Pierrehumbert has a textbook, “Principles of Planetary Climate”. Any physicist commenting on climate ought to have read it. So ask any such physicist about some exercise from Chapter 6. That should serve as fly ointment.
My opinion is that no credible climate scientist should have their name associated with this political theater as it’s just a side show to distract attention from the massive roll back of environmental policies. Perhaps we should change tactics and adopt behavioral economics strategies like they do with inflation expectations. All we need to do is believe that temperatures and emissions are falling and we will bend reality to match our expectations, just like long term interest rates and inflation. If you need proof just look at the negative interest rates on long term debt, QED.
I was just browsing WUWT, the world’s #1 climate science website, to see what they had to say about this Red/Blue team idea but the most interesting post that day was a story about global population projections. The Id of climate change deniers on full display.
Red-team reviews are essentially the same thing as peer-review, just applied in a different setting. In the corporate setting, red-team members are your peers.
“I’m not convinced it would be worthwhile, other than in demonstrating that most of those associated with this red team idea don’t really understand the topic very well.”
Well wouldn’t that be a pretty respectable start?
Berkeley Earth was, in effect, a red-team exercise on one fairly important aspect of climate science. I suggest that further red-team efforts should be supported in proportion to the climate deniers’ acceptance of the BEST results.
I believe jacksmith4tx is correct – real scientists shouldn’t participate – they don’t have real scientists!
But that doesn’t mean the debate can’t take place with well informed non-scientists.
People who are better skilled than scientists at public communication, and at dealing with the disingenuous debate strategies that contrarians are required to resort to.
“presumably he thinks that his status as a physics Professor gives him the credibility to speak about the topic.”
No doubt. Non-climatologists speaking on the topic seems fairly common, maybe even normal.
Bookmark this URL:
https://www.climatechange.ai/
They just finished a major presentation on June 14 and you can watch the presentations here:
https://slideslive.com/38917142/climate-change-how-can-ai-help
Guys, why aren’t we using A.I. to figure this out?
https://www.expresscomputer.in/artificial-intelligence-ai/ibms-ai-debating-system-argues-for-and-against-cannabis-legalisation/36905/
Jun 18, 2019,
Quote:
“Today we demonstrated our vision to develop technologies that will enhance decision making,” project manager Ranit Aharonov said. “Not machines that will make decisions, but machines that will help in the way that artificial intelligence can help.”
Would it be fair to have an A.I. program on the Blue team?
E.D. :
“Berkeley Earth was, in effect, a red-team exercise on one fairly important aspect of climate science. I suggest that further red-team efforts should be supported in proportion to the climate deniers’ acceptance of the BEST results.”
One of Gavin’s major complaints is that:
“The red team issue came up a few times. Notably Koonin says at one point in the Q and A:
The reports are right. But obviously I would not be pushing a red team exercise unless I thought there were misleading crucial aspects of the reports.
55:55
But in over an hour of talking, he doesn’t ever really say what they are. Instead, there are more than a few fallacious arguments, some outright errors, some secondhand misdirection, a scattering of dubious assumptions and a couple of very odd contradictions. I cannot find a single instance of him disagreeing with an actual statement in the reports.”
Incidentally, Gavin says of Steve’s response to RC:
“Koonin emailed an early version to us yesterday, but since I don’t check the RC email all that often, I didn’t notice it until this evening, after he’d already posted on WUWT. – gavin]”
GWPF also did a red team analysis and published a devastating report on the rampant fraud in temp record analyses.
Oh. Wait.
Well, there was Anthony’s devastating red team analysis on weather station fraud and the criminal lack of attention to the UHI effect.
Oh. Wait.
“Red-team reviews are essentially the same thing as peer-review, just applied in a different setting. In the corporate setting, red-team members are your peers.”
Usually. There were times I was called into Red team things done by other groups. Or times we would call in outside experts to review.
My attitude toward the red team is: it is always instructive to allow your opponents to give it their best shot. When GWPF wanted to review temperatures (especially with Roman M on the team), I thought it would be a great idea. Bring your best, come on.
Audits never end, meh.
Steven,
Indeed. I think people should be free to go ahead and challenge our current scientific understanding; it’s a cornerstone of the scientific method. Be nice if they actually went through with it, unlike the GWPF who promoted their temperature review, and then did nothing.
Koonin appears to basically be Curry. Is there anything material there that is not coming directly from her?
JCH: he uses Bob Tisdale graphs, and I don’t think Curry ever used those. Or did she? The fact that he got Watts to publish his ‘rebuttal’ (not quite the right description in this case…) suggests he gets plenty from WUWT rather than Curry.
ugh … “Blob Tisdale” has polluted Google with his meaningless graphs cluttering up search results for years now. BTW, they’re also ugly looking.
Hmm, I started to watch the video.
Gavin didn’t do a good job.
In fact, after watching a little it was pretty clear to me that Gavin was pretty lazy.
Better debunking please.
Well, on the rare occasions when he comments at Climate Etc., she often calls him Bob. Some Tisdale graphs are exactly the same ones produced by NOAA, NASA, etc. It’s data and a graphing program.
Koonin does have a point about the “interpretive quotation.” The rhetoric of “Basically what he said is…” is poor form, IMO.
Joshua,
I do think that the effort one puts into rebutting another’s argument, to a certain extent, reflects the rebutter’s respect for the person making the argument. I can see why Gavin might have put little effort into this. A response on WUWT would – in my view, at least – justify that lack of respect.
To follow up a bit, to me the 1% issue that I highlight in this post is a bit of a red flag. Koonin has made this argument before. He’s had very well-informed people point out that it’s a silly argument. Yet, he repeats it. Either he can’t get the basic point, or he’s being disingenuous. Yes, it would be good if we always put effort into carefully rebutting what people say. In some cases, it doesn’t feel like it’s worth the effort. Also, complaining about the quality of the rebuttal is itself a bit of a tactic – the quality of Koonin’s arguments doesn’t depend on the quality of the rebuttal.
“Better debunking please”
Doesn’t make a difference. There is yet another “the rise in atmospheric co2 is natural” thread at WUWT, to which the audience (with a few exceptions) is pretty welcoming. That tells you everything you need to know about putting energy into debunking WUWT threads.
There is a new Hermann Harde paper as well 😦
Dikran,
Do you have a link for the new Hermann Harde paper?
I think there may be a comment in moderation on the Harde thread that has a link (perhaps I got the tags wrong), but I’ll see if I can find it again.
Found it again here. I think the publisher was on Beall’s list. He still doesn’t understand the distinction between residence time and adjustment time.
I only see one Tisdale graph, and it has to do with 30-year trends, which Gavin discusses. Gavin says it displays HadCrappy 3 data, but on the video it now uses GISS. So they changed the video?
> Just quote the person and refute the implications of what they actually said.
Quoting Gavin might have shown him [how to] do exactly that:
http://www.realclimate.org/index.php/archives/2019/06/koonins-case-for-yet-another-review-of-climate-science/
Gavin’s “basically” basically works as an implication.
Cue to poor faith.
Anders –
I can’t agree. If he’s going to take the time to put up a post, he shouldn’t use poorly formed arguments.
It’s not a huge deal but the “he basically said…” form of “interpretive quotation” is a pet peeve of mine. It allows someone to distort what someone else said to reframe it as a rhetorical device. I consider it poor faith argumentation. Just quote the person and refute the implications of what they actually said.
The posting at WUWT is an interesting but not all that surprising development. Koonin being formerly from the Obama administration might suggest he falls outside the typical ideological taxonomy of climate change polarization. By posting at WUWT he undermines that possibility.
Joshua,
I mostly thought Gavin’s arguments were fine, but perhaps that’s because I understood what he was getting at.
This is my interpretation of “basically”. As soon as you add some implication, then the person can always say something like “that isn’t what I was implying”. What’s sometimes interesting is that they challenge the rebuttal by claiming that the interpretation is wrong, but rarely then say “I agree with what the rebutter was saying and simply expressed myself poorly”.
“My opinion is that no credible climate scientist should have their name associated with this political theater as it’s just a side show to distract attention from the massive roll back of environmental policies.”
I would prefer lay-people to write such responses. A scientist responding unavoidably gives the wrong impression that this is a scientific debate. ATTP, HotWhopper, the Science of Doom and Moyhu show that you do not have to be a climate scientist to write a clear response. I wish we had more such blogs.
FWIW, here’s an automated transcript:
https://www.diycaptions.com/php/get-automatic-captions-as-txt.php?id=FY5gEwZHKI8&language=asr
Even if accurate, not having an official text is a pain.
I hate appeals to YT vids.
> I wish we had more such blogs.
There’s a simple way to make that happen:
Academics should know better than to replicate the inequalities they’re supposed to fight.
As soon as you add some implication, then the person can always say something like “that isn’t what I was implying”.
You’re opening the door and inviting someone into poor faith engagement. I.e., I don’t see how that’s basically “claiming” that until you know everything you don’t know anything. It may not be an interpretation that Koonin owns, and with good reason.
Koonin’s argument was caveated, as he pointed out.
The contention (IMO) should be whether that “nails it,” not what he did or didn’t “basically claim.”
Joshua,
I’m not entirely following your argument. However, it is difficult to interpret what Koonin was implying as wildly different to what Gavin suggested. For example, I think the scientific community is now pretty much agreed that although there will be variability, it will clearly not dominate on multi-decade timescales over the forced response. If you’re giving a general seminar (which is essentially what Koonin was doing) why highlight the importance of variability if there is general agreement that (with regards to long-term climate change, at least) it’s not that important? If Koonin had some argument as to why this scientific position is wrong, that would be different, but he doesn’t really.
Just to follow up, Gavin did provide the timestamps in the video, so people could go and check what was said for themselves.
While I believe that CO2, rising temperature, and global warming are serious issues, sometimes it seems that some of the “denier” sites such as WUWT and Euan Mearns’ Energy Matters over-focus on the CO2 relationship to global warming to the exclusion of other “environmental” issues.
The investor and asset manager Jeremy Grantham has dedicated many of his firm GMO’s quarterly reports to the climate change issue. He has commented that along with CO2/global warming the other big issues are food, overpopulation, water scarcity and environmental toxins.
Click to access jgletter_resourcelimitations2_2q11.pdf
Click to access 201808-jeremy-grantham—the-race-of-our-lives-revisited.pdf
The climate deniers’ over-focus on CO2 seems like the old Pink Panther movie where Peter Sellers as Inspector Clouseau over-focuses on the organ grinder’s monkey while the bank behind the organ grinder is being robbed.
Joshua – the problem is that many orators are skilled at implying a lot without committing themselves to anything. Can you provide a better strategy for dealing with that? Mine is to ask direct questions that would make someone’s position clear, but that just results in evasion, which doesn’t address the implication. Personally, in a rhetorical debate that is not in good faith on both sides, there is no optimal strategy (other than not to be the one arguing in bad faith) and pointing out what you think is implied seems not unreasonable.
Dikran –
I think asking direct questions so as to get implications explained explicitly is the best approach. I note that in comments here you have said that it reflects poor faith to not answer direct questions, and I have thought about you saying that when I’ve found myself seeking ways to not answer direct questions.
As I said, that’s what I do, it doesn’t work, because bad faith orators don’t want to commit themselves, so they just evade the question, and the implication has still been made. Good orators tend to be quite good at evasion and can quite easily turn questions into “demands” or “sealioning” if you have to ask more than once etc.
For me the key is not to mind being shown to be wrong if you actually are wrong. It’s at least slightly better than continuing to be wrong, even if not actually enjoyable. ;o)
Dikran –
As I said, that’s what I do, it doesn’t work…
It works for me, as much as I expect for anything to work… so I guess I see what works as a judgement call.
Do you know from the outset that you have a bad faith partner? If so, then nothing will work to reach quality dialog as that is not a shared goal.
If you don’t know, then it seems to me asking questions so as to avoid burdening someone with implications they don’t own and to give them room to clarify and to discuss logical implications would be optimal. It gives someone an opportunity to engage in good faith.
If you do know they’re engaging in bad faith, then you might have an eye toward the observer, especially in the online dialog context.
For this observer, when Gavin describes implications that seem to me unintended (was he really implying that until you know everything you know nothing? I would think not. I think he was implying that we don’t know enough to reach meaningful conclusions), I see a missed opportunity. I’d rather see an explication of why what we know supports reaching certain meaningful conclusions, or why what we know can’t be dismissed as meaningless for reaching certain conclusions. I learn nothing from Gavin characterizing Koonin’s arguments in such a fashion. If Koonin wishes, he can dispute those points. If he fails to do so, he becomes useless for my opinion formation process.
I don’t know that I’m a representative sample, however. Of course, others have different definitions of what works. I struggle, however, to see how the “basically, what he said…” form of argument works. IMO, at its least suboptimal, it just serves to reinforce existing lines of polarization – which doesn’t fit my definition of works.
Joshua – generally I tend to speak to people I have spoken to before, but I have noticed that most people will refuse to answer direct questions that may weaken their argument – to some extent it is human nature. It is disappointing to see this sort of thing from academics though, as their training ought to include suppressing that sort of unhelpful bias.
Asking questions (and answering them) ought to be effective in good-faith truth-seeking discussion, for both parties, and I’ll continue to do so, but it is not effective in online discussions, as people generally seem more interested in “winning” than in getting to the truth.
I try to stick to the content (or lack of) in the discussion and avoid motivations – I’m not a mind reader, and I don’t think it actually helps much.
As to “until you know everything you know nothing”, that is an exaggeration, but it is a very common argument in on-line discussions of climate. As for “how do we know what we know”, as ATTP points out, this has been (repeatedly) presented already, and takes some effort, while just repeating phrases about uncertainty is cheap. This can be used as a rhetorical device, and I don’t think it is reasonable to expect inexhaustible patience from people like Gavin when detailed explanations are repeatedly ignored. There is also some onus on the listener to question what is said and to learn for themselves, especially if they have decided to take a position that goes against mainstream expert opinion.
I presume you’re referring to this slide at around 19m: https://youtu.be/FY5gEwZHKI8?t=1141
I think you missed the point. The models are calibrated to match the observations from 1950-1980. You can see from the slide that those models that deviate most from the historical record pre-1950 also appear to have the largest deviation post-1980. So in fact it does seem to be the case that inability to match the historical record is relevant when considering whether we should trust that model’s response to perturbations.
drjonbee,
No, most models are not calibrated to fit all, or some of, the historical temperature record. Most are tuned to either a pre-industrial radiative imbalance, or a present day radiative imbalance. See here, for example.
drjonbee, ATTP,
The Kooninism employed in the discussion of Figure 9.8 (IPCC AR5 WG1; figure not reproduced here) is in reference to absolute temperatures of the individual CMIP5 models (shown on the RHS of each sub-figure). I’d call that one a red herring fallacy.
We are really interested in the rates of change, much more so than in actual absolute temperatures (within a reasonable tolerance of a few degrees C) at the global scale (which is what that figure represents). Also the baseline is 1961-1990 (yellow shading).
Click to access WG1AR5_Chapter09_FINAL.pdf
Gavin discusses this aspect of the models here …
Absolute temperatures and relative anomalies
http://www.realclimate.org/index.php/archives/2014/12/absolute-temperatures-and-relative-anomalies/
ATTP: Regardless, the correlation between the fit pre-reference period and post-reference appears significant. IOW, veracity does matter (also, if it didn’t, why even bother plotting?).
Everett: If you’re only interested in rates of change then you should be comparing the derivative of each series with the derivative of the historical record, and in that case you’ll find the fit is far worse.
Drjonbee it is ironic that you say veracity does matter after failing to acknowledge that your initial claim was factually incorrect.
drjonbee –
Were you wrong when you said this?:
The models are calibrated to match the observations from 1950-1980.
If so, could you explain why you got that wrong?
Dikran –
Sorry, we crossed.
Anyway, that’s another aspect of good faith exchange, innit?
drjonbee.
They (the gross rate of change) already are as plain as day to see on the figure shown. Derivatives (e.g. finite differences) always show lower S/N ratios. So to overcome low S/N ratios some form of smoothing is required (e.g. LOESS), see, for example:
Numerical differentiation by the regularization method (1985)
https://onlinelibrary.wiley.com/doi/abs/10.1002/cnm.1630020611
(paywalled but it does contain the source code)
or …
Data smoothing and numerical differentiation by a regularization method (2010)
https://www.sciencedirect.com/science/article/pii/S0098135409002567
(paywalled)
Abstract
While data smoothing by regularization is not new, the method has been little used by scientists and engineers to analyze noisy data. In this tutorial survey, the general concepts of the method and mathematical development necessary for implementation for a variety of data types are presented. The method can easily accommodate unequally spaced and even non-monotonic scattered data. Methods for scaling the regularization parameter and determining its optimal value are also presented. The method is shown to be especially useful for determining numerical derivatives of the data trend, where the usual finite-difference approach amplifies the noise. Additionally, the method is shown to be helpful for interpolation and extrapolation. Two examples data sets were used to demonstrate the use of smoothing by regularization: a model data set constructed by adding random errors to a sine curve and global mean temperature data from the NASA Goddard Institute for Space Studies.
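A quick illustration of the general point about noisy derivatives; note this sketch uses SciPy’s Savitzky-Golay filter rather than the regularization method of the papers cited above, so it only demonstrates the idea that smoothing tames the noise amplification of finite differences:

```python
# Sketch: a naive finite-difference derivative amplifies noise, while a
# smoothed derivative (here Savitzky-Golay, not the regularization method
# cited above) recovers the underlying trend much better.
import numpy as np
from scipy.signal import savgol_filter

t = np.linspace(0, 4 * np.pi, 400)
dt = t[1] - t[0]
rng = np.random.default_rng(0)
y = np.sin(t) + rng.normal(scale=0.1, size=t.size)   # sine curve plus noise

d_naive = np.gradient(y, dt)                          # noisy derivative estimate
d_smooth = savgol_filter(y, window_length=51, polyorder=3, deriv=1, delta=dt)
d_true = np.cos(t)

print("RMS error, finite differences: ", np.sqrt(np.mean((d_naive - d_true) ** 2)))
print("RMS error, smoothed derivative:", np.sqrt(np.mean((d_smooth - d_true) ** 2)))
```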
Joshua: “The models are calibrated to match the observations from 1950-1980.” is what I thought I heard Koonin say in passing. By “calibrated to match” I did not mean “tuned to fit” as claimed by ATTP. I meant the *level* of the series was adjusted to match the mean in that period, which is actually not a big deal. It turns out the reference period is 1960-1990 (not 1950-1980) , and if the level is not at least implicitly matched to that period I’d be surprised since the models all line up there even though they deviate significantly either side. But it doesn’t matter for the point: the deviation pre and post reference appears to be correlated.
Everett: yes, I understand how to smooth a series and that derivatives are very difficult to estimate from a timeseries of values. But LOESS is not going to save you here. Finite differences or smoothed, the fits will be awful. There’s too much internal climate variability that’s not modeled. You could try estimating the derivative directly from the models, which would give you a better estimate of the derivative as predicted by the models, and hence reduce that aspect of the noise, but it would most likely just show how little of the internal variability of the real climate is captured.
Drjonbee that sounds like baselining, which has nothing whatsoever to do with model tuning.
See this post at realclimate for an explanation of why the variation in absolute temperature between model runs isn’t of great importance for predicting the response to a change in the forcing.
Dikran: correct – that’s what I meant by “calibrated” and why I did not use the word “tuned”. Calibration implies level matching only (that’s also the common scientific usage) so that you can make an apples-apples comparison.
By “calibrated to match” I did not mean “tuned to fit” as claimed by ATTP.
There seems to be some inconsistency in terminology.
Anders said most aren’t “calibrated to fit”, not “tuned to fit” as you claimed he “claimed.”
Dikran I understand why predicting the derivative is not relevant. Everett made that claim: “We are really only interested in the rates of change much more so than we are in actual absolute temperatures”.
My point remains: models that deviate most pre-reference also deviate the most post-reference. More specifically: pre-reference deviation is almost exclusively to the downside (the models run too cold or in fact are initialized too cold), whereas post-reference the deviation seems to be to the upside, if you factor out overreaction to Pinatubo which unfortunately happens right after the reference period ends, so is a shock right after calibration (baselining if that’s the climate science vernacular) and hence undoes the calibration to some extent.
A model that runs cold pre-calibration, and hot post-calibration, almost certainly does so because it is too sensitive to some monotonically changing input that drives temperature. Most likely the culprit is CO2 sensitivity, since I don’t think there are any other monotonic inputs that have that much effect on model temperature?
Joshua: ATTP: “Most are tuned”. He was assuming that my “calibration” meant “tuned” (as were you). Drop it. It’s resolved.
BTW – the use of “claim” often implies a somewhat negative connotation. Was that intended?
No, it has nothing to do with calibration either, it is just the calculation of climate anomalies relative to a common baseline period. It isn’t calibration as the models are not being calibrated to anything (there is no “matching”).
As a reminder of your original claim “The models are calibrated to match the observations from 1950-1980.”. This is not correct, the models are not calibrated to match the observations. The individual model runs and the observations are independently baselined to produce anomalies.
Have you read the realclimate article that explains why this is essentially irrelevant to the model’s skill in predicting the response to a change in the forcing?
“Most are tuned”. He was assuming that my “calibration” meant “tuned” (as were you). Drop it. It’s resolved.
Except he didn’t say “tuned to fit” as you quoted him as saying. There are multiple terms in play. It doesn’t help to clarify when you misquote.
Drjonbee “Dikran I understand why predicting the derivative is not relevant.”
It would really help if you were to avoid misusing terminology. A difference is not a derivative.
Joshua “claim” = “stated”. Not intended to be pejorative. “tuned to fit” was not intended as a quote of ATTP, but to convey the meaning. This has been clarified. Move on.
I’ve stated the substance of my point.
Also –
as were you)
Actually, I wasn’t. You’re definitely wrong about that. I was noting the difference in terminology so as to clarify. That is also why I’m noting that Anders didn’t use “fit” as in “tuned to fit,” but you quoted him as saying “tuned to fit.”
Dikran, a difference is an estimate of a derivative. I am not misusing terminology.
I am not arguing semantics. You know what they say about wrestling with pigs. If either of you have something specific to address in the main point about correlation between model deviation pre-and-post calibration, I’ll answer that.
“tuned to fit” was not intended as a quote of ATTP,
Ok. That seems to run counter to a common sense interpretation – as you said he made a claim about “tuned to fit” using quotation marks. But if that wasn’t what you intended, fair enough.
You know what they say about wrestling with pigs.
Was that also not meant with a pejorative connotation?
I was trying to clarify terminology so as to understand the differences in opinion. I’m not sure that merits pig analogies.
I am not arguing about arguing about semantics either.
What is really weird here is that Koonin never mentions in any way shape or form the ‘so called’ issue that drjonbee raised here (absolute temperature has a direct 1st order effect on GMST trends).
At best, that is a conjecture, unless drjonbee can point to some peer reviewed paper(s). In other words, I’m not seeing whatever it is that drjonbee is seeing (specifically with regards to Figure 9.8).
Note to Self: EMIC = Earth (system) Model of Intermediate Complexity. As such, we should limit this discussion to Figure 9.8a …
Earth systems model of intermediate complexity
https://en.wikipedia.org/wiki/Earth_systems_model_of_intermediate_complexity
I am not arguing semantics.
I’m assuming that you meant that idiomatically. But the line between the idiomatic use of semantics and the literal use can be hard to draw.
Actually, you were very much arguing semantics in a literal sense. And I was trying to clarify semantics – as in meaning of words.
Anyway, I’m off. I’ll let the people discuss the science.
Everett: that’s not the issue I am raising. Koonin pointed out that the models miss by more than the effect they are trying to measure. The counter (from ATTP and others) seems to be that that doesn’t matter because it is response to perturbation for any individual model that matters. My point is that the data suggest otherwise (Fig 9.8(b)): those models with the largest deviation pre-calibration period also have the largest deviation (of the opposite sign) post-calibration period. So being “off” in absolute terms is predictive of a model’s response to perturbation.
Drjonbee “Dikram, a difference is an estimate of a derivative. I am not misusing terminology”
O.K. Now you have made it clear that you can’t simply admit your error. A difference between samples of a signal is an estimate of a derivative. A constant offset clearly is not.
I find it hard to understand why people dig the hole deeper and deeper like this, rather than simply concede their error.
Btw it is a bit of a statement of the bleedin’ obvious that models with a higher ECS will tend to be above the mean after the baseline period and lower before. If a model has higher ECS, it would be a bit of a surprise if it predicted less warming on a centennial scale, where the change is dominated by the forced response!
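A toy illustration of that point, with purely synthetic numbers assumed for the purpose: give several “models” the same monotonic forcing but different sensitivities, baseline them all to a common reference period, and the higher-sensitivity runs automatically sit lower before that period and higher after it, which is essentially the pattern under discussion.

```python
# Toy illustration: runs with different sensitivities, baselined to a common
# reference period, are automatically "cold" before and "warm" after it.
import numpy as np

years = np.arange(1900, 2011)
forcing = np.linspace(0.0, 2.5, years.size)    # crude monotonic forcing (assumed)
sensitivities = [0.5, 0.75, 1.0, 1.25]         # K per W/m^2, illustrative values only
ref = (years >= 1961) & (years <= 1990)        # common baseline period

for s in sensitivities:
    run = s * forcing                           # toy "model" temperature response
    anomaly = run - run[ref].mean()             # baseline to 1961-1990
    pre = anomaly[years < 1961].mean()
    post = anomaly[years > 1990].mean()
    print(f"sensitivity {s:4.2f}: pre-1961 mean {pre:+5.2f} K, post-1990 mean {post:+5.2f} K")
```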
drjonbee,
“Koonin pointed out that the models miss by more than the effect they are trying to measure.”
Which means it has to be absolute temperature. D’oh! The RHS of both sub figures. Plain as day.
Without knowing each model’s ECS (an emergent property, at least for AOGCM’s) we can’t say how/why the trend lines differ. We don’t know enough of the details to determine how the EMIC’s go about establishing their absolute temperatures. In other words, that figure alone does not tell us anything as it is clearly a function of more than one variable.
Also, as it has been pointed out that there is a specific term for this operation (baselining) and calibration is not appropriate as the signals are not being calibrated to anything, it is pretty unhelpful for you to continue to use that term.
drjonbee,
Okay, I’ve been out all morning, and haven’t had a chance to read all the comments. Okay, what you’re highlighting is that the models and observations are all baselined to the 1950-1980 period. This is pretty standard, given that what we’re interested in is how it responds to perturbations (i.e., how it changes). To compare changes, you do typically need to use a common baseline.
Drjonbee wrote “So being “off” in absolute terms is predictive of a model’s response to perturbation.”
You didn’t read the realclimate article, did you? “The next figure shows that the long term trends in temperature under the same scenario (in this case RCP45) are not simply correlated to the mean global temperature, …”
Drjonbee wrote “My point is that the data suggest otherwise (Fig 9.8(b)): those models with the largest deviation pre-calibration period also have the largest deviation (of the opposite sign) post-calibration period. So being “off” in absolute terms is predictive of a model’s response to perturbation.”
You can’t tell how far a model run is “off” in absolute terms from anomalies as anomalies have already had the absolute offsets subtracted off.
What you can tell is the slope, from which you might estimate climate sensitivity. Yes, the climate sensitivity of a model is predictive of its response to perturbation (again an inappropriate term for which a better one already exists – “forcing”). We ALL know that already – that is precisely why ECS is a reasonably useful metric.
Just to follow up on my previous comment. The observations themselves are anomalies, which are all relative to a baseline. It doesn’t make sense to compare models and observations unless you use a common baseline. This has nothing to do with calibration or tuning/fitting.
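For anyone following along, here is a minimal sketch of what baselining means in this context (made-up numbers; the only point is that each series is shifted by its own reference-period mean, so nothing is fitted or matched to the observations):

```python
# Minimal sketch of baselining: each series is shifted by its *own* mean over
# the reference period; nothing is fitted or matched to the observations.
import numpy as np

def baseline(series, years, start=1961, end=1990):
    """Return anomalies relative to the series' own mean over [start, end]."""
    mask = (years >= start) & (years <= end)
    return series - series[mask].mean()

years = np.arange(1900, 2011)
obs = 14.0 + 0.008 * (years - 1900)     # fake "observations", absolute deg C
model = 16.5 + 0.010 * (years - 1900)   # fake model run, ~2.5 C warmer in absolute terms

obs_anom = baseline(obs, years)
model_anom = baseline(model, years)
# The ~2.5 C absolute offset is gone; only the *changes* are being compared.
print(round(model_anom[-1] - obs_anom[-1], 2))
```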
Here is what Koonin actually said (see, hear, CC’s) …
“Finally, the climate system has long term coherences in it that the models do not reproduce. Let me show you just how badly the models reproduce some aspects of the climate. This is the global temperature (IPCC AR5 WG1 Figure 9.8) modeled over the last hundred and something years.
That black line is the observations. The spaghetti lines of all of these different models that are used by the IPCC and this is really misleading because, in fact, the average temperature of each model differs as shown over here (presumably the RHS) by as much as THREE degrees centigrade (definitely the RHS) and they’ve all been adjusted to line up to zero for the period 1950 to 1980. So, in fact, when the models differ by THREE degrees (centigrade), which is more than what you’re trying to explain, it gives you some pause that you’ve got it right.” (Koonin then circles both RHS absolute temperature scales just before moving on to his next slide @19:10.)
RHS! THREE degrees (centigrade) difference in absolute temperature!
Someone here needs to concede the word-for-word nature of Koonin’s discussion of that particular slide.
EFS that sounds like exactly the issue addressed in the realclimate article I posted earlier.
The funny thing is that anyone who had ever downloaded the model output would immediately find that out for themselves. The fact this appears to be news to a lot of skeptics (ISTR a recent WUWT article on this) tells you their skepticism of the models doesn’t extend as far as actually finding out for themselves what the models actually predict.
I thought that was also pretty much one of the two things that I discussed in this post. We know that the models don’t match when it comes to absolute surface temperature. However, if the response is likely to not depend strongly on the base state, then this is unlikely to be a major issue.
ATTP – yes I am in total agreement. That’s what I mean by “calibration” here. I am not familiar with the climate science terms of art. I shall henceforth call it “baselining” so as to avoid confusion.
However, now I rewatch the relevant segment of the video, I realize Koonin was talking about something different. I had assumed he was referring to the anomaly on the left-hand axis (it’s hard to see the laser pointer in the video). I actually agree with his statement “gives you some pause that you’ve got it right” when referring to the difference in mean between the models, since it is pretty substantial. He addresses this in his response to Gavin on WUWT which I also just read:
“indeed, on average the 2011-2070 trend of the CMIP5 models under RCP4.5 decreases about 20% for every degree C increase in 1951-1990 absolute GMST”
I presume he means 1961-1990 if that’s the baseline period. If Koonin’s statement is correct, it does suggest that the model predictions are dependent on their baseline temperature.
But anyway, I was talking about the anomalies, so I apologize for the confusion. And I think it’s obvious from the anomalies that the majority of models are too sensitive to CO2. The models to the left of the reference period are too cold – the deviation is highly skewed to the negative. Which means that the models must be *initialized* cold. Factoring out the overreaction to Pinatubo by some models, and accounting for the shorter period, the deviation after the reference period is skewed to the positive.
‘And I think it’s obvious from the anomalies that the majority of models are too sensitive to CO2. ‘
No, that isn’t obvious. All you can reasonably expect is for the observations to lie in the span of the runs of a model, and even then the span underestimates the full uncertainties (e.g. parameter uncertainties, scenario uncertainty etc.). It isn’t as simple as you are suggesting.
drjonbee,
I don’t think this is obvious at all. For starters, the models/obs comparison is pretty good. There are also some issues with how these comparisons are typically done. The observations tend to be sea surface temperatures combined with 2m air temperatures over land. This introduces a bit of a bias. If you correct for this bias (by either trying to compensate for it in the observations, or by extracting equivalent “observations” from the models) the comparison actually gets even better.
I don’t quite follow this, but I get the sense that you still don’t quite get the significance of baselining. If everything is baselined to the same time period, then whether or not some model was typically colder than other should no longer matter.
DM, ATTP,
Yes and yes. I myself posted Gavin’s link (my 1st comment here) …
https://andthentheresphysics.wordpress.com/2019/06/18/kooninisms/#comment-158444
Then I tried to explain what Koonin actually said and pointed to Gavin’s link as a counterpoint.
ATTP: I am only going from Fig 9.8(b). Whichever way you cut it, the models skew negative prior to the reference period. And if the argument is that comparison between the temperature record and the recalibrated (rebaselined?) model predictions is irrelevant, what is the point of the figure?
I am not comparing the models to each other but to the historical record. And that’s the point: the models are not “typically colder” than the historical record. They’re typically colder *before* the reference period and typically hotter *after* the reference period.
So if I take a typical model, adjust the mean so it aligns with the the historical record during the reference period (i.e. I baseline the model to the historical record, as is done in Fig 9.8(b)), then the model’s predictions will typically fall below the historical record for the period prior to the reference period, and exceed the historical record for the period after the reference period. What’s important to note is that we do not have models exhibiting radical departures from this behavior. That is, we don’t have models that varied hot and cold in the pre-reference period and continued to do so post-reference. The models appear to either track the historical record fairly closely (only a few do this) both pre-and-post reference, or they’re too cold pre-reference, and too-hot post reference (by definition of the reference period and the baselining process they all align with the historical record during the reference period).
How can we interpret that? Well, for starters, since the models are initialized in the past (we’re not running them backwards in time), that means they’re initialized too cold. Secondly, the models are likely too sensitive to an input that changes monotonically with time, since the anomalies themselves are approximately monotonically increasing functions of time. The most obvious culprit: they’re too sensitive to CO2.
An easy way to make my point: change the reference period so that the pre-reference period is equal to the current post-reference period in length. So 1890-1920. Redo Fig 9.8b(2) after “re-baselining” to the new reference period. The post-reference anomalies will now all be skewed positive (it will be obvious the models are running hot), while the pre-reference anomalies will be slightly negative.
Now, I know there are probably issues with baselining too far in the past, but I think you get the point.
drjonbee,
Except, as I’ve already pointed out, if you do a like-for-like comparison and update the forcings, then any models/observation discrepancy becomes relatively small. Also, as others have pointed out, the observations are a single realisation of reality. The models essentially consider a range of possible realisations (i.e., the impact of internal variability isn’t the same in all models). Hence one should be cautious of assuming that some discrepancy is the models running too hot. There are indications that internal/natural variability has led to some cooling.
Isn’t this just arguing that Fig 9.8(b) should be ignored?
If the observations are to be ignored because they are a “single realization of reality”, even though they show a *systematic* bias in the models, then we’re not really doing science. Most scientists would be very suspicious of claims that a theoretical simulation should be believed over actual data.
What’s that famous Feynman quote?
> If the observations are to be ignored
They’re not, and Feynman is wrong, e.g.:
https://journals.ametsoc.org/doi/full/10.1175/JTECH-D-16-0121.1
If you see a Yeti in your binoculars, check your binoculars or make sure others can see it too.
drjonbee,
Nobody is suggesting ignoring the observations. The suggestion is simply that you can’t infer that the models are running hot because the model mean doesn’t precisely match the observations. We don’t necessarily expect the model mean to do so. James Annan has quite a nice explanation of this issue. I also wrote a post that touches on this issue.
drjonbee, see https://www.clim-past.net/9/1111/2013/cp-9-1111-2013.pdf
“An easy way to make my point: change the reference period so that the pre-reference period is equal to the current post-reference period in length. So 1890-1920. Redo Fig 9.8b(2) after “re-baselining” to the new reference period. The post-reference anomalies will now all be skewed positive (it will be obvious the models are running hot), while the pre-reference anomalies will be slightly negative.”
It’s already been done. See Eby, et. al. (2013) Figure 2b (Figure 9.8b are single model runs all taken from Eby) baseline is 5-year mean centered on 1900 …
Historical and idealized climate model experiments: an intercomparison of Earth system models of intermediate complexity
https://www.clim-past.net/9/1111/2013/cp-9-1111-2013.html
(my count: out of 12 single model runs, five (or six) higher, two lower, five (or four) about the same)
ATTP,
I am not worried that the model mean doesn’t *precisely* match observations. I don’t expect the models to track unmodeled internal variability, for example. Also, just to be clear, I am not referring to the mean of all the models (labeled EMIC mean in Figure 9.8(b)?). I am referring to the systematic bias shown by the ensemble of models. That can’t be denied. Well, it can, but only by arguing the observations are biased, not the models. I don’t think I am out of line in my belief that most scientists would place the burden of proof on those who claim the observations are biased.
I couldn’t see the relevance of the James Annan post. As far as I can tell, he is addressing the variance of the model predictions, not their bias?
drjonbee, I’m really not seeing this systematic bias that you’re claiming exists. Even without correcting for coverage bias, blending and updating the forcings, the observations lie within the envelope of the models.
Not too shabby … from the paper …
“For the specified external forcings over the 20th century, five models appear to stay mostly within the observational uncertainty envelope for this period, five tend to overestimate the observed trends, and two tend to underestimate the trends (Fig. 2b).”
The paper itself only ever talks about observational uncertainties, not even a squeak of EMIC modeling uncertainties (e.g. no ensemble averaging) AFAIK.
> I’m really not seeing this systematic bias […]
Let’s spell out dr’s argument: unless the observations are right in the middle of the envelope, there is systematic bias.
More on the technique at mt’s.
Everett: I can’t reconcile Fig 2b and 9.8b. The negative bias in 9.8b is much greater. But 2b does show that by baselining on 1960-1990, 9.8b creates a misleading impression of low variance of future predictions of the models. Do we really believe that if the last century was replayed a bunch of times (with the same forcing timeseries) we’d see a similar spread in observations as those shown by the models in 2b? It’s hard to argue that we’ve extracted the signal from the noise, at least insofar as these kind of models are concerned.
ATTP: this:
The observations (solid black) lie solidly in the upper part of the model envelope, with many models always well-below the observations, but none always above.
> none always above
You keep repeating this, dr. AT keeps telling you this may not imply what you make it imply. The relevant bit in James’ post seems to be this one:
http://julesandjames.blogspot.com/2010/01/reliability-of-ipcc-ar4-cmip3-ensemble.html
When theories and observations conflict, sometimes revising our interpretations is the way to go.
Either you bring something else, or that exchange has run its course.
Thank you for your concerns.
Willard: Go back and reread what I wrote. You are wrong.
I’m out for a while. Maybe we can pick it up again at a later stage.
Sure – I think we’ve probably exhausted this round anyway. I appreciate the responses.
> Go back and reread what I wrote
Please beware your wishes, as I already met your objections. Also, you might profit from rereading James’ post, starting with “it’s all based on some analysis methods that are fundamentally flawed.”
Drjonbee wrote “The observations (solid black) lie solidly in the upper part of the model envelope, with many models always well-below the observations, but none always above.”
As I said before, you can only reasonably expect the observations to lie in the span of the model runs, nothing more. If the observations lie in one half of the envelope, that does not imply bias. If you don’t understand why that is the case, you ought to do more reading and less telling people that they are wrong.
“What’s that famous Feynman quote?
It doesn’t matter how beautiful your theory is, it doesn’t matter how smart you are. If it doesn’t agree with experiment, it’s wrong.”
This is assuming that the experiment (or observation) is correct. Happily there is no such thing as proof by Feynman. In this case, the models are consistent with the observations, as all we can reasonably expect is for the observations to lie in the spread of the model runs. I think I may have already mentioned that.
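A toy sketch of that statistical point, with synthetic numbers assumed purely for illustration: if the observations are just one more realization of the same process the ensemble members sample, they will typically lie within the spread but away from the ensemble mean, and can sit in one half of the envelope for long stretches without that implying any bias.

```python
# Toy sketch: a single realization drawn from the same process as an ensemble
# usually lies within the ensemble spread but not on the ensemble mean, and can
# spend long stretches in one half of the envelope with no bias by construction.
import numpy as np

rng = np.random.default_rng(42)
years = np.arange(1900, 2011)
forced = 0.008 * (years - 1900)        # common forced response (toy numbers)

def realization():
    noise = np.zeros(years.size)
    for i in range(1, years.size):     # red-noise "internal variability"
        noise[i] = 0.7 * noise[i - 1] + rng.normal(scale=0.1)
    return forced + noise

ensemble = np.array([realization() for _ in range(20)])
obs = realization()                    # "observations": one more realization

inside = np.mean((obs >= ensemble.min(axis=0)) & (obs <= ensemble.max(axis=0)))
above = np.mean(obs > ensemble.mean(axis=0))
print(f"fraction of years inside the ensemble envelope: {inside:.2f}")
print(f"fraction of years above the ensemble mean:      {above:.2f}")
```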
drjonbee –
You say:
If the observations are to be ignored
Anders says:
Nobody is suggesting ignoring the observations. The suggestion is simply that you can’t infer that the models are running hot because the model mean doesn’t precisely match the observations.
And then you say:
I am not worried that the model mean doesn’t *precisely* match observations.
So I have a few questions. Sorry that they will be uninteresting for those who understand the science (or perhaps too naive to even be answerable) but I’m back and trying to parse the conversation so as to at least get a sense of the logic of the arguments being presented. (Not an easy task – perhaps an impossible task. )
Is it possible to conclude that the observations aren’t sufficient to draw the conclusion that you draw without actually “ignoring” the observations?
Is there a space between Anders’ use of “precisely” and your use of “precisely”? In other words, is it possible that you can’t infer that the models are running hot because of the fact that it doesn’t match the observations, or because of the degree to which it doesn’t match the observations? (The “it” above being the model mean.)
Ev said:
Thanks for the info! I think this may be related to the idea that if you have some characterization of the noise, you can use that info to do a better job of filtering, a la Kalman filtering. I will look around for some explanations on why it works.
I can understand DrJonBee’s point by looking at Figure 9.8b, so I assume at least a few other people could understand the issue.
IMO, it could be related to the quasi-60-year cycle that modulates the temperature time series, and that groups at NASA JPL and Paris Observatory link to the Earth’s rotation variation *. This modulation isn’t fit by any of the models so continues to be a nuisance in making comparisons between the pre-baselining years and now.
* Curry has co-opted a name for this 60-year variation and calls it the “stadium wave”.
> it could be related to the quasi-60-year cycle that modulates the temperature time series
Coup de théâtre!
Drive-by done, Paul.
Poor Feynman.
JCH — Huh?
They actually believe Feynman would be a skeptic. That he would join Koonin on the red team:
In a strictly physical sense, anthropogenic warming is indeed a small perturbation; 1% is not a bad back-of-the-envelope estimate. Anthropogenic radiative forcing is, order of magnitude, 1% of the incoming solar radiation. Warming of 3K is 1% of the 300K temperature of the planet.
The fact that anthropogenic forcing is physically small is why we can successfully consider climate change as a linear problem, at least for some aspects of climate change. For example, linearity lets models get warming right even when they have errors in the mean state, a crucial feature of climate modeling. Further, this kind of back-of-the-envelope estimate, ~1% change in forcing leads to ~1% warming, gives confidence that more detailed warming estimates are in the right ballpark. Physics works and climate change is basic physics in action.
This kind of analysis can’t distinguish between 1K, 3K, and 6K sensitivity, i.e. mild vs. catastrophic warming. Impacts are extremely sensitive to physically small differences in warming. Koonin’s argument is a shell game, use some valid physical reasoning and then draw incorrect conclusions. But the physical reasoning, when not misused, is valid, important, and enlightening.
ProfJ,
Yes, I agree that we can often regard the response to the radiative perturbation as being linear and that a key aspect is that the impacts of this warming are likely to be non-linear. However, I would suggest that by Koonin’s argument we would still expect a radiative perturbation of ~10 W/m^2 to be linear (i.e., small relative to the ~240 W/m^2 solar forcing), but I think we would be entering the regime where we might expect non-linearities to start playing a role (i.e., no longer all that small relative to the 150 W/m^2 radiative effect of the greenhouse effect).
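Putting rough numbers on that comparison (240 W/m^2 and 150 W/m^2 are the approximate values mentioned above for the absorbed solar flux and the total greenhouse effect; this is only a ratio sketch):

```python
# Ratio sketch: a forcing can be "small" relative to the absorbed solar flux
# while no longer being all that small relative to the ~150 W/m^2 greenhouse
# effect. Approximate round numbers only.
solar_absorbed = 240.0     # W/m^2
greenhouse_effect = 150.0  # W/m^2

for forcing in (2.5, 10.0):
    print(f"{forcing:4.1f} W/m^2: {forcing / solar_absorbed:.1%} of absorbed solar, "
          f"{forcing / greenhouse_effect:.1%} of the greenhouse effect")
```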
So…
It looks like I won’t be getting any answers from drjonbee to my questions…
So here’s the thing. I think it’s reasonable to note that modelers fairly often are overconfident about their ability to model reality. As such, it seems reasonable to me to argue in favor of observations as a way to assess risk as we go forward.
But, of course, on the other hand we know that given the long time horizons involved, and the possibility of low probability but high impact risks, and the question of signal versus noise in the fairly limited time we’ve had to observe the complex reaction of the climate to ACO2 emissions, relying on observations to assess risk has some serious shortcomings.
And then on the third hand there is the reality that although it’s reasonable to value observations vs. modeling within some generic decision-making matrix, we all rely on the modeling of situations where we don’t have sufficient observational data frequently in our day to day lives. Doing so is a regular part of life these days, even though we may not always be explicitly aware that’s what we’re doing.
And then on the fourth hand, there seems to be a tendency that those who argue the most in favor of observations vs. modeling for assessing climate risk from ACO2 emissions, regularly, almost uniformly in fact, rely on the modeling of complex phenomena for which we have limited observational data to predict economic disaster from ACO2 emissions mitigation.
So when someone writes “If the observations are to be ignored.” I see a whole lot of important context that has been left out. So I wonder why drjonbee says something like that. But I guess I’ll never know the answer.
Of course, I could chalk it up to an approach to dialog where he compared me to a pig (prolly no pejorative intended, right?) while making a fully confident yet completely incorrect assertion about something I said, and where he misquoted Anders but explained he wasn’t intending to quote him…
But one never knows, do one?
“I think it’s reasonable to note that modelers fairly often are overconfident about their ability to model reality. ”
What is your evidence for that?
You can’t make predictions with observations, only models (even if only implicit).
The main problem here (imho) is that not enough people really understand how climate models work and what the model output means (for instance why we shouldn’t expect the observations to be any closer to the ensemble mean than a randomly chosen model run from the ensemble – i.e. it should lie in the spread of the models*).
* although there are good reasons to think the spread underestimates the true uncertainties.
Joshua said:
This, I think, is DrJonBee’s observation about Fig. 9.8b: why doesn’t the thin red curve highlighted in yellow slide upward? That’s only based on my interpretation of the way he worded his concerns.
Two research papers, two sets of models. (I found the SAI paper via the videos of the climatechange.ai conference held last month.)
1)
Researchers provide new evidence on the reliability of climate modeling
https://phys.org/news/2019-06-evidence-reliability-climate.html
“Historically, climate models have shown a progressive weakening of the Hadley cell in the Northern Hemisphere. Over the past four decades reanalyses, which combine models with observational and satellite data, have shown just the opposite—a strengthening of the Hadley circulation in the Northern Hemisphere. Reanalyses provide the best approximation for the state of the atmosphere for scientists and are widely used to ensure that model simulations are functioning properly.
The difference in trends between models and reanalyses poses a problem that goes far beyond whether the Hadley cell is going to weaken or strengthen; the inconsistency itself is a major concern for scientists. Reanalyses are used to validate the reliability of climate models—if the two disagree, that means that either the models or reanalyses are flawed. To find the cause of this discrepancy, the scientists looked closely at the various processes that affect circulation, determining that latent heating is the cause of the inconsistency. To understand which data was correct—the models or the reanalyses—they had to compare the systems using a purely observational metric, untainted by any model or simulation. In this case, precipitation served as an observational proxy for latent heating since it is equal to the net latent heating in the atmospheric column. This observational data revealed that the artifact, or flaw, is in the reanalyses—confirming that the model projections for the future climate are, in fact, correct.”
2)
Stratospheric Aerosol Injection as a Deep Reinforcement Learning Problem
https://arxiv.org/abs/1905.07366
“As global greenhouse gas emissions continue to rise, the use of stratospheric aerosol injection (SAI), a form of solar geoengineering, is increasingly considered in order to artificially mitigate climate change effects. However, initial research in simulation suggests that naive SAI can have catastrophic regional consequences, which may induce serious geostrategic conflicts.
Wealthy countries would most likely be able to trigger SAI unilaterally, i.e. China, Russia or the US could decide to fix their own climates and disrupt the ITCZ, which influences the monsoon over India, as collateral damage. If geoengineering is ceased before the anthropogenic radiative forcing it sought to compensate for has declined, termination effects with rapid warming would result (Jones, 2013). Understanding both how SAI can be optimized and how to best react to rogue injections is therefore of crucial geostrategic interest (Yu, 2015).
…To our knowledge, this is the first proposed application of deep reinforcement learning to the climate sciences.”
https://www.climatechange.ai/ICML2019_workshop.html
“(for instance why we shouldn’t expect the observations to be any closer to the ensemble mean than a randomly chosen model run from the ensemble – i.e. it should lie in the spread of the models*).”
If true, the problem then is if the ensemble spread is large. For example, if I wanted to know the latest Nino 3.4 SST projections for January, 2020, I could look at this:
Ignoring the ensemble mean, the graphic indicates a value of -1.0C is just as likely as +1.5C. Both are within the spread.
Not very helpful!
Dikran –
What is your evidence for that?
I don’t have what I would call evidence; I have anecdotal observations. I remember, back in the mid-eighties (I think), people like Marvin Minsky making wildly off-target predictions about the ability to model human speech acts. Predictions that still haven’t come true.
My brother is a professor of signal processing….he speaks of over-confidence of modelers.
But obviously, he believes that some models are useful.
But I’m aware that there is a bias in play: I remember when modelers fail and take it for granted when they succeed. It would be interesting to see a scientific approach (evidence) to quantify the question.
You can’t make predictions with observations, only models (even if only implicit).
Yah, that’s another aspect. Basically everything we know, we know because we model (often informally). It’s so easy for people to use modeling to denigrate modeling.
Paul (WHT) –
You might give it a shot to translate your response to my comment into more basic, specific logic. I might then have a chance at understanding.
snape –
If true, the problem then is if the ensemble spread is large.
In what way is that a problem?
Joshua, I will let DrJonBee comment first with regard to my graphical interpretation of what he meant. If I got his intent right then I can try to elaborate further.
Joshua
I used an example to help explain my thinking. Not clear?
Let’s say I wanted to use an ensemble of models to help place a bet on the point spread of a Patriots/Browns game. One model says Patriots by 2 points. Another says Patriots by 40 points. Not helpful. I’d look elsewhere for advice.
If on the other hand, all the models converged around Patriots by 3 touchdowns, then that WOULD be helpful. I’d make the bet.
*****
Actually, though, the idea of a random model run being as good a bet as the ensemble mean seems like nonsense.
Paul (WHT),
Cross reference with the Eby (2013) paper, specifically Table 2.
GENIE release 2-2-7 (GE) from The Open University = GENIE (IPCC 9.8b)
MESMO v1.0 (ME) from the University of Minnesota = MESMO (ditto)
Read off the highest TCR and ECS values from Table 2 (hint GE/ME).
Contrary to what drjonbee sez (@ June 22, 2019 at 4:30 pm “Everett: I can’t reconcile Fig 2b and 9.8b”) I am having absolutely zero problems reconciling Figure 2b/Table 2 with IPCC Figure 9.8b.
The issue is as I originally thought: the EMICs with the higher temperature sensitivity metrics (at least GE/ME) have steeper trend lines in Figure 9.8b (absolute temperature does not appear to have a significant influence in and of itself).
Snape –
I was being indirect. My point is that it is what it is. Again, what makes it a problem?
Is it a problem if I’m not sure how to evaluate the odds of an NFL bet? I think not.
The uncertainties exist. We have to move forward. Seems to me that saying the uncertainty is a problem is like trying to envision a way forward without uncertainty. We can’t. And the existence of uncertainty doesn’t relieve us of the necessity of moving forward. Uncertainty makes finding the best path forward complicated.
But uncertainty is not, imo, a reason to fail to decide which path forward we should take. Not choosing a path forward doesn’t make the uncertainty disappear.
“Again, what makes it a problem?”
Strange question.
What you’re asking is, “why is it a problem if something designed to be helpful is not actually helpful?” Or, “why is it a problem if a tool intended in large part to reduce uncertainty fails to reduce uncertainty?” The tool in this case being an ensemble of model runs.
Sorry about the atrocious punctuation. I never seem to notice errors until after the comment is published.
Ev, thanks. Yes, the red line I highlighted was MESMO which has an ECS=3.7/TCR=2.4 according to Table 2. I guess this would be known to be too high (or steep in slope) in comparison to the observations. Again, I think DrJonBee’s issue is that this steepness is only apparent from the chart by looking at the specific line in years prior to the baseline interval of 1961-1990.
Snape –
Let’s say I wanted to use an ensemble of models to help place a bet on the point spread of a Patriots/Browns game. One model says Patriots by 2 points. Another says Patriots by 40 points. Not helpful. I’d look elsewhere for advice.
You have an “ensemble” of models that comprises 2 models? That’s a strange ensemble.
What you’re asking is, “why is it a problem if something designed to be helpful is not actually helpful?”
My assumption is that model ensembles are helpful.
why is it a problem if a tool intended in large part to reduce uncertainty fails to reduce uncertainty
My assumption is that model ensembles reduce uncertainty.
I never seem to notice errors until after the comment is published.
Yeah. Me too. It’s weird.
Joshua
The models showing Patriots by 2 points and 40 points were meant to represent the extreme outliers, just like -1.0C and +1.5C represent the extreme outliers in the Nino 3.4 ensemble.
Maybe I’m misunderstanding Dikran’s comment?
“(for instance why we shouldn’t expect the observations to be any closer to the ensemble mean than a randomly chosen model run from the ensemble – i.e. it should lie in the spread of the models*).”
My interpretation is that he believes the ensemble mean will give no better guidance than a run chosen at random from the spread, which could include the most extreme outliers. So, with this in mind, let’s go back to my bet. Imagine there’s an ensemble of ten runs giving the game’s final point spread:
Patriots by:
+2
+6
+10
+14
+18
+22
+26
+30
+34
+40
How, if each outcome was equally likely, would this help me with my wager? How would it reduce my uncertainty?
If the ensemble mean was considered the best bet, statistically superior to a random run (which seems logical), then that would be different. The ensemble would indeed be helpful. I’d bet on the Patriots by ~3 touchdowns.
Point spread in the football game not to be confused with spread of the model runs. 😏
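For what it’s worth, here is a minimal sketch of the arithmetic in Snape’s hypothetical ensemble (the point-spread numbers are his; the code is purely illustrative):

```python
# Snape's hypothetical ten-run ensemble of Patriots point spreads.
spreads = [2, 6, 10, 14, 18, 22, 26, 30, 34, 40]

ensemble_mean = sum(spreads) / len(spreads)     # 20.2 points, i.e. roughly three touchdowns
ensemble_range = (min(spreads), max(spreads))   # (2, 40): the spread the ensemble is reporting

print(ensemble_mean, ensemble_range)
```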
Joshua wrote “But I’m aware that there is a bias in play.”
Indeed, your biased view of Minsky demonstrates that. He also published a book called “Perceptrons” (with Seymour Papert) that killed off a lot of research on neural networks for a decade or so by pointing out the things they couldn’t do, and the lack of prospect for fixing the problem. He was wrong about that as well, but in the other direction.
My experience of modellers is different – they are no different to other scientists, where a mixture of self-belief and self-skepticism is required to succeed.
Snape said “My interpretation is that he believes the ensemble mean will give no better guidance than a run chosen at random from the spread, which could include the most extreme outliers. ”
No, that is not correct. If the models were perfect, then model runs and the observations would be statistically exchangeable – so the observations wouldn’t be closer to the mean than a randomly chosen model run. However, that doesn’t mean that model runs are evenly distributed across the spread – they are more likely to be in the center than in the wings.
Another way to look at it is that the ensemble mean is an estimate of the forced response of the climate system (as the averaging cancels the internal variability between runs). The observations are a combination of the forced response and a realisation of chaotic unforced internal variability. How close you can expect the observations to be to the ensemble mean depends on how great an effect you think internal variability can have – one estimate of which is the spread on the model runs.
Many caveats apply, of course, but the basic point is not to focus only on the ensemble mean but to pay attention to the spread, as it too is important.
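A toy numerical illustration of Dikran’s point (this is not a climate model; the trend and the size of the “internal variability” are made-up numbers, purely to show the exchangeability idea):

```python
import numpy as np

rng = np.random.default_rng(0)
n_runs, n_years = 30, 50
forced = 0.02 * np.arange(n_years)                        # a common forced trend (made up)
runs = forced + rng.normal(0, 0.15, (n_runs, n_years))    # each run = forced response + its own "internal variability"
obs  = forced + rng.normal(0, 0.15, n_years)              # "observations": same forced response, a different realisation

ens_mean = runs.mean(axis=0)                              # averaging largely cancels the internal variability
d_obs    = np.abs(obs - ens_mean).mean()                  # typical distance of the observations from the ensemble mean
d_member = np.abs(runs - ens_mean).mean()                 # typical distance of a randomly chosen run from the mean

print(d_obs, d_member)   # comparable: the observations behave like just another member of the ensemble
```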
I haven’t really been following this discussion, but what I was suggesting is that the ensemble mean largely smooths out variability and mostly would be regarded as representing the forced response. The real world is a combination of a forced response plus variability, so wouldn’t be expected to match the ensemble mean. Of course, if it were to stay on one side of it for a long time, we might then conclude that some of the models that make up the ensemble are not realistic representations of reality and would reject them, or modify them. The point, though, is that we shouldn’t simply use the comparison of the observed temperatures and the ensemble mean to make simplistic claims about models running too hot. I don’t think we’re yet at a position where we can make such a claim (especially, if you account for coverage bias and blending).
Easterling and Wehner (2009) show that both the models and observations occasionally have a period of a decade or two with little or no trend, so the observations may have to lie in one half of the model envelope for quite some time before it is anything really unusual. Especially if the forcing scenario (which is not just CO2) was a bit off (which is why Gavin Schmidt’s CMIP5 comparison also has a version that corrects for the forcing). Of course, if you are looking to claim that the models are running hot, you can just ignore those irritating complications ;o)
DM, ATTP
Thanks, I think I get the idea.
I imagine an ensemble dedicated to predicting how many times “heads” will appear given a hundred coin tosses. If the ensemble mean is 50, the modellers have done a good job. If we were to observe a hundred coin tosses, though, heads might appear just 44 times. At which point the dimmest of skeptics might say, “look, the models are running hot. Not to be trusted!”
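A quick simulation of Snape’s coin-toss ensemble (assuming a fair coin and 100 tosses per “run”, as in the analogy):

```python
import numpy as np

rng = np.random.default_rng(1)
ensemble = rng.binomial(100, 0.5, size=10_000)   # 10,000 "model runs" of 100 fair tosses

print(ensemble.mean())                  # ensemble mean: ~50 heads
print(np.percentile(ensemble, [5, 95])) # spread: roughly 42 to 58 heads
print((ensemble <= 44).mean())          # ~14% of runs give 44 heads or fewer, so 44 is unremarkable
```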
Snape,
Yes, that sounds like a reasonable analogy.
Snape –
Thanks, that analogy helps me understand (I think).
Regarding your revised ensemble, the way it’s perfectly symmetrical seems highly contrived to me. And you are choosing to look at the ensemble by first deciding what the outcomes of the models will be – rather than designing the models, running them, and then looking at their results as an ensemble.
I have another question, which I think may have been essentially answered by Dikran.
In your model ensemble, the mean (20) is closer to a larger number of model results than, say, the model that returned 40. In such a case, would you say that each of the model runs was just as likely as the mean?
And btw, since I hate the Patriots I’d never bet on them regardless.
“How, if each outcome was equally likely, would this help me with my wager? How would it reduce my uncertainty?”
The purpose of a model ensemble is to capture the uncertainty, so that it can be properly propagated through in the cost/benefit analysis used to determine the course of action. There is more to decision making under uncertainty than merely reducing the uncertainty and/or neglecting it.
Snape – yes, something like that – but ISTR Victor Venema wrote a very good blog article on why the model spread doesn’t represent all of the uncertainties, which is well worth reading.
Snape –
That’s what I was struggling to convey.
Flipping coins and betting on sports point spreads is soooo 20th-century thinking. AI and machine learning are the future!
https://www.theverge.com/2019/6/25/18744034/ai-artificial-intelligence-ml-climate-change-fight-tackle
I would also point out that the winner of last year’s Sub-Seasonal Climate Forecast contest was a team using AI that correctly predicted precipitation patterns for western N. America a year in advance.
https://www.whoi.edu/oceanus/feature/a-rainfall-forecast-worth-its-salt/
Jack
From your link, “The authors of the paper — which include DeepMind CEO Demis Hassabis….”
DeepMind created AlphaZero, an AI chess program I mentioned a while back. Really impressive!
https://en.wikipedia.org/wiki/AlphaZero
****
@DM, Joshua
“The purpose of a model ensemble is to capture the uncertainty”
That’s helpful, thanks.
Speaking of climate models,
“The Best Estimate Yet of the Impact of Global Warming on the Pacific Northwest”
https://cliffmass.blogspot.com/2019/06/the-best-estimate-yet-of-impact-of.html?m=1
Snape,
I find it moderately interesting that Cliff Mass spends a good deal of his time critiquing supposed alarmism and now runs some of his own models and highlights that there will be substantial warming in that region if nothing is done about greenhouse gas emissions.
I find it moderately interesting that Cliff Mass can imagine lots of reasons RCP8.5 is pessimistic (including fusion!!), but omits to note any possibilities in the other direction.
I’m also personally very sceptical of the ability of climate models to produce meaningful regional results with the downscaling methodology proposed. I strongly suspect there are many other potential outcomes which are quite different.
Snape, let’s take your coin-toss analogy. Assign a value of 1 to heads and 2 to tails. You want to know if your coin is fair, with an average score over a large number of tosses being 1.5. The ensemble mean after 100 tosses of your coin (let’s call it a specific climate model) would be (close to) 1.5. Expecting the single realisation of the Earth that we’re living on to match the ensemble mean is like expecting the perfect coin to score 1.5, not 1 or 2.
That is, expecting the perfect coin to score 1.5 in a single toss – neither heads nor tails.
I wonder, Snape, if you’re thinking about the spread of realisations as if it were an uncertainty envelope – error bars, if you like, due to uncertainty in the value of some physical constant or other parameter, as if we’d input a range for each parameter into a Monte Carlo simulator and sampled the outcome. It’s not. Within a model, it’s natural variability due to chaotic, cyclic and other processes which exist outside of (but can interact with) the forcings. Like simulating weather, but at a coarser scale.
Uncertainties in physical properties and their parameterisation are the realm of inter-model comparisons, one ensemble mean to another.
ATTP
Yeah, Cliff Mass is hard to figure. His criticisms are usually on target, but almost always aimed at progressive targets. The bias can be infuriating. He’s not a denier, though, and his blogs are a wealth of information, so myself and others similarly minded have learned to put up with it.
Dave Geologist
I don’t think your analogy fits. You’re giving me two options, and saying I need to pick one or the other.
Think of 3 doors, and behind one is a new car. Doesn’t matter which one I pick, right? I obviously wouldn’t choose the middle door just because of its central location.
That’s the wrong way to look at it. Instead, imagine each door is where a model run is forecasting a hurricane to make landfall. Here we’re trying to guesstimate a general location, where being close is nearly as good as being perfect. A whole different ball game.
With this in mind, let’s say two of the doors, A and B, are just a mile apart, while the third door, C, is a hundred miles away. Whose homes do you think are in greater danger from the hurricane, those located near A and B, or those located near C?
Then again…….
If out of 20 model runs, 10 are forecasting a hurricane to make landfall in South Florida, the other 10 in North Florida, then the ensemble mean would be directed at Central Florida, where NONE of the runs think the hurricane will strike. This matches Dave G’s analogy.
Ok, one more thought. Thinking out loud and not too sure of myself.
If I wanted to make the best forecast as to where a hurricane might make landfall (a rough sketch of this idea follows the list):
a) create a cone of uncertainty around each individual model run. Probably quite large.
b) base the prediction on where the cones most overlap.
c) this will usually, but not always be the mean.
d) possible in some situations for the amount of overlapping to be the same across the whole ensemble spread, in which case the guess might as well be random.
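As flagged above, here is a rough sketch of that cone-overlap idea, with entirely made-up landfall positions split into two clusters like Snape’s North/South Florida example (the cone width and positions are arbitrary; this is only meant to show how the overlap maximum can differ from the plain mean):

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical landfall positions (km along a coastline) from 20 model runs,
# split into two clusters as in the North/South Florida example above.
landfalls = np.concatenate([rng.normal(0, 15, 12), rng.normal(300, 15, 8)])

coast = np.linspace(-100, 400, 1001)
cone_width = 50.0   # (a) a wide Gaussian "cone of uncertainty" around each run
density = sum(np.exp(-0.5 * ((coast - x) / cone_width) ** 2) for x in landfalls)

print(coast[np.argmax(density)])   # (b)/(c) best guess: where the cones overlap most (near the larger cluster)
print(landfalls.mean())            # the plain ensemble mean falls between the clusters (~120 km), where no run hits
```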
> Think of 3 doors, and behind one is a new car.
I want goats.
Willard,
my hometown hired a lonely goatherd and about 80 goats in hopes of taking care of the blackberry bushes that had become overgrown in one of our parks. After two weeks and considerable expense, they had cleared what a city worker could have done in half a day using a mower. Maybe they didn’t like all the thorns?
******
In more recent news, our legislature passed a bill to limit carbon. It had enough Dem. votes to pass in the Senate, but with a loophole – two Republicans had to be present. They all skipped town.
Governor Kate Brown ordered state troopers to go out and find them, and bring back two, in handcuffs if necessary, so the bill could pass. The Senators fled the state (out of the troopers’ jurisdiction) and are rumored to be fishing in Idaho.
All models are wrong, Snape, but some models are useful. For some purposes.
The ensemble of simulations within one model is useful for two purposes. One, to represent the range of year-to-year variation we can expect. People do care about how hot a hot year will be and how cool a cool year will be, not just the 30-year average. Two, to generate an ensemble mean which represents the long-term trend given a particular set of forcings. We don’t just care about how much hotter a hot year will be than a cold year, but also about how much hotter the average year will be in fifty years’ time. GCMs serve both those purposes. Having a range of models and comparing them in CMIP identifies and, to some degree, quantifies the structural uncertainties in the models, which exist independent of the chaotic or cyclical variation year-on-year.
Expecting a GCM to predict the landfall of a particular hurricane is like expecting a 40-ton truck to win the Indy 500. For that you need a racing car, or in the hurricane case, a weather forecast model covering a smaller area with much higher resolution and more sophisticated water-air interaction, augmented by satellite and aerial observations.
Pertinent to the original question, my coin analogy is precisely analogous. The Earth is one iteration of the real climate, with one El Nino and AMOC history, and its own gazillion butterfly wings. It can no more be expected to match the ensemble mean than a coin can be half heads and half tails, or a dice throw a 3.5.
> my coin analogy is precisely analogous
I think it’s basically the same thing, except perhaps that most climate events don’t follow a normal distribution, and that the number of sides on the dice is growing:
https://initforthegold.blogspot.com/2010/08/when-it-rains-it-pours.html
There’s also the fact that the numbers are less easy to recognize.
I have a slightly longer-winded version of the dice analogy, Willard (what, me, long-winded? 😉). If we assume that an extreme heatwave scores 12 – warmer than any normal summer – we used to be playing with two dice with sides labelled 1 to 6. We’d expect such a heatwave once every 36 years: say once a generation, or three per century.
One of the dice is now labelled 2 to 7. A 12 has now moved to once every 18 years, and we have an event once per generation that no human has previously experienced. We’ve only increased the maximum score by one in twelve, or 8%, and yet the frequency of extreme heatwaves has tripled.
Even if we meet the Paris target, by 2100 both dice will be labelled 2 to 7. Our chance of throwing 12 is now one in twelve, about once per decade. A 13 is now once per couple of decades and an unprecedented 14 once per generation. We’ll be seeing previously unprecedented 13s or 14s with three times the frequency we used to see 12s. And what we used to consider extreme will happen one year in six, or a couple of times a decade. Once-per-generation events will be the new normal (in the sense that you can evacuate, rebuild, whatever once per generation; but if it’s happening every few years, it has to become part of your base lifestyle). And yet we’ve only increased the maximum score by one in six, or 17%.
A ten percent plus-or-minus increase in the baseline increases the frequency of rare events several times over. And because the impact of generational and unprecedented events is likely to increase non-linearly, the damage could easily increase by an order of magnitude or more from that baseline shift. That’s how statements like “90% of the heat in this heatwave was due to natural variability and only 10% to AGW” can be consistent with “AGW made heatwaves of this magnitude three times more likely”
And of course it’s actually worse than that because of the fat tail of the extreme event distribution. 1°C can make a particular extreme event 3 or 4 times more likely.
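Dave’s dice arithmetic is easy to check by brute force (the relabelled faces are exactly as he describes; the “years between” figures are just 36 divided by the number of qualifying rolls):

```python
from itertools import product

def years_between(total, die_a, die_b):
    """Average number of years between summers scoring exactly `total` on the two dice."""
    outcomes = list(product(die_a, die_b))
    hits = sum(a + b == total for a, b in outcomes)
    return len(outcomes) / hits

standard = range(1, 7)   # faces 1-6
shifted  = range(2, 8)   # faces 2-7

print(years_between(12, standard, standard))  # 36.0 - the old climate
print(years_between(12, shifted, standard))   # 18.0 - one die relabelled
print(years_between(12, shifted, shifted))    # 12.0 - both relabelled (the Paris-target case)
print(years_between(13, shifted, shifted))    # 18.0 - previously impossible
print(years_between(14, shifted, shifted))    # 36.0 - unprecedented
```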
Just in case I was being obscure (me, obscure?), by “I want goats” I was referring to the Monty Hall problem:
https://en.wikipedia.org/wiki/Monty_Hall_problem
Only simulations convinced me.
Willard
The Monty Hall question was my inspiration for the three-door analogy, so no, you weren’t being obscure. My favorite nuance… if the host were to forget which door to open, but by chance still opened a door with a goat behind it, then your initial pick has a 50/50 chance of being the winner. Might as well stay.
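Since Willard says only simulations convinced him, here is a minimal Monte Carlo sketch covering both the standard game and Snape’s forgetful-host variant (everything here is illustrative; the trial count is arbitrary):

```python
import random

def switching_win_rate(trials=100_000, host_knows=True):
    """Fraction of completed games won by always switching."""
    wins = plays = 0
    for _ in range(trials):
        car, pick = random.randrange(3), random.randrange(3)
        if host_knows:
            opened = random.choice([d for d in range(3) if d not in (pick, car)])
        else:
            opened = random.choice([d for d in range(3) if d != pick])
            if opened == car:      # forgetful host accidentally revealed the car: game void
                continue
        plays += 1
        final = next(d for d in range(3) if d not in (pick, opened))   # switch doors
        wins += (final == car)
    return wins / plays

print(switching_win_rate(host_knows=True))   # ~0.67: switching wins 2/3 of the time
print(switching_win_rate(host_knows=False))  # ~0.50: if the host forgot but happened to show a goat
```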
DaveGeologist
I was using the hurricane landfall forecast in a generic sense, and didn’t make any claims about the skill of ECM’s.
“Even among single-model ensembles, a simple average of the individual model forecasts (the ensemble mean) often produces a more accurate forecast than any single model forecast because errors associated with the individual forecasts tend to be cancelled out.”
http://www.hurricanescience.org/science/forecast/models/modeltypes/ensemble/
Notice the word “often” is used rather than “always”. I was interested in what some of the exceptions might be in a GENERIC ensemble. When would an individual run, or group of runs, be as good a bet or even better than the mean? Hence the South Florida/North Florida oddity and the “two doors close to each other and one far away”.
Also, from what I’ve read, some of the models used to predict hurricane tracks, intensity, and so on, are ECM’s tailored for the short term. Where the focus is on initial conditions rather than longer term variables like ENSO and changes in CO2.
Sorry, GCM’s not ECM’s.
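The “errors tend to cancel” claim in the hurricanescience quote is easy to see in an idealised setting; the catch, and the reason for the “often” rather than “always”, is the assumption made here that the individual forecast errors are independent and unbiased (a toy sketch, not a real forecast system):

```python
import numpy as np

rng = np.random.default_rng(4)
truth = 0.0
n_models, n_trials = 10, 10_000

# Each model forecast = truth + its own independent, unbiased error.
forecasts = truth + rng.normal(0, 1.0, (n_trials, n_models))

err_single = np.abs(forecasts[:, 0] - truth).mean()          # typical error of one model
err_mean   = np.abs(forecasts.mean(axis=1) - truth).mean()   # typical error of the ensemble mean

print(err_single, err_mean)   # the mean is ~sqrt(10) times more accurate here;
                              # shared biases or correlated errors would not cancel like this
```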
Yes Snape, that is another difference between weather models and climate models. Weather models tend to be initialised to current conditions, as reported by ships, weather stations, satellites etc. For example, if a hurricane can only form with SSTs above a certain value, and you already know that applies today, there’s no point starting with a climate realisation that has SSTs below the threshold value. Climate models are initialised long before the time interval of interest, and run until chaotic and random behaviour has eliminated all memory of the initial conditions. That’s a feature, not a bug. We don’t know the initial conditions across the world, back when the only measurements were in England and Holland.
A hurricane can either hit or not hit. So the mean, half a hurricane, one-tenth of a hurricane or whatever, is not a useful forecast. A one in ten, one in fifty or whatever chance of a hit in a given year is a useful forecast when it comes to planning coastal defences. But not when it comes to evacuating a city.
“Only simulations convinced me.”
I think it is easier to understand the Monty Hall problem by switching it to the limit case of N doors, where N >> 3.
Eventually – after having it pointed out, in several variations – I saw that the 2/3 chance of the prize being behind one of the other two doors persists after one (of the two) that doesn’t have it is revealed.
I must still be missing something with respect to climate modelling – whilst I see that the real world will follow its own track of ups and downs, pauses and accelerations, and that means it won’t (except by chance) match an ensemble mean, I still have some expectation its variability will be within the same range, centred around that mean. Which I would have supposed to align with where changes to the overall energy balance will put it. But I don’t know enough to hold that expectation so firmly that I won’t accept that others might know better than me and that I could be wrong.
“I still have some expectation its variability will be within the same range, centred around that mean. Which I would have supposed to align with where changes to the overall energy balance will put it.”
This is where the paleorecord gives insight, right? If we have an idea of the range of natural variability for say the 5000 years prior to the industrial revolution, then we have an idea of how far off climate models could potentially be from observations.
“Eventually – after having it pointed out, in several variations – I saw that the 2/3 chance of the prize being behind one of the other two doors persists after one (of the two) that doesn’t have it is revealed.”
But as mentioned above, the 2/3 wouldn’t persist if the host had forgotten where the car was. Still drives me nuts even after taking Dr. Pukite’s advice.
5000 years is too long Snape. We have forcing changes, perhaps including Early Anthropocene ones, and vegetational changes. It’s more appropriate to take a shorter range. Although, of course, for the simple yes-no answer “are the last few decades unprecedented in the last 5000 years, well outside natural variability”, we have dozens of hockey sticks which unanimously say “yes”.
The current realisation of Planet Earth falls well within the range of model iterations, so the models are fine. In fact, I tend to think the models have too much unforced variability in them, because I’d have expected the real world to pop outside the P5 to P95 range once a decade or so. But that may be a function of autocorrelation in the real world and the models. IOW if, by chance, we’re on a real Earth which is ticking along at the P70, say, it may take a long time for it to get out to the P95. IIRC the correlation length in the various re-runs of Mann et al. (the ones that used real-world data, not the Mc & Mac effort with a correlation length several times that found in nature) was more than a year and less than a decade. So perhaps I should only expect an excursion once or twice per century.
I don’t understand how the range of natural variation informs on how far out climate models could be from observations. If you put in stupid values or parameters, models could be arbitrarily far out. Witness the various numerological models which fail miserably at forecasting, however “good” their hindcasts. And you could have zero variation, perfect stability, but models were out because of faulty input parameters, or numerical instabilities in the solvers.
“5000 years is too long Snape.”
You’re probably right. I threw that number out without much thought.
“I don’t understand how the range of natural variation informs on how far out climate models could be from observations.”
What I meant, but didn’t make clear, was that even if an ensemble mean happened to be spot on in its projection of forcing, observations will likely differ as a result of natural variation. So if 4C warming is projected by 2100 based on the forcing and feedbacks from a particular emissions scenario, it would be useful to know the range of natural variation that’s been observed over 80 year intervals in the “recent” past.
If, say, the largest natural fluctuation in any 80-year period during the last 1000 years was only 1C, that would be an important piece of information for scientists to have 80 years from now, in the year 2100, if they are comparing observations to what was projected back in 2020. If the observed warming was, for example, only +2C, it would follow that the models were at the very least a degree too high.
Snape – without a climate model, how can you distinguish between natural variation and externally forced (from natural forcing) change in the observations. If you can’t do it without a model, why not just use the model spread?
“… that would be an important piece of information for scientists to have 80 years from now, in the year 2100,”
Aren’t we optimistic.
Might someone be willing to look at this, written by the YouTuber ‘Frolly1000’ (sporting a PhD these days), aka Dr. Holmes? A short, reality-based critique of his claims would be much appreciated:
“Thermal Enhancement on Planetary Bodies and the Relevance of the Molar Mass Version of the Ideal Gas Law to the Null Hypothesis of Climate Change”
April 2018 – Robert Holmes – Federation University Australia
https://www.researchgate.net/publication/324599511_Thermal_Enhancement_on_Planetary_Bodies_and_the_Relevance_of_the_Molar_Mass_Version_of_the_Ideal_Gas_Law_to_the_Null_Hypothesis_of_Climate_Change
“Presented here is a simple and reliable method of accurately calculating the average near surface atmospheric temperature on all planetary bodies which possess a surface atmospheric pressure of over 0.69kPa, by the use of the molar mass version of the ideal gas law. This method requires a gas constant and the near-surface averages of only three gas parameters; the atmospheric pressure, the atmospheric density and the mean molar mass.
The accuracy of this method proves that all information on the effective plus the residual near-surface atmospheric temperature on planetary bodies with thick atmospheres, is automatically ‘baked-in’ to the three mentioned gas parameters. …
… A new null hypothesis of global warming or climate change is therefore proposed and argued for; one which does not include any anomalous or net warming from greenhouse gases in the tropospheric atmospheres of any planetary body.”
citizenschallenge,
These are joke arguments. The ideal gas law has three degrees of freedom – pressure, temperature, and density. So whatever the actual atmospheric science predicts by way of GHG physics, the three parameters will re-adjust so as not to violate the ideal gas law. So if mainly temperature and density are involved in the calculations, the pressure will re-adjust.
I think that is all there is to it, yet clovvns such as Holmes and Nikolov use it to bait the rubes.
Had a very brief look. If the method requires the pressure, the density and the molar mass, then it is not surprising that you can use these to work out the temperature: from the gas law, PV = nRT, the density and molar mass give you n/V, and the surface pressure is set by the mass of the atmosphere, so the temperature follows directly. If you warm up an atmosphere, the surface-level pressure doesn’t change (as it is fixed by the mass of the atmosphere); instead the scale height of the atmosphere increases and its density drops.
That the ideal gas law is a reasonable approximation to the behaviour of the actual atmosphere doesn’t sound too surprising.
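A back-of-envelope version of why the calculation is circular, using rough sea-level values (these are standard textbook numbers I’ve assumed, not values taken from the paper):

```python
# Molar-mass form of the ideal gas law: T = P * M / (rho * R)
R   = 8.314      # J / (mol K), gas constant
P   = 101_325    # Pa, mean sea-level pressure
rho = 1.225      # kg/m^3, sea-level air density
M   = 0.02897    # kg/mol, mean molar mass of dry air

T = P * M / (rho * R)
print(T)   # ~288 K: the gas law simply returns the temperature already "baked in" to the density
```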
Dave said:
The term autocorrelation as used casually makes it sound like it’s a nuisance effect. It’s actually the basis for understanding much of physics. Just a pet peeve of mine.
Thank you, that was helpful.
“Snape – without a climate model, how can you distinguish between natural variation and externally forced (from natural forcing) change in the observations.”
I watched the solar eclipse of 2017. There was an abrupt drop in temperature. I didn’t need a climate model to tell me it was the result of external forcing (as opposed to internal variation).
Snape, why evade a simple direct question in that way? Seems rather similar to what happened last time we tried to discuss climate. I don’t think I’ll bother from now on, life is too short.
Dikran
You misjudged me on both occasions. I’ve never tried to evade a question about science, and freely admit it when I realize I’m wrong about something.
To your comment yesterday, I didn’t fully understand what you were asking, so it didn’t seem at all simple or straightforward. My response was the result of a guess, “is this what he’s saying? Probably not, but I’m sure I’ll find out”.
****
* I asked my wife to put time restrictions on my internet use….almost up for the day in case you change your mind about not interacting with me.
Yawn. I wasn’t born yesterday.
The reason why I asked is that the model spread is itself an estimate of the effects of internal variability, and hence an estimate of how “wrong” the model mean is likely to be. Next, if we want to estimate natural variability from observations, for which we may not even have good estimates of the forcing, how do we differentiate between temperature variations due to changes in the forcings or internal variability? This is at the heart of a lot of the “it’s climate cycles” arguments, and essentially it can’t be done without a model and an estimate of the forcings.
So it was a simple and straightforward question, the problem for you would be that you can’t answer it without admitting that your suggested approach is no better than the one climate modellers already use.
If you really didn’t understand the question you could always have asked for clarification, rather than an unhelpful one-line flippant irrelevancy.
RealClimate.org organizes their discussion threads by classifying them as Unforced Variations (science discussion) or Forced Responses (policy discussion). Yet one of the recent RC posts by Haustein is called “Unforced Variations versus Forced Responses”, which has nothing to do with the science/policy distinction and everything to do with attribution. I personally think that Unforced Variations shouldn’t exist since there is unlikely to be any significant climate (not weather) behavior that arises spontaneously. In other words, everything is forced at the climate level, either external to the earth or via internal mechanisms such as CO2 or volcanic activity.
PP, are you saying that in the absence of external forcing there would be no variability in ocean circulation? I don’t think that is true.
It’s essentially the argument for a limiting continuum. In the extreme limit of no forcing, the earth would need to stop rotating and the sun wouldn’t exist. So along the asymptotic continuum, with less forcing there would be less circulation and the variability of circulation would decrease along with it. The confounding factor in explicating the variability is that diurnal, annual, lunar cycles all play into this but can’t be easily deconvolved — so the thinking is that at least some of that might be spontaneous. I just don’t think it is spontaneous until the known forcing factors (i.e. the null hypotheses) are all ruled out.
PP, I think we (and the IPCC) have different definitions of forcing:
“Radiative forcing is a measure of the influence a factor has in altering the balance of incoming and outgoing energy in the Earth-atmosphere system and is an index of the importance of the factor as a potential climate change mechanism. In this report radiative forcing values are for changes relative to preindustrial conditions defined at 1750 and are expressed in Watts per square meter (W/m2).”
Under this definition you can certainly have zero forcing and still have a Sun and a rotating Earth.
This is also a matter of continuous attribution (what we see is a combination of both things), so I don’t think null hypotheses are a good approach.
So, to re-phrase, with constant external forcing (your apparent definition) would there be no variation in ocean circulation?
I’m not quite sure the context of the discussion, but Dikran is certainly correct that trying to distinguish the externally-forced response from the response due to internal variability requires some kind of model.
DM said:
“So, to re-phrase, with constant external forcing (your apparent definition) would there be no variation in ocean circulation?”
A bucket of water is under continuous external forcing, from both gravity and the spinning of the earth, yet it does not spontaneously start sloshing or form eddy currents. But scaling that to the size of the earth, because of the Coriolis effect it will circulate to some degree. So the question is then what will cause variations in circulation and mixing. According to Munk & Wunsch (1998):
“Without deep mixing, the ocean would turn, within a few thousand years, into a stagnant pool of cold salty water with equilibrium maintained locally by near-surface mixing and with very weak convectively driven surface-intensified circulation. …. The winds and tides are the only possible source of mechanical energy to drive the interior mixing. “
So the chicken&egg question is what sets the winds in motion, since tides have an obvious source. This to me is a deep causal issue, since the atmosphere has common-mode forcing with the ocean, i.e. winds are caused by pressure differentials, and significant pressure differentials are set by we all know what.
An upper latitude lake will completely overturn each year due to the nodal solar cycle, so this is a clear variation in water circulation driven by a stationary external forcing.
I am trying to understand the input parameters to a comprehensive GCM. Including these features does seem to be a rather recent development:
https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2012JC008170
That is not actually answering the question. Note it is a question that can be answered “yes” or “no”. I make a point of trying to ask questions of that type as they are the best way of making positions clear; however maybe that is a bug rather than a feature.
Dikran
“the model spread is itself an estimate of the effects of internal variability, and hence an estimate of how “wrong” the model mean is likely to be.”
So the idea I wrote about upthread, of needing to use past records to get a sense of how much the model mean could be off due to natural variation, was based on a misunderstanding. You’re right, the ensemble spread serves the same purpose.
****
“Yawn. I wasn’t born yesterday.”
You’re not as good a mind reader as you seem to think.
I said: “A bucket of water is under continuous external forcing, from both gravity and the spinning of the earth, yet it does not spontaneously start sloshing”
Except when a quake hits, like today in SoCal
Snape “You’re not as good a mind reader as you seem to think.”
Yawn. Sorry, I’ve seen this sort of behaviour too often. You evaded a reasonable question, gave a flippant answer and are now trying to evade acknowledging your actions. It isn’t even as if this were the first thread where you have done this, is it?
PP sorry, I’m not interested in evasive answers to straight questions. If you don’t want me to understand your position, fine.
Time to move on, everyone.
Pingback: 2019: A year in review | …and Then There's Physics
Pingback: Did a physicist become a climate truth teller? | …and Then There's Physics