Science communication

I’ve been away at a meeting for a couple of days and have been too busy to write any posts, or even think about what to write. Like Eli, however, I’m also still somewhat mystified by the consensus messaging wars. I had contemplated writing a post about how one thing that people who agree about a complex topic seem to argue about is how to communicate the information, and what is and is not appropriate.

However, I came across this (H/T Andy Skuce) which pretty much says it all, so I’ll simply stop here and probably have an early night.

Credit: SMBC Comics.

TBH, I don’t really like consensus messaging either

I might have to give Dan Kahan some credit. Even though I’m not convinced that consensus messaging is toxic and polarising in general, there are certainly circumstances in which it can be, as I discovered – again – on Twitter yesterday. There appear to be some physical scientists who object quite strongly to its use and, to be quite honest, I have some sympathy with their views; I don’t really like it either.

I wish we lived in a world in which what was obvious to those working in a field, was immediately obvious to everyone else. I wish we lived in a world where all you had to do was explain science clearly and carefully, and everyone would understand it, accept it, and recognise its significance. I wish we lived in a world in which scientists who engaged publicly would always include the caveats to their chosen scientific position, put it into the overall context, and explain how their scientific views were regarded by their peers. I wish we lived in a world in which the media did its best to avoid false balance and aimed to highlight which scientific views were accepted by most and which were disputed by many.

However, we don’t live in such a world. What’s obvious to those who work in a field is almost certainly not obvious to those outside it. Given that an understanding of the consensus (or lack thereof) is an important part of the scientific process, it would seem that an important part of communicating science is also communicating the level of agreement about the general scientific position. At a fundamental level, this is all that consensus messaging is about: there is a strong consensus about the basics of anthropogenic global warming (AGW).

Of course there might be secondary effects, and part of the research into consensus messaging is to study these secondary effects. If people accept that there is a consensus do they then accept that we’re warming and, if so, that it’s mostly us? If they accept this do they then accept the need for action and understand the need for climate policy? The latter issue appears to be one reason why some dislike consensus messaging; they seem to see it as inherently political. However, I think there are nuances here. We all do research and publicise it in order to inform. If our research is policy relevant then clearly it could be used by those who have political agendas. There’s nothing fundamentally wrong with this; informing the public, policy makers, and those who might influence policy is important.

As researchers, our obligation is simply to be open, honest, and transparent about our research, to explain it clearly and thoroughly, and to make it as difficult as possible for it to be mis-used. However, we can’t prevent people from using it to advance their policy preferences, and we are not responsible if people misrepresent it in order to do so. So, I think that those who criticise consensus messaging should be careful to distinguish between the research itself – which aims to illustrate the level of consensus and to understand the effectiveness of consensus messaging – and how it is used publicly; especially as many of the critics do research that is also policy relevant. They might also want to consider that criticising its public use is itself inherently political, whether they like to admit this or not.

So, yes, I don’t really like consensus messaging either. I wish we could simply focus on communicating the science itself; as a scientist, I do think this is crucial and is what we should be prioritising. However, I don’t see how we can do so effectively if people don’t at least understand the level of consensus with respect to the basics. I certainly regard consensus messaging as an attempt to make communicating the science easier, not as something that replaces good science communication; it’s intended to be complementary. I think it’s unfortunate that some seem to object so strongly to its use, especially as some seem comfortable saying things that they would regard as highly insulting if aimed at their own research area.

Consensus messaging – again

I’ve written before about my views with respect to consensus messaging. It seems to be a topic that divides opinion, but there was a recent paper suggesting that

perceived scientific agreement is an important gateway belief, ultimately influencing public responses to climate change.

Dan Kahan, however, has apparently found a serious problem with this paper, and has pronounced as much.

There are indications that Dan Kahan may well have found a genuine issue with this paper. However, he has been rather critical of this work, and of the authors, for quite some time. In one of his recent posts, though, it took me a while to – I think – convince Dan that he was misrepresenting what the authors had asked in their survey. It’s maybe good that he might now have found a genuine issue; if you look hard enough…

As far as I can tell, Dan Kahan particularly dislikes consensus messaging. The reason seems to be that it is polarising and toxic, and aims to make certain groups seem stupid. Consequently, he seems to regard it as ineffective and, quite possibly, as doing more harm than good. On the other hand, if trying to make others seem stupid is an ineffective way to convince them of your position, then – based on my brief interactions with Dan – he clearly has no interest in convincing me of his.

I don’t really know if consensus messaging is effective, or not, but there are others – such as Lawrence Hamilton – who seem to indicate that it might be effective in certain circumstances. I also might take Dan Kahan’s criticism of consensus messaging more seriously if it didn’t appear as though he was desperately keen to find problems with studies that do indicate an effect, and if he wasn’t so openly insulting of those who undertake this work. I’d also take it more seriously if those who latched onto his claims weren’t also those who seem to most dispute the consensus position with respect to AGW.

The latter point is one reason I’m not convinced that consensus messaging is so ineffective. If it is so ineffective, why do those who seem to most dispute the consensus position seem so gleeful whenever anyone criticises consensus messaging? Surely, if it is ineffective, those who don’t want the public to accept the consensus would want it to continue being used? Okay, that’s maybe not a very good argument, but it does seem strange.

My own position is really quite simple, as might be expected of a physicist who doesn’t claim to understand what might, or might not, work.

• If one considers relevant experts, or the relevant literature, there is clearly a strong consensus with respect to anthropogenic global warming (AGW). In other words, the existence of a consensus is simply true.
• If one is going to undertake some kind of messaging, it should be based on something that is true.
• Arguing against a messaging strategy that is based on something true seems – to me – a little odd. Maybe those who do so should do so in a constructive manner, presenting a viable alternative that is also based on a truth?
• If a messaging strategy based on a truth is indeed polarising and toxic, maybe this is – in itself – interesting; why does promoting a truth end up being divisive?
• How do we get people to accept the reality of AGW if we shouldn’t highlight the level of agreement amongst relevant experts and within the literature?
• If we do downplay, or ignore, the existence of a consensus, won’t that allow minority views, that are not regarded as credible, to be given more credence than is warranted?

To be clear, I don’t know if consensus messaging is effective or not, but people I respect argue that there is evidence for its effectiveness. Also, I’m certainly not suggesting that consensus messaging should be the be-all and end-all of how we communicate this topic; my own preference would be to try and explain the science more thoroughly, but that’s apparently failed too. My basic issue is with the idea that we should avoid a strategy that is based on something that is true.

Posted in ClimateBall, Comedy, Global warming | 171 Comments

Thousands of exoplanets!

I was quoted in the newspaper today. One problem with talking to journalists is that you don’t always know quite how they’re going to represent what you said, or even if you’re going to end up having said something silly; you don’t get much warning and you typically don’t get a chance to proofread what they end up writing. This article, however, seems fine; I’m not sure if I actually said exactly what I’m quoted as saying, but it’s pretty close to something I would have said.

The article itself is about the recent announcement, by NASA, of 1284 new exoplanets. Just in case anyone doesn’t know, an exoplanet is a planet in orbit around a star other than the Sun. These new exoplanets were discovered by NASA’s Kepler satellite, which uses the transit method. The transit method basically works by staring at as many stars as possible (150,000 in the case of Kepler) and looking for those that show periodic dips in brightness, which would indicate something passing in front of the star. The relative dip in brightness can then be used to infer the radius of this object, and the period can be used to infer its distance from the star.
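To make these two inferences concrete, here is a minimal sketch in Python. The formulae are the idealised ones (a real analysis would account for limb darkening, impact parameter, and so on), and the constants are standard values:

```python
import math

R_SUN_KM = 696_000.0   # solar radius in km
R_EARTH_KM = 6_371.0   # Earth radius in km

def planet_radius(depth, stellar_radius_rsun=1.0):
    """Transit depth ~ (Rp/Rs)^2, so Rp = Rs * sqrt(depth).
    Returns the planet radius in Earth radii."""
    rp_km = stellar_radius_rsun * R_SUN_KM * math.sqrt(depth)
    return rp_km / R_EARTH_KM

def orbital_distance(period_days, stellar_mass_msun=1.0):
    """Kepler's third law in solar units: a^3 = P^2 * M
    (a in AU, P in years, M in solar masses)."""
    p_years = period_days / 365.25
    return (p_years ** 2 * stellar_mass_msun) ** (1.0 / 3.0)

# An Earth-sized planet transiting a Sun-like star dims it by only ~0.008%,
# which illustrates why the photometry has to be so precise
depth_earth = (R_EARTH_KM / R_SUN_KM) ** 2
```

The tiny depth for an Earth analogue is one reason a space telescope like Kepler was needed in the first place.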

One problem with this method is that there can be lots of false positives; many things that aren’t planets can cause what appear to be periodic dips in a star’s brightness. However, the Kepler data is so exquisite that many of these false positives can be ruled out. That’s what’s happened here. These new exoplanets were amongst the many candidates detected a few years ago. The analysis now indicates that these 1284 candidates are almost certainly exoplanets, and hence they have been announced as such.

This gives me an opportunity to discuss some of my own research. As the article says, I’m part of the HARPS-N consortium. Although the transit method has been extremely successful, it essentially only allows one to determine the radius of the planet and its distance from the star. If there are multiple planets, one can sometimes infer the planet masses from the timing of the transits, but this doesn’t work for all systems. However, in a planetary system, the star and planets all orbit the common centre of mass. This means that at some times the star will be coming towards us, and at other times away.

Telescopio Nazionale Galileo

HARPS-N is a high-resolution spectrograph, partly built in Edinburgh and located on the 3.6m Telescopio Nazionale Galileo. It measures small shifts in the star’s spectrum, which can then be used – via the Doppler effect – to determine the star’s radial (line-of-sight) velocity:

$\dfrac{\Delta \lambda}{\lambda} = \dfrac{v_r}{c},$

where $\lambda$ is the rest-frame wavelength of a specific spectral line, $\Delta \lambda$ is the shift in this wavelength, $v_r$ is the radial velocity of the star, and $c$ is the speed of light. From these small shifts in the spectral lines, you can determine the radial velocity of the star. If the radial velocity of the star shows periodic features, then one can infer that it must have companions (planets) and one can use this to infer the mass of these companions, their distance from their host star, and the eccentricity (or the circularity) of their orbits.
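As a quick numerical illustration of this formula (the wavelength and velocity below are just example values):

```python
C = 299_792_458.0   # speed of light, m/s

def wavelength_shift(v_r, rest_wavelength_nm):
    """Doppler shift of a line: delta_lambda = lambda * v_r / c."""
    return rest_wavelength_nm * v_r / C

def radial_velocity(delta_lambda_nm, rest_wavelength_nm):
    """Invert the same relation: v_r = c * delta_lambda / lambda."""
    return C * delta_lambda_nm / rest_wavelength_nm

# A 1 m/s stellar reflex motion shifts a 550 nm line by less than 2e-6 nm,
# far smaller than the width of the line itself
shift = wavelength_shift(1.0, 550.0)
```

The minute size of that shift is the point of the discussion that follows.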

Credit: Motalebi et al. (2015)

The figure on the left shows the radial velocity curves for three rocky planets in a four-planet system that we discovered last year. I should be clear that the radial velocity is that of the star, and that each curve has had the contribution from all the other planets in the system removed. The top two curves are quite sinusoidal, indicating that those two planets are on roughly circular orbits. The asymmetry in the bottom curve indicates that that planet’s orbit is somewhat eccentric.

Okay, this post is getting rather long, but we’re getting to the point I wanted to highlight. If you look at the figure on the left, you’ll note that the radial velocity amplitudes are a few m/s. The spectral resolution of HARPS-N is $R = 115000$, which is the inverse of the smallest relative wavelength change that can be measured by the instrument:

$R = \dfrac{\lambda}{\Delta \lambda}.$

If you look at the formula for the Doppler shift that I included above, you can relate this to the spectral resolution through

$R = \dfrac{\lambda}{\Delta \lambda} = \dfrac{c}{v_r} \Rightarrow v_r = \dfrac{c}{R}.$

If HARPS-N has $R = 115000$ and $c = 3 \times 10^8 m s^{-1}$, then $v_r = c/R \approx 2609 m s^{-1}$. Hmmmm, if this is the smallest radial velocity that we can measure, how can we have measured radial velocities of only a few m/s? The reason is that we measure across a wavelength range (383 nm – 690 nm) in which there are lots and lots and lots of spectral lines, and then cross-correlate with the known spectrum of the type of star we’re observing. The peak in the cross-correlation function then gives the wavelength shift, from which we can determine the radial velocity of the star. You then need to repeat this a number of times (maybe 30 to 60) over the course of a year or so to produce the radial velocity curve, from which you can determine if there is a companion planet and – if there is – the properties of that planet.
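The cross-correlation trick can be illustrated with a toy example. This is not the HARPS-N pipeline – just a synthetic spectrum with a handful of Gaussian absorption lines on an assumed wavelength grid – but it shows how the peak of the cross-correlation function recovers a velocity shift far smaller than the single-line resolution limit:

```python
import numpy as np

C = 3.0e8        # speed of light, m/s
R = 115_000      # HARPS-N spectral resolution
v_resolution = C / R   # single-line limit, ~2609 m/s

# Toy spectrum: a few Gaussian absorption lines on a fine wavelength grid
# (real spectra contain thousands of lines over 383-690 nm)
wave = np.linspace(500.0, 501.0, 4001)                    # nm
rest_lines = np.array([500.10, 500.35, 500.60, 500.82])   # nm

def spectrum(v):
    """Spectrum with every line Doppler-shifted by radial velocity v (m/s)."""
    centres = rest_lines * (1.0 + v / C)
    dips = np.exp(-((wave[:, None] - centres) ** 2) / (2 * 0.01 ** 2))
    return 1.0 - 0.5 * dips.sum(axis=1)

observed = spectrum(30.0)   # a 30 m/s shift: tiny compared with c/R

# Cross-correlate shifted templates with the observation; the CCF peak
# recovers the sub-resolution velocity
trial_velocities = np.linspace(-100.0, 100.0, 401)        # m/s, 0.5 m/s steps
ccf = np.array([np.dot(spectrum(v), observed) for v in trial_velocities])
v_measured = trial_velocities[np.argmax(ccf)]
```

Because every line contributes to the correlation, the peak can be located to a precision far below the per-line resolution limit; with thousands of real lines and high signal-to-noise, this gets down to the few-m/s level quoted above.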

So, even though we can’t directly determine the shift in individual lines, we can still determine the wavelength shift and – hence – the radial velocity of the host star. Given that we can’t actually see the shifts directly, how can we be confident that what we’ve measured really is indicative of a companion planet? One way is that different teams observe the same system and get the same result. Another is that some of the systems we observe are Kepler targets that are already known to probably host planets; the radial velocity results for those systems are consistent with what is already known from the transit measurements. Finally – and this applies to the four-planet system I mentioned earlier – some of the planets detected via the radial velocity measurements are then found to also transit their host stars. Again, the results are consistent.

Mass-radius relations for exoplanets (Motalebi et al. 2015)

Maybe I’ll finish by pointing out another reason why combining radial velocity and transit measurements can be so powerful. The radial velocity measurements give the mass of the planet, while the transit measurement gives its radius. Together they give the density, from which one can infer the internal composition. The figure on the right shows the mass-radius relation for a number of known exoplanets, including Kepler-78b (K78b), which our team characterised a few years ago and which is still the most similar – in composition and size – to the Earth, and HD219134b, one of those shown in the radial velocity figure above.
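The density step is simple arithmetic (the composition inference itself requires interior models); a minimal sketch, with standard values for the Earth’s mass and radius:

```python
import math

M_EARTH_KG = 5.972e24   # Earth mass
R_EARTH_M = 6.371e6     # Earth radius

def bulk_density(mass_mearth, radius_rearth):
    """Bulk density in g/cm^3, from mass in Earth masses and
    radius in Earth radii."""
    mass = mass_mearth * M_EARTH_KG
    radius = radius_rearth * R_EARTH_M
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    return mass / volume / 1000.0   # kg/m^3 -> g/cm^3

# Sanity check: the Earth itself comes out at ~5.5 g/cm^3
rho_earth = bulk_density(1.0, 1.0)
```

A measured exoplanet mass and radius plugged into the same function immediately shows whether the planet sits near the rocky, Earth-like composition curve in the mass-radius figure.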

What’s clear is that there are a number of known exoplanets with compositions that appear very similar to the Earth’s. However, these are all planets that are very close to their parent stars and, therefore, almost certainly far too hot to host life. To date, we do not know of any genuine Earth-like exoplanets, in terms of composition, size, and distance from a star similar to the Sun. This is one reason why I think we have to be careful when talking to journalists about this topic. It’s easy to make them think that we’ve found something Earth-like and, hence, habitable, when really it is simply a rocky planet with a composition similar to that of the Earth, but almost certainly too hot to harbour life. For the moment, I would be very cautious about accepting any claims of having found a habitable, or even potentially habitable, planet. In 10–20 years’ time, though…

Posted in Personal, Science | 43 Comments

ECS ~1K?????

There is a new paper that is being somewhat uncritically accepted at the new Climate “Skepticism” site (and, yes, the inverted commas are necessary) and at Bishop Hill. It’s by someone called J. Ray Bates and claims to estimate climate sensitivity using two-zone energy balance models. The paper essentially concludes that the methods used

give low and tightly-constrained EfCS values, in the neighbourhood of 1°C

which is slightly bizarre.

Fortunately, Andrew Dessler has already provided a rebuttal to an earlier version of the paper. Essentially, the new paper is based very heavily on Lindzen & Choi (2011), which has itself been heavily criticised. Lindzen & Choi (2011) used sea surface temperatures and satellite measurements of the TOA flux to try and determine the feedback response. They essentially concluded that the non-Planck feedbacks were zero, or negative. However, they only considered the tropics (20°S–20°N) and then assumed that the non-Planck feedbacks everywhere else were also zero.

The new paper tries to present a slightly more sophisticated model in which it considers two zones but, given what it assumes for the non-Planck feedbacks, it’s no surprise that it returns an effective climate sensitivity (EfCS) of about 1°C; that’s close to the value of the non-feedback climate sensitivity. It’s fairly straightforward to illustrate why this result doesn’t make sense. Even though it is a two-zone model, it is still fundamentally an energy balance approach. You can write the basic energy balance formalism as

$N(t) = F(t) - \lambda dT + N(0),$

where $N$ is the planetary energy imbalance, $F$ is the change in anthropogenic forcing, $dT$ is the change in temperature, and $\lambda$ is the feedback response, which can be used to give the EfCS through

$EfCS = \dfrac{3.7}{\lambda}.$

Today $N(t) \sim 0.7 Wm^{-2}$, $F(t) \sim 2.3 Wm^{-2}$, and $dT \sim 1 K$. The main unknown is the initial planetary energy imbalance, but I’ve seen values between 0.08 and 0.15 $Wm^{-2}$, so let’s use 0.1 $Wm^{-2}$. Putting these numbers in gives

$\lambda = - \dfrac{0.7 - 0.1 - 2.3}{1} = 1.7 W m^{-2} K^{-1},$

which then gives an EfCS of about 2K. Of course, these numbers are ballpark figures and there are uncertainties that should also be considered, but it is still hard to justify an EfCS close to 1K. Essentially, we’ve already warmed by almost 1K, have yet to double atmospheric CO2, and still have a planetary energy imbalance that is probably greater than 0.5 $Wm^{-2}$. How, then, can the EfCS be 1K? I’ve emailed the author to ask him this question; I’ve yet to get a response.
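The arithmetic above is simple enough to check directly. A minimal sketch, using the same ballpark numbers as in the text:

```python
F_2XCO2 = 3.7   # forcing from doubling CO2, W m^-2

def feedback_parameter(N, F, dT, N0):
    """Rearrange the energy balance N = F - lambda*dT + N0
    for the feedback parameter lambda (W m^-2 K^-1)."""
    return (F + N0 - N) / dT

# Ballpark present-day values from the text
lam = feedback_parameter(N=0.7, F=2.3, dT=1.0, N0=0.1)
efcs = F_2XCO2 / lam   # effective climate sensitivity, K
```

Varying the inputs across their plausible ranges moves the answer around, but it takes rather implausible choices to push the EfCS anywhere near 1K.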

The uncertainty on the mean

I wrote a quick post about Gavin Schmidt’s post comparing models to the satellite datasets. I thought Gavin’s post was very good, and explained the various issues really well. Steve McIntyre, however, is claiming that Schmidt’s histogram doesn’t refute Christy. This isn’t a surprise and isn’t what I was planning on discussing.

What I found more interesting was his criticism of a post Gavin wrote in 2007 (he also seems not to have got over the delayed publication of a comment in 2005). Nic Lewis also seems to think that Gavin’s 2007 post was wrong. So, I thought I’d have a look.
In discussing a paper by Douglass, Christy, Pearson & Singer, Gavin says

the formula given defines the uncertainty on the estimate of the mean – i.e. how well we know what the average trend really is. But it only takes a moment to realise why that is irrelevant. Imagine there were 1000’s of simulations drawn from the same distribution, then our estimate of the mean trend would get sharper and sharper as N increased. However, the chances that any one realisation would be within those error bars, would become smaller and smaller.

In other words, in comparing the models and observations, Douglass et al. assumed that the uncertainty in the model trends was the uncertainty in the mean of those trends, not the uncertainty (or standard deviation) of the trends themselves. This seems obviously wrong – as Gavin says – but Steve McIntyre and Nic Lewis appear to disagree.

The key point, though, is that we only have one realisation of the real world, which is very unlikely to match the mean of all possible realisations. With enough model realisations, however, we could produce a very accurate estimate of the mean model trend. Given, however, that the observations are very unlikely to produce a trend that matches the mean of all possible real trends, the model mean is very unlikely to match the observed trend, even if the model is a good representation of reality.

Gavin Cawley – who is also mentioned in Steve McIntyre’s post – discusses it in more detail here, saying:

It is worth noting that the statistical test used in Douglass et al. (2008) is obviously inappropriate as a perfect climate model is almost guaranteed to fail it! This is because the uncertainty is measured by the standard error of the mean, rather than the standard deviation, which falls to zero as the number of models in the ensemble goes to infinity. If we could visit parallel universes, we could construct a perfect climate model by observing the climate on those parallel Earths with identical forcings and climate physics, but which differed only in variations in initial conditions. We could perfectly characterise the remaining uncertainty by using an infinite ensemble of these parallel Earths (showing the range of outcomes that are consistent with the forcings). Clearly as the actual Earth is statistically interchangeable with any of the parallel Earths, there is no reason to expect the climate on the actual Earth to be any closer to the ensemble mean than any randomly selected parallel Earth. However, as the Douglass et al test requires the observations to lie within +/- 2 standard errors of the mean, the perfect ensemble will fail the test unless the observations exactly match the ensemble mean as the standard error is zero (because it is an infinite ensemble). Had we used +/- twice the standard deviation, on the other hand, the perfect model would be very likely to pass the test. Having a test that becomes more and more difficult to pass as the size of the ensemble grows is clearly unreasonable. The spread of the ensemble is essentially an indication of the outcomes that are consistent with the forcings, given our ignorance of the initial conditions and our best understanding of the physics. Adding members to the ensemble does not reduce this uncertainty, but it does help to characterise it.

I should probably clarify something, though. If you want to characterise the uncertainty in the model mean, then of course you would want to use the uncertainty in the mean. However, if you want to compare models and observations, you can’t use this as the uncertainty, if the observed trend is not the mean of all possible observed trends.

To do such a comparison, the standard deviation of the trends would seem more appropriate. However, even this may not be quite right because, as Victor points out, the model spread is not the uncertainty. Typically, what is presented is the 95% model spread (5% of the models would fall outside this range at any time if the distribution were Gaussian). However, to take into account other possible uncertainties, this is typically presented as a likely range (66%), rather than as an extremely likely range (95%). Of course, if the assumed forcings turn out to be correct, and the model is regarded as a good representation of reality, then the model spread will start to approximate the actual uncertainty.
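Gavin Cawley’s parallel-Earths argument is easy to demonstrate numerically. In the sketch below, a “perfect model” ensemble is drawn from exactly the same distribution as the single observed realisation; as the ensemble grows, the observation fails a two-standard-error test while comfortably passing a two-standard-deviation test. The trend and variability values are purely illustrative:

```python
import random
import statistics

random.seed(0)
TRUE_TREND, INTERNAL_SD = 0.2, 0.1   # illustrative values

# The single real-world realisation: a perfectly ordinary draw,
# 1.5 standard deviations away from the forced trend
observation = TRUE_TREND + 1.5 * INTERNAL_SD

results = {}
for n in (10, 100, 10_000):
    # A 'perfect model' ensemble: drawn from the same distribution as reality
    ensemble = [random.gauss(TRUE_TREND, INTERNAL_SD) for _ in range(n)]
    mean = statistics.fmean(ensemble)
    sd = statistics.stdev(ensemble)
    se = sd / n ** 0.5                   # standard error of the mean
    results[n] = {
        "within_2_se": abs(observation - mean) < 2 * se,  # Douglass et al. style
        "within_2_sd": abs(observation - mean) < 2 * sd,
    }
# As n grows the standard error shrinks towards zero, so the perfect model
# 'fails' the standard-error test even though the observation is entirely typical
```

This is precisely the point about a test that becomes harder to pass as the ensemble grows: the standard error measures how well we know the mean, not where any single realisation should fall.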

As usual, maybe I’m missing something, but it seems that the criticism of Gavin’s 2007 post is not correct, in the sense that Gavin is quite right to point out that using the uncertainty in the mean, when comparing models and observations, is wrong. This seems like another example of people with relevant expertise making a technical criticism of what someone else has said, without really considering the details of what is being done, and doing so in a way that makes it hard for non-experts to recognise the problem with their criticism. Feature, not bug?

Update: It seems that some are arguing that the end of my post is wrong because a paper by Santer et al. did indeed use the uncertainty on the mean when comparing models and observations. However, Santer et al. also included the uncertainty in the observed trend in their comparison, which makes it more reasonable than what was done in Douglass et al., who did not include the uncertainty in the observed trend (this post is about Gavin’s comment on Douglass et al.). Having said that, I’m still not convinced that the Santer et al. test is sufficient to establish if models are biased, given that the distribution of the observed trends (i.e., trend plus uncertainty in trend) will not necessarily be equivalent to the distribution of all possible trends for the system being observed.

I realise that there might even be more confusion. Apart from when I quoted Gavin Cawley’s comment, the Gavin I’m mentioning is Gavin Schmidt.

Models versus satellites

A graph that is fairly commonly promoted to – apparently – illustrate that models and observations have diverged is one produced by John Christy, which compares models to satellite/balloon data for the troposphere. Ignoring all the potential issues with this graph, one problem is simply that it has never really been explained fully; until now, that is. Gavin Schmidt has just completed a post on RealClimate called Comparing models to the satellite datasets.

Credit: Gavin Schmidt

I don’t really need to say much more, because you should really read Gavin’s post. What I did want to say is that it is a masterclass in how to present and discuss scientific evidence. It steps through all the different choices one can make when comparing model results and observations. I did, however, want to highlight the figure on the right. Essentially, you need to decide how to align your different model runs: do you normalise them with respect to a single year, an average over a number of years, or with respect to the trends? The figure on the right shows what happens if your baseline is 1 year (1979), 4 years (1979-1983), or 10 years (1979-1988), or if you force the trend lines to all pass through the same point (the x-axis in 1979).

As you can see, there is quite a difference, both in terms of the apparent mean trend and the 95% spread. Gavin’s argument (which makes sense to me) is that if you want to emphasise the forced trend, you should average over a reasonable time interval so as to smooth out – as much as possible – the impact of internal variability. That would mean using the 1979-1988 baseline in the figure (pink). I’ll leave you to guess what John Christy chose for his graph. I’ll also stop there and encourage you to read Gavin’s post thoroughly.
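The effect of the baseline choice is easy to reproduce with synthetic data. The sketch below (illustrative trend and variability values, not real model output) shows how a single-year baseline injects the reference year’s internal variability into every run, while a longer baseline damps that arbitrary offset:

```python
import numpy as np

rng = np.random.default_rng(1)
years = np.arange(1979, 2016)
# 50 synthetic 'model runs': a common forced trend plus independent
# internal variability (values purely illustrative)
runs = 0.02 * (years - 1979) + rng.normal(0.0, 0.1, (50, years.size))

def baseline(runs, start, end):
    """Re-reference each run to its own mean over [start, end]."""
    mask = (years >= start) & (years <= end)
    return runs - runs[:, mask].mean(axis=1, keepdims=True)

# Spread across runs in the final year, for two baseline choices
spread_1yr = baseline(runs, 1979, 1979)[:, -1].std()
spread_10yr = baseline(runs, 1979, 1988)[:, -1].std()
# The 10-year baseline gives a narrower spread: averaging over a decade damps
# the offset contributed by internal variability in the reference period
```

This is the same qualitative behaviour as in Gavin’s figure: both the apparent spread and, with real runs, the apparent offset between models and observations depend on how you choose to align the curves.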