
Credit: xkcd.
So, how did they do this? They take proxy data (5 sites plus a multi-proxy for the Northern Hemisphere) and use spectral analysis to determine a set of sinusoidal variations that fit this proxy data. The output from this spectral analysis is then fed into an artificial neural network (a form of machine learning), which is used to project the warming for the period 1880-2000 for the Northern Hemisphere and at the individual proxy sites. They find that the observed warming, since the mid-1800s, can mostly be explained as being a consequence of these natural fluctuations. The residual is then used to estimate the ECS, which they suggest is around 0.6°C.
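To make the mechanics concrete, here is a minimal sketch (my own illustration in Python, with made-up data and arbitrary parameters, not the authors' code) of the kind of pipeline described above: decompose a proxy record into its dominant sinusoids, then simply extend those same sinusoids beyond the calibration period.

```python
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1000, 1831)                      # "calibration" period
proxy = (0.3 * np.sin(2 * np.pi * years / 210.0)   # synthetic stand-in for a proxy record
         + 0.2 * np.sin(2 * np.pi * years / 65.0)
         + 0.1 * rng.standard_normal(years.size))

# "Spectral analysis" step: pick the strongest Fourier components of the calibration data.
spec = np.fft.rfft(proxy - proxy.mean())
freqs = np.fft.rfftfreq(years.size, d=1.0)          # cycles per year
power = np.abs(spec) ** 2
top = np.argsort(power[1:])[::-1][:3] + 1           # indices of the 3 largest non-DC peaks

# "Projection" step: extend exactly those sinusoids beyond the calibration period.
future = np.arange(1831, 2001)
projection = np.full(future.size, proxy.mean())
for k in top:
    amp = 2 * np.abs(spec[k]) / years.size
    phase = np.angle(spec[k])
    projection += amp * np.cos(2 * np.pi * freqs[k] * (future - years[0]) + phase)

print(projection[:5])  # by construction, this can only ever repeat the fitted cycles
```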
Well, this is simply nonsense. It’s essentially just a complicated curve-fitting exercise. The average temperature of the Earth is largely constrained by energy balance. This, of course, does not mean that it can’t vary, but we do mostly understand what can cause these variations. There are internal/natural cycles that can produce variations, but there are limits as to how large these internally-driven cycles can be and how long they can last. On timescales much longer than a decade or so, we would expect these to be small, otherwise it would indicate that our climate is much more sensitive to perturbations than we expect (exactly the opposite of what this paper suggests).
Long-term (multi-decade) changes in our climate are mostly a consequence of external perturbations: volcanoes, the Sun, emission of greenhouse gases, changes in ice sheets (typically a consequence of variations in our orbit). These are all rather complex processes and the idea that one could predict how they will change in future by fitting some sine curves to a few different temperature proxy records is rather ridiculous.
This highlights the key problem with the approach in this paper; you can’t try and understand what causes our climate to vary, or how it might vary in future, using machine learning alone. Even though our climate is complex, it is still a physical system and we do understand the underlying physical processes quite well. You do need to take this into account. The idea that (as the paper suggests)
[a]n alternative approach, as demonstrated here, does not require a prior understanding of the physical processes, but adequate data and appropriate machine learning techniques
is ridiculous. If you don’t consider the underlying physics, then you essentially know nothing about what’s causing the climate to change/vary.
That’s not to say that machine learning can’t play a role. However, if you are going to use something like machine learning to make predictions about the future, you do need to be pretty confident that the data that you use to train the machine learning algorithm presents a reasonable representation of the system you’re trying to model. This requires some actual understanding of the system being considered. If you change it by, for example, pumping lots of greenhouse gases into the atmosphere, then the training data will almost certainly not be appropriate.
Ultimately, if your naive approach – that completely ignores physics – produces a result that is inconsistent with our understanding of the physical system (suggesting, for example, that it’s almost all natural and that the ECS is about 0.6°C), then it’s much, much more likely that the machine learning algorithm is producing nonsense, than there being something wrong with what is essentially fairly basic physics.
Zeke’s just posted this tweet, which highlights a pretty fundamental problem (which I had missed). The NH proxy reconstruction in the paper only goes up by about 0.5°C since the late 1800s, while the instrumental record shows that it has warmed by more than 1°C. Pretty sure the machine learning algorithm would not return this.
Sounds like WUWT quickly has to change its mind on Machine Learning. Earlier this week it was still the work of the devil. https://archive.fo/nsYap
“Most information we receive about the real world is through electronic media, especially the internet. Internet information processing is increasingly influenced by artificial intelligence. Google, Facebook, Twitter, and Microsoft are open and boastful about their use of AI. …
These hybrid AIs interact in ways that have not been anticipated by their creators and might be not appreciated even now. Together, they form a single hybrid AI entity, which will be referenced here as the San Francisco AI. …
This paper shows that the San Francisco AI exists and explains how it operates and distorts the public’s perception of reality into an extremely leftward and anti-American(1) direction.“
Will wannabe geoengineers rush to quarry the SRM proxy data dumped on the globe by yesterday’s eclipse ?
http://vvattsupwiththat.blogspot.com/2017/08/none-dare-call-it-geoengineering.html
Gavin Schmidt has just pointed out that this journal (GeoResJ) is being discontinued.
Given what they seem willing to publish, this seems a sensible decision.
Deniers absolutely love curve fitting. See variations on the theme by Akasofu: https://www.skepticalscience.com/lessons-from-past-climate-predictions-akasofu.html
Easterbrook: https://www.skepticalscience.com/lessons-from-past-climate-predictions-don-easterbrook.html
Loehle & Scafetta had one of the best examples: https://www.skepticalscience.com/loehle-and-scafetta-play-spencers-curve-fitting-game.html
But considering the denial mindset, it makes sense. Climate changed naturally in the past. What caused it? Some sort of natural cycle. What caused those cycles? Erm….Nature? So just try to fit a cyclical pattern to the data – no physical justification needed, because ‘Nature!’ Then claim you’ve explained the warming to date, and that we’ll shift to a cooling phase of those natural cycles any day now. Yep, any day now, just you wait and see…
https://www.skepticalscience.com/comparing-global-temperature-predictions.html
Dana,
It’s basically an “anything but us” argument, but without any attempt to try and understand what this other thing could be.
Seems to be paywalled, but there is Marohasy’s blog:
http://jennifermarohasy.com/2017/08/recent-warming-natural/
It turns out that their figure was just a reprint of Moberg et al 2005 (though this was only discovered via the citation of their citation), which is a NH Land/Ocean reconstruction up through 1979. My figure was a tad inaccurate since I assumed it was land-only; here is what it should actually look like: https://twitter.com/hausfath/status/900130781615374336
“Curve-fitting” is (typically) a misnomer in my experience. Most climate papers that appear to fall under this heading are basically econometric or statistical exercises, all of which make assumptions about the underlying DGP (linearity, i.i.d. disturbance terms, etc.). It’s not that there’s no underlying model at play; quite the opposite. We’re making even stronger — albeit often unstated — assumptions about the relationships in the data. But this (again, typically) gets glossed over because of how easy it is to run a regression.
Having said that, machine learning really is an exception to the rule. These are “black box” in the truest sense of the words. (I say this as someone who is deeply excited about ML and finding increasing use for it in my own work.) The end paragraph of this post is exactly right: If your ML output appears to contravene known laws and established evidence, best to rethink its application.
FWIW, climate science isn’t alone in seeing junk science dressed up as ML sophistry: https://twitter.com/grant_mcdermott/status/894289094187941888
Machine learning covers lots of applications.
I would just LOVE to see the peer-review comments…
The single most important line in your piece ” you do need to be pretty confident that the data that you use to train the machine learning algorithm presents a reasonable representation of the system you’re trying to model”
“You need to know that the ML algorithm will build a model that almost perfectly predicts your training set. Garbage in; garbage out.”
@ Harry Twinotter, something like this…
Reviewer 1. Excelent paper! Accept as is.
Reviewer 2. Suggest to accept as is. Excelent paper!
Reviewer 3. This excelent paper may be accepted as is.
“you can’t try and understand what causes our climate to vary, or how it might vary in future, using machine learning alone”
” It’s essentially just a complicated curve-fitting exercise. ”
You are happy however to use machine learning to make Climate models?
And when the models, with the understood causes, vary, the problem is with our understanding of the causes, correct?
At least with models only half is a complicated curve fitting exercise.
The unpredictability of the known past would require major complicated curve fitting..
In addition to the fact that this is nonsense curve fitting, isn’t the “machine learning” moniker also a misnomer?
Doesn’t machine learning involve a feedback process where …
1. Inputs fed into models predict outcome
2. Outcomes are measured and tested and compared to predictions
3. Model is adjusted based on learnings
4. Go to 1.
This is not what these authors are doing surely; and if it was, I wouldn’t call it machine learning. ‘Model skill’ development, maybe. But they’d need to know a lot about the scientific basis of the phenomena to make those adjustments; random tweaking of the FORTRAN won’t hack it guys.
As Steve Easterbrook has commented about climate scientists who develop the software used in climate models (check out his interesting lecture based on research he led on this), they don’t think of themselves as software engineers, but as scientists who happen to use software as a tool in their job; along with knowing about fluid dynamics, thermodynamics, statistics and other stuff.
Grant,
Quite possibly. I was referring, though, to the fitting of a set of sine curves to each proxy dataset.
There’s quite a lot of people starting to use Machine Learning in Astronomy, so there are certainly some exciting possibilities. This paper doesn’t demonstrate one of them, though 🙂
angech,
No, Machine Learning specifically refers to a field of study that gives “computers the ability to learn without being explicitly programmed”.
Machine learning doesn’t refer to simply using computers.
Richard,
Indeed, I’m not sure that they really followed what I understood to be machine learning. They had 6 proxy datasets from which they determined sets of sine curves (with different amplitudes, phases and periods). They then fed these into an artificial neural network to both reproduce the period over which they have the proxy data and project forward. I didn’t see where they actually trained, tested, and then adjusted their machine learning process. It sounds more like they simply projected the sinusoidal functions forward in time.
Zeke,
Thanks for the correction.
I’ve just read the first paragraph, and I suspect my morning (as a machine learning researcher) is likely to have been spoiled 😦 I’ll have a cup of coffee and read the paper, but alarm bells are ringing already. The hierarchy of reliability is
physics > statistics (which includes machine learning) >= chimps pulling numbers from a bucket.
Note the second operator is a >= rather than a > . If you have a statistical model that suggests physics that has been well understood for over a century is wrong, then chances are your model is wrong, not the physics.
Nothing wrong with a big pile of linear algebra though! ;o)
Dikran,
Maybe you can answer the question as to whether or not they actually used machine learning. It looks much more like they simply decomposed the proxy data into a set of sine waves (with different amplitude, phase, and period) and then simply project those functions forward in time. That doesn’t sound like machine learning to me, but maybe I’m missing something.
Neural networks and SVMs are certainly machine learning tools; will post my views when I have digested the paper properly. A very bad sign is that they cite Loehle’s obviously and fundamentally flawed paper “A minimal model for estimating climate sensitivity”, but not the comment paper that discussed the fundamental problems with this kind of approach (which admittedly had a mistake in it, but one that didn’t affect the validity of the criticism). The comment paper is not hard to find. Quite a good idea when judging the validity of a paper is to see who has cited it and what they wrote about it. Google scholar makes that quite easy, in this case. If you are going to cite a paper (such as this one) that disagrees with the mainstream scientific position, it is obviously poor scholarship not to test its validity in this way.
Issues aplenty.
Machine learning (ANN) has nothing to do with what is presented here.
Whether it is a normal computer programme or a computer that can learn and adjust (hence think), the task set does not need any thinking done.
It is simply a computer run.
The problem is that this is precisely what happens when computer models do the warming scenarios.
So when you guys knock it, you shoot yourselves in the foot.
I, on the other hand, can find a lot to knock in this paper.
6 sites of tree rings going back 1000 years (late Holocene?)
Data of this.
In computer program.
Then take the same 6 sites.
Input new but very closely matching data for 830 years (approximately simulated) and run projections for the next 130 years.
Wow. They have very low deviations from the originals!
Lucky they only did 6 runs (chaos theory and all).
2% variation when the computer program already has “true” or Amber model on it is pretty big when you consider if you put the original data in you would get 0% variation every time.
Still despite the terrible science aspect they copied one detail escapes me.
How do they deduce a low climate sensitivity?
Is it because the variation of the model was so close to that of the original data.
I like the work JM has done on BOM shenanigans in Australia but find this dreadful. Problem is as said, you guys really should be backing her in. She is using Consensus science here.
Still go ahead and shoot her down. Heck even the auditor would not back this one in
angech,
No, machine learning refers to a particular technique in which you essentially use data to “teach” a computer to give you a particular result. Climate models, on the other hand, are developed using equations based on well-accepted physical principles/laws. They are not the same.
No, this paper does not use “consensus science”.
Right, I’ve read the paper now (caveat: only once, so it is possible I have missed something important), and if anything it is even worse than the Loehle paper.
The fundamental problem with this approach is to do with omitted variable bias. In short, if the net result of independent changes in forcings (e.g. GHG, aerosol, volcanic etc.) happens to be correlated with one of the identified sinusoidal components then the action of the forcings will be misattributed to the sinusoidal component. This means that a cyclic model of this nature is inherently biased towards low ECS from the outset. This means that at best, such models should be used to make claims about a lower bound on ECS, e.g. “these models suggest that an ECS below 0.6C is inconsistent with what can be explained by sinusoidally varying components of natural variability”.
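To illustrate the omitted variable problem, here is a toy example (entirely synthetic data, nothing from the paper): a purely forced warming trend, with no cycles in it at all, is readily “explained” by a long-period sinusoid once the forcing is left out of the model.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(1000, 1966)
forced = 0.8 * (t - t[0]) / (t[-1] - t[0])           # purely forced warming of 0.8 over the record
y = forced + 0.1 * rng.standard_normal(t.size)

# Model that omits the forcing entirely: one long-period sinusoid plus a constant.
period = 1500.0                                       # arbitrary long "natural cycle"
X = np.column_stack([np.sin(2 * np.pi * t / period),
                     np.cos(2 * np.pi * t / period),
                     np.ones(t.size)])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ beta

explained = 1 - np.var(y - fitted) / np.var(y)
print(f"variance 'explained' by the cycle: {explained:.0%}")
# A large fraction of the forced trend ends up attributed to the "cycle",
# which is exactly the misattribution described above.
```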
I am rather surprised this paper made it through review as the most important claim made in the abstract “The largest deviation between the ANN projections and measured temperatures for six geographically distinct regions was approximately 0.2 °C, and from this an Equilibrium Climate Sensitivity (ECS) of approximately 0.6 °C was estimated.” isn’t justified by the content of the paper AFAICS (i.e. how do they go from a largest deviation to an ECS estimate). For a start, it isn’t the largest deviation, but the largest average deviation (mean absolute errors, according to the paper) over the six proxies investigated. Note that a mean absolute deviation does not imply the deviation has a secular trend that could be attributed to anthropogenic forcing, so there is an immediate disconnect there already. Secondly the proxies are mostly regional or sub-regional, so it isn’t clear how relevant they are to arguments about global ECS anyway, and differences between observed and modeled temperatures seem more like estimates of TCS than ECS. The figure for ECS given in the abstract is essentially just a hand wave, AFAICS.
A common problem with machine learning is over-fitting, i.e. using an overly complex model, fitted to too little data, which is able to memorize the noise in the dataset, rather than the underlying form of the functional relationship between inputs and outputs, which leads to a model that gives unreliable predictions. There are methods to deal with this to a large extent, but there is also a problem with “experimenter degrees of freedom” or the “garden of forking paths”, where there are many design choices open to the experimenter, and by making these choices (or worse, optimising them), the experimenter becomes part of the machine learning optimisation procedure and that results in over-fitting the data. This is a real problem in machine learning as there is no real way of assessing the degree to which these choices may have affected the results. In this case, there is no explanation given as to why these particular proxy datasets were used. If I were being cynical, I could suggest that they were chosen to give the results most favourable to the desired conclusion (i.e. evaluating lots of proxies and a post-hoc choice being made of which to publish). This happens a lot in machine learning, which is why these days there is a tendency to evaluate algorithms over a sizable suite of benchmark datasets, rather than just a handful (and raise eyebrows when someone only uses a selection of datasets from the full suite ;o).
The machine learning approach does not appear to take sufficient steps to avoid over-fitting. I had a quick look at the website and it was not at all clear that the software includes approaches such as regularisation to prevent overfitting. The software automatically optimises model choices, such as the type of model, the architecture of the model and the features selected for the model in order to minimise a cross-validation based estimate of performance. Unfortunately it is possible to introduce a form of over-fitting in model selection by making too many of these choices and over-fit the cross-validation estimator as well. This happens to be one of my research interests, and I have a paper that shows even apparently benign practices can introduce substantial amounts of over-fitting and bias. Software that automatically tunes models like this is rather risky in my opinion. Essentially optimisation is the root of all evil in machine learning, and it is better to marginalise (average over) sources of uncertainty instead.
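As a rough illustration of how optimising over many design choices can over-fit even a cross-validation estimate, consider this sketch (a hypothetical setup of my own): the target is pure noise, so no candidate model has any real skill, yet the best cross-validation score over many candidates still looks encouraging.

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_candidates = 60, 200
y = rng.standard_normal(n)                        # target is pure noise: nothing has real skill
candidates = [rng.standard_normal((n, 3)) for _ in range(n_candidates)]  # 200 random "designs"

def cv_mse(X, y, k=5):
    """Plain k-fold cross-validation estimate of mean squared error for a linear model."""
    idx = np.arange(len(y))
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        errs.append(np.mean((y[fold] - X[fold] @ beta) ** 2))
    return float(np.mean(errs))

scores = [cv_mse(X, y) for X in candidates]
best = int(np.argmin(scores))
print("best CV MSE over 200 candidates:", round(scores[best], 3))
print("variance of y (what a skill-free model should score):", round(y.var(), 3))
# Picking the best of many candidates makes the selected CV score look better
# than the true (non-existent) skill of the chosen model.
```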
Like the Loehle paper, this approach ignores the uncertainties involved in estimation from data. For a start the parameters of the sinusoids are treated as being exact, but small differences in phases, frequencies and amplitudes would probably give similar results, and this uncertainty should be propagated through the model so that the effects of that uncertainty were included in the uncertainty of the ECS estimate (as we did in the corrected version of Loehle’s model in the comment paper mentioned in my earlier comment). Similarly, there will be uncertainties in the estimates of the model parameters, but these are not included either.
The paper says “However, superior fitting to the temperature proxies are obtained by using the sine wave components and composite as input data. This was established by comparing the spectral analysis composite method versus the ANN method for the training periods.” This is a classic rookie error: if you use nested models of increasing complexity, the more complex model will always give better performance on the training period, but it doesn’t mean that it is a better model. You need out-of-sample errors to determine that, and an out-of-sample comparison of the ANN and spectral composite models is absent AFAICS.
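A toy demonstration of that general point (synthetic data, with polynomials of increasing degree standing in for nested models of increasing complexity): the training-period fit always improves with complexity, while the out-of-sample error eventually deteriorates.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0.0, 1.0, 80)
y = np.sin(2 * np.pi * t) + 0.3 * rng.standard_normal(t.size)
train, test = slice(0, 60), slice(60, 80)            # fit on the first 3/4, "project" the rest

for degree in (1, 3, 6, 9):                           # nested models of increasing complexity
    coefs = np.polyfit(t[train], y[train], degree)
    fit = np.polyval(coefs, t)
    train_rmse = np.sqrt(np.mean((fit[train] - y[train]) ** 2))
    test_rmse = np.sqrt(np.mean((fit[test] - y[test]) ** 2))
    print(f"degree {degree}: train RMSE {train_rmse:.2f}, out-of-sample RMSE {test_rmse:.2f}")
# The training error only ever goes down as complexity increases; only the
# out-of-sample error tells you whether the extra complexity actually helped.
```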
“The discrepancy between the orange and blue lines in recent decades as shown in Fig. 3, suggests that the anthropogenic contribution to this warming could be in the order of approximately 0.2 °C.” the eyecrometer really should not be used to make this sort of claim in a statistical methods paper!
Table 13 seems to suggest that paleoclimate studies give lower ECS estimates than GCMs. I thought that was the other way round? Perhaps an unfortunate choice of studies?
I’m rather surprised the data were obtained by scanning images, rather than using the original data. Figure 7 in particular suggests that the data acquisition process might not be all that good in terms of resolution/accuracy?
For the 20th century, I don’t see why they don’t compare their model with instrumental temperature datasets rather than just stay with proxies. If the proxies are meaningful for estimation of ECS then the models should give meaningful output when compared to the instrumental data as well. Am I missing something?
The NH proxy shown in Figure 13 doesn’t look a great deal like the actual NH temperature data, e.g.
http://woodfortrees.org/graph/hadcrut3vnh/mean:60/from:1880
although I don’t know enough about proxies to know whether this is a reasonable comparison, however for ECS, surely we need to be using the instrumental data for a period where we actually have it?
Anyway, those were my thoughts on the paper. It is machine learning (which is basically a branch of computational statistics, so simple statistical model fitting is machine learning, especially for more complex models such as ANN), but not a particularly good example of machine learning (IMHO).
Dikran,
Thanks, a very useful comment.
Indeed, Gavin Schmidt also highlighted this on Twitter.
angech wrote “The problem is that this is precisely what happens when computer models do the warming scenarios.”
No angech, it isn’t. A GCM is not a machine learning approach, it is what you get if you chop the world into cubes and apply the laws of physics to the exchange of energy, air, water etc, between the cubes. Of course it is possible to use machine learning to implement some of the parameterisations within the models (that deal with things that happen on a scale too small to be resolved at the scale of the grid), but the models themselves are physics, not machine learning, they are not adaptive.
After trying to teach an astronomy prof about Jeans instability, I do hope you are not going to try and teach a machine learning researcher (I’m not a prof under the U.K. definition of the term) about what machine learning is ;o)
“Still despite the terrible science aspect they copied one detail escapes me.
How do they deduce a low climate sensitivity?”
It is not often I agree with angech, but on this occasion he has spotted the main problem with the paper, namely the ECS estimate is a hand-wave with no technical justification given in the paper, AFAICS.
“She is using Consensus science here.”
of course angech had to go and spoil it with the usual bullshitting. There is no “consensus science” in this paper, quite obviously as the estimate of ECS given is an obvious outlier, even by the standards of existing outliers (e.g. Loehle!).
ATTP ouch! If a non-specialist like me can see there is something wrong with the diagram, you have to wonder about the reviewers, who were presumably domain experts (I don’t think they were machine learning experts given the clear deficiencies of the paper in that regard).
The first semester applications of machine learning (NN, SVM and the like) ask the machine to minimize the error in some prediction function. Two archetypal problems are prediction of home price and identification of handwritten characters, ones for which the researcher has perfect knowledge. WHAT the machine learns needs to be defined at the start. HOW it learns and what factors are important may remain a mystery to the researcher. It is a black box, at least at first.
Since graph paper has been printed, there has been a group devoted to “technical analysis,” an activity I liken to reading tea leaves. It ignores the possibility of confounding variables. When you use ML, but don’t have confounding variable data available, the algorithm does the best with what it has. The prediction may be right, but the researcher has no better idea of the mechanism at the end than at the start. Without a mechanism to describe the process, a perfect model is only that. One shouldn’t count on it for prediction.
You don’t need to be an expert in ML to recognize this. We simply need to return to first semester stats. Overfitting the data gives “perfect” results that are meaningless for prediction.
If the training set doesn’t have a well defined and correct set of data for the value of the function you are trying to predict, the whole effort is for naught. The line from above, “computers the ability to learn without being explicitly programmed,” misses the point that the researcher needs to tell the machine WHAT to learn. Else with a whole pile of data, the conclusions will be pointless. The algorithm will discover things that are obvious or predictable from known mechanisms, or worse, specious correlations, like “Spending on science correlates with suicides by hanging.”
The Abbot and Marohasy paper is available for free. What troubles me is that there is no means to verify the results. This type of work needs sufficient named data sets and code to allow others to reproduce the work.
I think there is probably a lot of scope for machine learning in climate modelling. I watched a very interesting video lecture by Ian Vernon the other day on “Bayesian History Matching of a Galaxy Formation Simulation”, where a Gaussian process model is used as an emulator to efficiently optimise the parameters of an expensive galaxy simulator to reproduce observed galaxy shapes (or more generally explore the parameter space).
It occurred to me it would be interesting to do the same with climate models as a cheap method of performing “perturbed physics experiments”, if it hasn’t already been done.
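For anyone unfamiliar with the idea, here is a minimal sketch of an emulator (a toy one-parameter “simulator” and scikit-learn’s Gaussian process regressor, purely for illustration, not any particular published setup): a handful of expensive runs are used to fit a cheap statistical surrogate that can then be evaluated anywhere in parameter space, with an uncertainty estimate.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_simulator(theta):
    """Stand-in for a costly model run with a single tunable parameter."""
    return np.sin(3.0 * theta) + 0.5 * theta

theta_runs = np.linspace(0.0, 2.0, 8).reshape(-1, 1)   # pretend only 8 runs are affordable
y_runs = expensive_simulator(theta_runs).ravel()

emulator = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), alpha=1e-6)
emulator.fit(theta_runs, y_runs)

theta_grid = np.linspace(0.0, 2.0, 200).reshape(-1, 1)
mean, std = emulator.predict(theta_grid, return_std=True)   # cheap predictions with uncertainty
print("largest emulator uncertainty across the grid:", round(float(std.max()), 3))
```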
I’d certainly second Richard’s recommendation of looking at Steve Easterbrook’s research; ISTR a very good video lecture he did on his experiences working with climate modellers on the software engineering of GCMs (and why hiring software engineers is unlikely to make things any better).
Machine learning is more a branch of AI than of computational statistics.
”
Signal analysis was undertaken of six such datasets, and the resulting component sine waves used as input to an artificial neural network (ANN), a form of machine learning. By optimizing spectral features of the component sine waves, such as periodicity, amplitude and phase, the original temperature profiles were approximately simulated for the late Holocene period to 1830 CE. The ANN models were then used to generate projections of temperatures through the 20th century.
”
In picture form:
My paradigm-soaked eyes see an elephant wiggling his trunk.
Interestingly, when I run the proposition:
”
There is obvious merit in attempting to construct physical models to understand simulate and forecast climatic phenomena.
”
through the ANN, I get:
”
My hovercraft is full of eels.
“
What is most surprising here is that these models appear to have any skill at all in predicting the post-1830 data.
If one made the assumption that the authors did the analysis honestly (so didn’t just discard proxies that didn’t work or keep fiddling with the analysis) then the reasonable correspondence in the graphs shown (assuming they haven’t been too selective here) suggests there really is some quasi-cyclic behaviour. I wonder how well these would work on coloured noise. I guess the key to getting reasonable looking results is to use a short enough test period and to use trendless low-pass filtered data (so you only see roughly ‘one wobble’).
Still not sure what role the ‘machine learning’ is playing here apart from looking for a slightly better multi-sinusoidal fit. Seems like pretty awful overfitting on the test data.
The claim that climate sensitivity can be determined from this is really nonsense though.
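On the coloured-noise question above, something like the following sketch (my own toy setup, not the paper's actual method) would give a rough answer: fit the dominant sinusoids to the first part of a red-noise series, extrapolate, and see how often the out-of-sample fit looks “skilful”.

```python
import numpy as np

rng = np.random.default_rng(4)

def red_noise(n, phi=0.95):
    """AR(1) 'red noise' series with lag-one autocorrelation phi."""
    x = np.zeros(n)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + rng.standard_normal()
    return x

def sine_extrapolation(series, n_train, n_components=5):
    """Fit the n_components strongest sinusoids to series[:n_train] and extend them."""
    train = series[:n_train] - series[:n_train].mean()
    spec = np.fft.rfft(train)
    freqs = np.fft.rfftfreq(n_train)
    top = np.argsort(np.abs(spec[1:]))[::-1][:n_components] + 1
    t = np.arange(series.size)
    out = np.full(series.size, series[:n_train].mean())
    for k in top:
        out += (2 * np.abs(spec[k]) / n_train) * np.cos(
            2 * np.pi * freqs[k] * t + np.angle(spec[k]))
    return out

hits, trials = 0, 200
for _ in range(trials):
    x = red_noise(960)                       # 960 "years" of pure red noise
    fit = sine_extrapolation(x, n_train=830)
    r = np.corrcoef(fit[830:], x[830:])[0, 1]
    hits += r > 0.5
print("fraction of red-noise runs with out-of-sample r > 0.5:", hits / trials)
```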
Ben,
A few things to consider. Most of their proxies are for individual sites, which will show much more variability than the entire globe. Also if you consider Zeke and Gavin’s tweets (shown in this comment and this comment) then I don’t think they have done their analysis properly. They’ve ignored most of the modern warming, and misaligned the multi-proxy data (it really goes to 1965, not 1990).
Yes, just to be clear, it only looks like to me they have some skill in predicting the time-variation of these (time-shifted) proxies.
I’m not suggesting they’ve managed to show that the global temperature record actually has cyclic behaviour (especially as a model for evolution over the last 50 years). Their selection of data is, as already pointed out, completely inappropriate for this task.
“What is most surprising here is that these models appear to have any skill at all in predicting the post-1830 data.”
yes, I was mildly surprised by that, however I don’t think it is necessarily an indication of real cyclic behaviour. For instance the forcings over the 20th century combine to give an oscillating pattern (AIUI some solar forcing in the first half of the century, some aerosol driven cooling mid-century, followed by GHG forcing really emerging from 1970 or so). Statistical methods cannot reliably distinguish whether ECS is low and internal variability causes what we see, or whether ECS is high (and it is the forcing that drives what we see) and internal variability is low. This is the problem with correlations, they can be spurious, and in this case, for the oscillation to be the cause, what we know about radiative physics must be seriously wrong (affecting solar forcing as much as GHG forcing).
Actually if your cycles are related to solar variability then you need to explain why sensitivity to solar forcing is high, but sensitivity to greenhouse gasses is low.
The problem for the experimenter is that we have already seen the data we are trying to “project” (post 1830), so any analysis is necessarily post-hoc, which may (unconsciously) affect things like the choice of proxy.
The advantage of physical models is that we know why the models are predicting what they predict, and more than just correlation, they can plausibly explain the strength of the effect. For cycles based on solar cycles the big problem for low ECS is that solar forcing is too small (or in the wrong direction) to explain recent warming.
“It ocured to me it would be interesting to do the same with climate models as a cheap method of performing “perturbed physics experiments”, if it hasn’t already been done.”
I think you are describing climate model emulators.
Even for parameterisations of small-scale processes in climate models the use of machine learning is very rare, normally parameterisations are based on physical relationships. Artificial Neural Networks have been proposed to speed up radiative transfer parameterisations, but I do not know of any operational models using them. A bit outside of climate modelling, artificial Neural Networks have been used a lot for research, especially in hydrology, but not implemented much in operational hydrological models.
My PhD student worked on using Machine Learning (Genetic Programming) for downscaling the atmospheric fields when coupling it to a higher resolution surface model. Currently that is for a process model for a small river catchment, but one day that may become part of a climate model. That would then be the only example I know of. The nice thing about Genetic Programming (GP) is that the output is a small piece of code, which you can study and see if it makes physical sense. It is not a black box like a neural network. So GP sits somewhere in between physics and machine learning.
If they used the complete interval to determine the main sine wave periods, and then input these on the training interval, they could have a chance of a good match in the test interval. The question is whether their procedures are completely blind, or did they leave an eye open and peek?
If it was a tidal analysis, they would be forced to use only specific periods, but here they have free rein.
geoenergymath,
I wondered the same. If they’ve already determined the periods and amplitudes of a set of sine waves from their training data, it’s not that surprising that when they put this into a neural network the result then matches their training data. Doesn’t sound all that difficult a task.
Victor, I vaguely recall a workshop at a NN conference where the use of ANNs for GCM parameterisations was discussed, but it was a long time ago. Personally I think ANNs are not the best technology to use, far too easy to get wrong, and not very interpretable. Gaussian process models (another kind of GP) are rather better on both fronts (especially for handling/propagation of uncertainties), but not as interpretable as the models from Genetic Programming. I did some work on ANNs for downscaling myself some time ago, the novelty being to model the entire predictive distribution (e.g. occurrence probability, and shape and scale parameter of a Gamma distribution for the amount, based on a nice paper by Peter Williams) rather than just the conditional mean (or some other measure of central tendency), which tends to under-represent extreme events. Worked quite well in comparison to other approaches. I’d like to get back to working on that at some point (particularly multi-site downscaling).
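For anyone curious what “modelling the entire predictive distribution” means in practice, here is a rough sketch of the kind of loss such a downscaling model would minimise (my reconstruction of the general idea, not the original code): the model outputs a rain-occurrence probability and Gamma shape/scale parameters for the amount, and is trained on the negative log-likelihood of the observed precipitation.

```python
import numpy as np
from scipy.stats import gamma

def bernoulli_gamma_nll(p_wet, shape, scale, precip):
    """Negative log-likelihood of observed precipitation under a model that
    predicts an occurrence probability plus a Gamma distribution for the amount."""
    wet = precip > 0
    nll = np.empty_like(precip, dtype=float)
    nll[~wet] = -np.log(1.0 - p_wet[~wet])
    nll[wet] = (-np.log(p_wet[wet])
                - gamma.logpdf(precip[wet], a=shape[wet], scale=scale[wet]))
    return float(nll.sum())

# Hypothetical model outputs for three days (dry, light rain, heavy rain):
p_wet = np.array([0.1, 0.7, 0.9])      # probability of any rain
shape = np.array([1.2, 1.2, 2.0])      # Gamma shape for the amount
scale = np.array([3.0, 3.0, 8.0])      # Gamma scale for the amount
precip = np.array([0.0, 2.5, 25.0])    # observed precipitation (mm)
print(bernoulli_gamma_nll(p_wet, shape, scale, precip))
```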
They calibrated the sine waves on the pre-1830 data, so the projection to 1830 is “real”, but not necessarily meaningful, due to “multiple degrees of researcher freedom”.
Incidentally, the model used in the Loehle and Scafetta paper (same as the one in the Loehle paper, IIRC) has the phase parameters for the sinusoids of 1998.58 +/- 1.3 year and 1999.65 +/- 1.3 year. One wonders how that could happen for a model fit to the data from 1850-1950 (second column of table 1). One of the sinusoids has a period of 20 years, so there would be two equally good optima of the cost function between 1950 and 1999 that an optimisation algorithm could have found from a starting point in the calibration data; was 1998 special in some way? ;o)
@-dikranmarsupial
Thanks for the video.
He mentions the methods are applied to climate models early in the talk; and sure enough, if you go searching there is a depth of published work over the last decade or so.
https://ora.ox.ac.uk/objects/uuid:849e6e43-bb4c-4038-9d26-39bc0559128e
http://onlinelibrary.wiley.com/doi/10.1029/2011JD016620/full
Not just the Annans.
Why does the paper which is the subject of this post rise above the surface and get the oxygen of publicity, when there is an ocean of research extant but acknowledged (or disparaged) only as part of the enormous rising volume of scientific work validating human understanding of the climate?
In the first line, Abbot is misspelled.
Thanks, fixed.
Figure 5 (South America)
Figure 9 (Canadian Rockies)
Ha Ha Ha!
Everett,
I missed that. Figure 5 and 9 are the same.
It’s in the PDF too, on the same page even (page 42). It looks like Figure 8 was not extracted correctly. But seriously, just look at that page; how could anyone not see the similarity (they’re almost side by side)?
That paper is a keeper (in terms of such an obvious graph duplication error)! 🙂
Wait, so they use zero forcing information in their machine learning?
Always wanted to see what kind of machine learning climate models people could come up with. But with zero forcing information, obviously the answer will be nonsense.
Which brings us back to ~3C per doubling.
Maybe somebody who understands how this is done can answer a question about how the ANN models were run on the test data (as opposed to the training data, which seems clear).
The spectral model produces some sine waves, call them w(t). The temperature data is at discrete times, x_i is the temperature at time t_i.
The neural net is trained with inputs w(t_i), x_i; the output to predict is x_i+1.
In other words: conditions at time t_i should predict temperature at time t_i+1. These are all independent cases, you train the network by repeatedly presenting it with cases and feeding back the error. You don’t even have to present the cases in time order.
After training,
— The test case inputs are w(t_i) and x_i again using real x_i data from the test period?
— Or you run the cases in time order, generating the x_i input from earlier x_i-1 case.
In either case, the error is measured by using real x values, right?
If the test case inputs are real x_i, then the accumulated error is going to be sort of bogus and probably underestimated. Because for each interval it starts with a real temperature, not the mistaken temperature it predicted from the earlier interval. But this is the accumulated error that they use for their claim about anthropogenic climate change.
But if you let the model generate its own input for the next interval, that’s not what it was trained to do in a certain sense. That’s what physics models do: compute conditions at t_i+1 from their previously computed conditions. This type of model would just drift around making bigger and bigger mistakes.
Anyway that is my confusion. Probably somebody who reads a lot of experiments like this knows the answer?
> climate models people
I think you mean ClimateBall ™ people.
A “climate model” that doesn’t give spatial predictions isn’t a climate model.
It’s a model of one lower metric of the climate.
Climate is more than temperature. Duh.
If I were to write a spec for a climate model, I would say that it has to AT LEAST:
1. Provide spatial estimates for relevant climate metrics.
2. Cover the relevant climate metrics: what’s the temperature, how much water and what phase.
Spatial of course implies all 3 dimensions, so a model should at least provide
temperature at every 3D location and the amount of water (in its various phases).
In fact don’t call it a climate simulator; call it a water simulator.
mboli,
The paper is using only the frequency domain. There are exactly zero time steps in any frequency domain approach. Summation of sine waves is a circle j…, err cyclic, so no need to worry, just like gravity … what goes up eventually comes down.
BTW, why download digital data from …
https://www.ncdc.noaa.gov/paleo-search/
… when you can scan in the paper and digitize the scan! 😦
I think that I could randomly select 20-30 time series, half would trend down and half would trend up (outside the training window), pick the best six uptrends, write POS paper! 😉
Ben McMillan says: August 23, 2017 at 3:57 pm
“What is most surprising here is that these models appear to have any skill at all in predicting the post-1830 data.”
Dikran said the same thing.
“They calibrated the sine waves on the pre-1830 data, so the projection to 1830 is “real”, but not necessarily meaningful, due to “multiple degrees of researcher freedom”.
I must be reading things wrong so help out here. Just tell me it is in the full paper so I can chill.
Dikran is usually right on these things.
My understanding was 6 tree ring cores from approx 1000 to 1960.
I would have thought that the analysis [calibration] would be done on the whole time period
But I am not a professor of tree trunks.
Then they put in 6 single, carefully made-up reconstructions from 1000 to 1830.
One reconstruction into each known site.
Right so far?
Then they ran the program, or used their ANN/Whatever. [As Marvin would say].
To calculate how their projection for 1830 -1960 [or 1965 or whatever] compared to their own known outcomes.
Right?
Slightly better if using the past to predict the future unknown.
My concerns are: was the known future outcome available to the program and used by the program? Was it only the pre-1830 data or the whole lot?
How many runs did they do and discard?
Why did they not do a Lucia and do a thousand runs of each?
Basically the further away from the actual observations the further away the projection can spread. The further it spreads the greater the climate Sensitivity. So you get the result you want by how differently you calibrate your trial runs to the true figures.
“The claim that climate sensitivity can be determined from this is really nonsense though.”
Funnily enough this is not so. If you have data determined by Climate Sensitivity then a mechanism from the data back to the climate sensitivity should exist.
The nonsense is that any meaningful climate sensitivity estimate could be arrived at.
For instance it could have been 3 with a different band of anomaly.
I can’t believe they used an ANN for this. ANNs excel at certain tasks where specifying the underlying mechanism is difficult and where you don’t need to examine “how” it works.
So in image recognition you don’t care how the network of functions works and you don’t care if it actually maps to any underlying physical process; you just care that it works.
So, take road sign identification. You don’t care what the connections look like or if they map to the way the human brain works, you just care that it can learn how to identify a stop sign.
With a physical system like the climate we do care “how” it works. For example:
The real test of an ANN model of the climate would be to see what it predicted for Mars,
or the Moon. A physics-based model, if complete, will work for other planets. An ANN (I would wager) would not work. It doesn’t discover physics. It discovers linkings and weightings which reproduce the curve. It’s not a climate model, it’s an “Earth global temperature” model.
Pretty much useless as a tool or discovery mechanism.
I’ll bow to marsup since he has probably worked with ANNs more than me, I’ve just never had that many problems where an ANN looked like the right tool. (I played with voice reco back in the early 2000s for MP3 players.)
mboli,
I’m not quite sure how the model in this paper is trained, but – yes – this type of model would just drift away, especially in a situation where we’re changing the system by adding CO2 into the atmosphere.
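To illustrate the distinction mboli raises, here is a small sketch (a linear autoregression standing in for the ANN, and made-up data): one-step-ahead predictions, which always restart from the observed value, look much better than a free-running model that has to feed on its own output.

```python
import numpy as np

rng = np.random.default_rng(5)
t = np.arange(500)
x = np.sin(2 * np.pi * t / 60) + 0.005 * t + 0.2 * rng.standard_normal(t.size)

# "Train" a one-step predictor x[i+1] = a*x[i] + b on the first 400 points.
A = np.column_stack([x[:399], np.ones(399)])
(a, b), *_ = np.linalg.lstsq(A, x[1:400], rcond=None)

obs = x[400:]                             # the "test" period
one_step = a * x[399:-1] + b              # always restarted from the observed value
free_running = np.empty(obs.size)
prev = x[399]
for i in range(obs.size):                 # fed only its own previous prediction
    prev = a * prev + b
    free_running[i] = prev

print("one-step-ahead RMSE:", round(float(np.sqrt(np.mean((one_step - obs) ** 2))), 3))
print("free-running RMSE:  ", round(float(np.sqrt(np.mean((free_running - obs) ** 2))), 3))
```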
SM “I can’t believe they used an ANN for this.”
I suspect it was so they could get it published in a journal that probably wouldn’t have reviewers/action editors that had expertise in ANNs and hence would be intimidated by it. Machine learning practitioners should have ANNs in their toolbox as often the only way to find out what kind of model is best is to try them out and see how they do (although as I mentioned earlier this introduces additional degrees of experimenter freedom, so you don’t want to try too many things). Deep neural nets with convolutional layers seem very good for image recognition tasks at the moment, but I suspect only if you have a *lot* of data (my research is mostly in shallow models and small data, which is interesting from a statistical point of view, but currently unfashionable). Opaque models are still useful even where you need to understand how the model works, by giving a benchmark to gauge how much performance you are losing by requiring a transparent, interpretable model.
@izen thanks for the links!
Ben wrote “The claim that climate sensitivity can be determined from this is really nonsense though.”
angech replied “Funnily enough this is not so.”
More hubris from angech, I wonder when he will learn that this approach is not doing him any favours and adopting a bit of humility might be an idea.
“If you have data determined by Climate Sensitivity then a mechanism from the data back to the climate sensitivity should exist.”
The error here, of course, is that the data are not determined solely by climate sensitivity; they are also determined by the forcings and the effects of internal climate variability. If you don’t look at the forcings then there is no way to disentangle climate sensitivity from internal climate variability due to omitted variable bias, as I pointed out earlier.
“Dikran is usually right on these things.”
pity that you didn’t pay attention to what I had already written then!
So, angech, do you accept that you were wrong and what Ben wrote was correct? Yes or no?
Objections galore, but their out-of-sample forecasts are actually pretty good.
Really?
Tol is some skeptic. Lol.
Richard Tol wrote “Objections galore, but their out-of-sample forecasts are actually pretty good.”
While they are out of sample, they are also post-hoc (i.e. the researchers knew what global temperatures look like in the test period and they also knew what the proxies looked like in the test period before they did the analysis). Regional and sub-regional proxies will be very variable and hence it isn’t that surprising if some of them have reasonable out of sample predictions if you choose the right period for the test set and the right proxies. Too many degrees of researcher freedom, not enough evidence that the approach is robust.
The out-of-sample predictions are not even *that* good anyway, e.g. figs 7, 5/9, and 11, some of the wiggles sort of match up, but some don’t, and the human eye is a bit too good at seeing patterns and correlations where they don’t really exist. Ironically the only one that matters for estimating ECS (the northern hemisphere multiproxy), shown in fig 13, is the one with the least skill, where the model has only really predicted a long term trend from 1880-“2000” (actually 1845-1965, see above), but hasn’t really matched any of the cyclic variability (except perhaps for a highly attenuated dip at the end).
Since the anthropogenic signal only really became clearly evident after around 1970, a proxy that only goes up to 1965 is not the best means of estimating ECS! ;o)
I was surprised at how good they were, but that doesn’t mean I thought they were “good”, just better than I would expect (if there was no post-hoc cherry picking of proxies).
@dikran
That is an accusation of misconduct. If you have evidence, you should write to the editor of the journal.
I think it is worth reiterating that according to the ANN model, there was negligible cyclic behaviour of Northern Hemisphere (proxy) temperatures from 1880-present. The output of the ANN (orange line) is more or less a straight line until about “1985”, consider for a moment what that tells us about the underlying hypothesis behind the model!
> their out-of-sample forecasts are actually pretty good.
OK, there might have been some Gremlins, but the new data do not materially affect the results.
Something else that I’m not too sure about, just looking at the Northern Hemisphere proxy
It seems clear that the largest amplitude variability is at the largest scale, but the longest period component in the spectral composite has a period of 1227.6 years (seriously, in a dataset with 1720 years of calibration data, how accurately can you estimate the period of an ~1000 year component?), yet this explains only 2.7% of the power. Most of the power is in two closely spaced components at 428.1 and 422.3 years (34.2% and 39.3% respectively), so presumably the long term variability is due to the beat frequency between these two sine waves? Sadly the paper isn’t clear at all on how the analysis was actually performed (other than what tools were used).
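As a back-of-envelope check (my arithmetic, not anything stated in the paper), the beat period of those two components would be:

```python
T1, T2 = 428.1, 422.3
beat_period = 1.0 / abs(1.0 / T1 - 1.0 / T2)
print(round(beat_period))   # roughly 31,000 years; a 1720-year record samples only a few
                            # percent of one beat cycle
```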
Steven Mosher is right on. ANN gets the right answer without any exposition of how it arrived at the answer. Particularly when confounding data (which WOULD explain the results) is missing, it uses circumstantial evidence. That works for marketing, but it’s insufficient for a court of law or a scientific theory.
dikranmarsupial also got it right. Abbot and Marohasy didn’t provide enough information on method for anyone to reproduce the results, or even criticize the method.
@dikran
A rule of thumb is that you need at least 10 observations per parameter. 1720 years of data would thus allow you to observe a cycle of 172 years, but not longer.
I suspect the shifting of the time axis makes quite a big difference to the ECS calculation (however they actually did it) if they use the temperature increase from 1845-1965 but the increase in GHG concentrations from 1880-2000. I think that makes the estimate of ECS lower than it should be with the correct dates (if the method were viable in the first place).
BTW my last post was in response to that by Charles Sharpless, but it appears before Charles’ post on my browser. WordPress is not causally consistent?
Ah, they are back in the right order now, interesting!
“The authors also apparently distorted the temperature axes and shifted the date axis, omitting all the recent warming and essentially falsifying results.”
Never assign to malice what can adequately be explained by stupidity.
I am certain the “skeptics” are already calling for a retraction and a congressional investigation, while McIntyre has asked for all the data and code.
[sarcasm in the above may occur]
”
The authors also apparently distorted the temperature axes and shifted the date axis, omitting all the recent warming and essentially falsifying results.
”
”
I am certain the “skeptics” are already calling for a retraction and a congressional investigation, while McIntyre has asked for all the data and code.
[sarcasm in the above may occur]
”
Meanwhile, inquiring minds want to know:
Can we get a “Wow” or a “This paper will change the way you think about natural internal variability.” from Dr Judith Curry??
That was a good thread, Rev.
Meanwhile, I think we could dismiss the criticism galore with a simple “the wording could be clearer”.
Instead of bitching about someone else’s work, or allowing the ivory tower option of refusing to confront the problem until more evidence will have been gathered, or blaming defenseless gremlins, I say: be clearer-worded in the first place – use comic sans.
The lack of cyclic behaviour in the model output for the 20th century NH multiproxy also means that it directly contradicts the model of Loehle (and vice versa). I don’t think the paper mentions that ;o)
Let’s agree to disagree on John & Jenn, Rev, but not on Comic Sans.
Richard Tol commented on the length of the proxy record vs. the cyclic period which may be obtained. The mechanics of data collection for these proxies is that of a sampled data system, for example, sediment cores which are sliced into segments for which an average over the segment is measured. The date models are also based on average values for each segment, usually with some distance between the segments selected to construct the date model. This means that each measurement has uncertainty in dating as well as in the actual measurement. Abbot and Marohasy include no discussion about the error bars in their data sets.
For a sampled data system, the lowest frequency (longest period) which may be represented by a series of sine waves has a period of about half the length of the series, according to Shannon. Their selected proxy series show dominant periods when frequency analysis is applied but the periods found as shown in their tables include periods which are much too long. For example, the results for Southern South America in Table 2 has a length of 930 years, but there’s a strong peak at 745 years and Tasmania has a length of 830 years with a peak at 648 years. Both are clearly bogus results, the result of improper application of the mathematical tools. Mr. Tol’s “rule of thumb” may be too strict, but when sampling errors are considered, may be reasonable.
It’s not clear to me from reading A&M exactly which periods were used in their models, but it would appear that they selected the periods directly from the frequency analysis given in Tables 2 and 3. If so, the results of their curve fitting have no predictive value, IMHO. Not to mention all the other problems identified in the preceding comments…
Seems like the latest trend is to abandon any pretense of modeling actual physics in favor of physics by extrapolation. Nikolov and Zeller did the same.
@RichardTol I prefer a Bayesian approach, which suggests you can estimate whatever you like from the data you have to hand, but your uncertainty will depend on the amount of information contained in the data you have available, and your posterior distribution may be infinitely wide if there is none. Sometimes a broad uncertainty range is still sufficient to make the point you are trying to communicate and “rules-of-thumb” tend to be too coarse for that kind of judgement.
BTW I note you haven’t responded to my comment that while the predictions are out of sample they are still post-hoc, which means one can’t read much into them given the degrees of researcher freedom available. The out of sample predictions are also pretty poor for the only proxy that really matters w.r.t. ECS (NH multiproxy). However it is nice to see that you have responded to one of my comments, which suggests that at least WordPress isn’t hiding my comments from you, I was beginning to wonder! ;o)
Richard says “Objections galore,” which is essentially disregarding the objections without actually answering them, plus ça change…
At least it is a bit more subtle than Marohasy
who just rejects Gavin Schmidt’s criticism about the proxy being incorrectly scaled and misaligned as “false claims” (cf. Donald Trump’s “fake news” defense via attack?), without any attempt to show that they are false, before evading the criticism by questioning whether it is O.K. to splice instrumental and proxy records. Utterly transparent.
Nikolov & Zeller didn’t ignore physics, but they got it pretty badly wrong.
I believe that an error has been made here. According to the text in Table 1 of Abbot & Marohasy, the so-called Northern Hemisphere composite is actually an Icelandic reconstruction, given as reference 33. Unfortunately the article is in the Journal of Paleolimnology, which my institution doesn’t have, so it would take a few days to get a look at the paper. But the reconstruction shows a few differences from Moberg et al. 2005. Since Abbot & Marohasy didn’t use original data, but scanned the curves, I rather doubt that they used Moberg et al. 2005 for this. It is perfectly plausible that the Icelandic reconstruction resembles the Northern Hemisphere reconstruction, but with a higher amplitude. And since the Icelandic reconstruction goes from 50–2000, this perhaps explains the perception of a shift when it is interpreted as the Moberg et al. 2005 reconstruction. Given the method, it doesn’t make sense to use Moberg et al. 2005 – they use 6 perhaps well-selected single sites, which give a desired result, or which were just the most convenient to get and put into a scanner. Moberg et al. 2005 (reference 66) is used in only one place – to show that the temperature curve is bumpier than the hockey stick of Mann et al. 1998, which is a never-ending complaint of deniers. So the debate about a 35-year shift and an expanded Moberg et al. 2005 reconstruction is probably aimed at the wrong point of criticism.
On the other hand, quite a lot of scope for criticism was left unused when it comes to the claim that the ECS is 0.6. This is taken from the assumption that the residual temperature increase in the data from 1880 to the end, after subtracting the temperature prediction of the neural network, is 0.2 °C. But what they actually calculated was the mean deviation. If one looks at Table 12, the only relevant data are from Switzerland and from New Zealand. If you look at the Swiss data, the match between projection and temperature reconstruction is very poor. Now what difference does one want to calculate here? By coincidence the curves meet each other at the end (which is in 1950): difference zero. If you calculate the trends (I didn’t do that, because I don’t have the data; they are not provided), I guess the projection has a trend twice as high as the reconstruction. That would mean the anthropogenic signal is negative – a ridiculous result. For the New Zealand data it is worse. They don’t show a trend difference between projection and reconstruction, but at the end the projected temperature is higher. Again, the anthropogenic signal is negative: greenhouse gases lower the temperature. It is complete nonsense. The trouble is, the average deviation has no meaning at all for the question of how much residual temperature rise is in the data. Suppose the trend in the prediction is zero and the trend in the reconstruction is x; then the mean deviation is x/2. On the other hand, if both curves are trendless but noisy, they could easily have a mean deviation of x (whatever x is) with no real residual temperature rise at all. Therefore they can’t calculate ECS this way – they have used a random number which, by coincidence, had the desired value.
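A tiny numerical illustration of the point above about the mean deviation (made-up series, not the paper's data): the mean absolute deviation between projection and reconstruction says nothing about the residual warming trend.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 100)

# Case 1: projection flat, reconstruction warms by 0.4 -> genuine residual trend of 0.4
recon_1, proj_1 = 0.4 * t, np.zeros_like(t)
# Case 2: both trendless, just offset -> no residual warming at all
recon_2, proj_2 = 0.2 * np.sin(20 * t), 0.2 * np.sin(20 * t) + 0.2

for label, recon, proj in [("residual trend 0.4", recon_1, proj_1),
                           ("no residual trend ", recon_2, proj_2)]:
    print(label, "-> mean absolute deviation:",
          round(float(np.mean(np.abs(recon - proj))), 2))
# Both cases give a mean deviation of 0.2, so a 0.2 mean deviation cannot be
# read off as a 0.2 degree anthropogenic contribution.
```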
Even sans physics, things can go badly wrong…
E.g. it is also not clear from the A&M article whether the time-domain data was “windowed” – if it was not, then the frequency-domain results would contain spurious components (required to recover the discontinuities at the beginning and end of the time-domain data).
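To illustrate the windowing point, here is a minimal sketch with made-up numbers (not A&M's actual processing): if a record contains a cycle that does not complete an integer number of periods, an unwindowed FFT leaks energy into spurious frequency bins, while a Hann window suppresses most of that leakage.

```python
# Minimal sketch of spectral leakage (made-up numbers, not A&M's processing).
import numpy as np

n = 256                                   # number of yearly samples (hypothetical)
t = np.arange(n)
y = np.sin(2 * np.pi * t / 60.0)          # a 60-"year" cycle that doesn't fit the record exactly

spec_raw = np.abs(np.fft.rfft(y))                     # no window: energy leaks into nearby bins
spec_win = np.abs(np.fft.rfft(y * np.hanning(n)))     # Hann window: leakage strongly reduced

k = int(round(n / 60.0))                  # index of the bin nearest the true frequency

def leakage(spec):
    # fraction of spectral amplitude outside the 3 bins nearest the true frequency
    keep = slice(max(k - 1, 0), k + 2)
    return 1.0 - spec[keep].sum() / spec.sum()

print(f"leakage, no window:   {leakage(spec_raw):.2f}")
print(f"leakage, Hann window: {leakage(spec_win):.2f}")
```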
All of which brings us to the question:
How many clowns
will get out of this car?
@-” and is inappropriate that u graft instrumental thermometer HadCRUT record onto proxy record ”
Dogwhistle of Mike’s Nature trick and hokey sticks.
Is there ANY reason for this paper to consume so much bandwidth other than its claim ECS is below 1degC ?
There is already abundant evidence that makes such a value highly implausible before any consideration of its methodological deficiencies.
Or Gremlins.
Jorg,
Yes, Gavin Schmidt has also pointed out (on Twitter) that reference 33 is not a NH multi-proxy study.
@-Jörg Zimmermann
” I believe, that an error is made here. According to the text in table 1 of Abbot&Marohasy the so called northern hemisphere composite actually is an Islandic reconstruction given as reference 33. But the reconstruction shows a few differences to Moberg et al 2005.”
If the differences are sufficient to alter the conclusions of this paper then the method is highly dependent on the error uncertainty of the data source.
If it generates a rising curve for the recent past whatever data it is fed then it is a method of deriving ‘hokey sticks’ from cyclic noise.
I don’t think you can get ice-ages, or emergence from them with a ECS below 1C.
[Playing the ref. – Willard]
Marohasy is right that Schmidt’s point about scaling is irrelevant: A neural network just works around that.
If it is inappropriate to splice NH instrumental thermometer records with the NH proxy reconstruction, doesn’t that imply that the proxy records are not a good indicator of actual NH temperatures (as you might measure with thermometers) and hence it would be unwise to conclude ECS was low based on a proxy that wasn’t representative of actual temperatures? I am assuming that some sensible form of “baselining” has been done first before the splicing.
If the NH proxy is actually a sub-regional (Icelandic) proxy as Jorg suggests, that would imply that any estimate of ECS (a global quantity) based on regional or sub-regional proxies are not very reliable, and therefore the findings of the paper are of negligible value even given the other technical problems being addressed?
izen wrote “Is there ANY reason for this paper to consume so much bandwidth other than its claim ECS is below 1degC ?”
It is a useful test of critical thinking (WUWT currently not faring too well).
Richard,
I missed those comments. I’ll have a look and see if they’re worth posting (probably not, but I might anyway).
How? This, as I understand it, is essentially the training data. Somehow a neural network can work out that the training data is incorrectly scaled and incorrectly aligned in time?
[No food fight, please. – Willard]
“Marohasy is right that Schmidt’s point about scaling is irrelevant: A neural network just works around that.”
Marohasy didn’t say that the scaling is irrelevant (or at least she hadn’t last time I looked), she just said it was a false claim. A neural net doesn’t “just work around that”, especially when it comes to working out ECS, where it has to be properly aligned and scaled relative to the atmospheric CO2 concentration data (as I pointed out upthread).
Chill, guys.
A blast from the past:
http://www.faqs.org/faqs/ai-faq/neural-nets/part2/
I would thread lightly on that one if I were you, Richie.
@wotts
If your predictions are off by a factor N, you multiply by a factor 1/N. If your predictions are early by M years, you add a lag of M years.
Willard, the offer for Richard to send his replies via email was genuine – if he wants to discuss the technical merits of the paper off-line I am happy to do so. I accept the criticism of the spot of teasing that went with it.
I must admit that I’m confused as to how the data was used to train the neural network. The paper says
As I understand it, before they even used the machine learning algorithm, they had already determined a best fit to each data set.
FWIW the NN FAQ (which is very good) is mostly talking about input scaling and output scaling for classification models, but neither applies in this case as it is a regression model and it is the output that is incorrectly scaled. However that is ignoring the problem of the data being incorrectly aligned as well as incorrectly scaled, and while the neural network won’t be affected by the scaling in this particular case, what you conclude from the network output definitely will (as I explained upthread), so Prof. Tol’s defense of Marohasy is still invalid.
Richard,
If you claim that your out of sample goes to 2000, but your data is misaligned in time, then it doesn’t go to 2000.
> the offer for Richard to send his replies via email was genuine – if he wants do discuss the technical merits of the paper off-line I am happy to do so.
Sure, Dikran. In return, you need to understand that Richie’s baits need to have minimal decorum, and that if he’s moderated, so will be the responses to the moderated part.
Richie also haz a tweeter, and he can throw F words around all he wants over there.
ATTP as far as I can tell (the paper is rather vague) spectral analysis was performed to identify the sinusoidal components that explain the proxy in the calibration period. The spectral composite is just a linear combination of these sine waves (much like the model in the Loehle paper). The neural net appears to be used to replace the simple linear combination of sine waves with a non-linear combination of sine waves that is learned from the data (hopefully from the calibration period only). So essentially the inputs are sine waves and the model is trained to reproduce the proxy data as a function of those sine waves.
However, it isn’t quite as simple as that, as the software they used did some feature selection and feature construction (apparently by brute force search over many combinations to minimise the cross-validation error). However the constructed features can be things like “Minimum(Sine 6, Sine 2)” (Table 11), which vastly increases the complexity of the model class from which the final model is selected – which is a recipe for over-fitting the cross-validation error and getting an unreliable model. Difficult to think of a physical process in the climate system that could respond to the minimum of two different cyclic sources of internal variability!
BTW it isn’t at all clear what the %contribution means in table 11 for a non-linear combination of features.
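For what it's worth, here is a minimal sketch of the kind of pipeline described above, as I read it (synthetic data and made-up periods; not A&M's actual code or their commercial software): sine waves at spectrally identified periods are the inputs, and a nonlinear regressor is trained to reproduce the proxy over a calibration period only.

```python
# A minimal sketch of the described pipeline (my assumptions, synthetic data, not A&M's code):
# sine waves at spectrally-identified periods in, nonlinear regression of the proxy out.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
years = np.arange(1000, 2001)
proxy = (0.3 * np.sin(2 * np.pi * years / 210.0)
         + 0.2 * np.sin(2 * np.pi * years / 87.0)
         + 0.1 * rng.normal(size=years.size))          # synthetic stand-in for a proxy record

periods = [210.0, 87.0, 60.0]                           # hypothetical periods from spectral analysis
X = np.column_stack([np.sin(2 * np.pi * years / p) for p in periods])

calib = years <= 1880                                   # calibrate pre-1880, "predict" afterwards
model = MLPRegressor(hidden_layer_sizes=(8,), max_iter=5000, random_state=0)
model.fit(X[calib], proxy[calib])
pred = model.predict(X[~calib])

# The apparent "skill" here is guaranteed by construction: the synthetic proxy is purely
# cyclic plus noise, so of course sine-wave inputs can reproduce it after 1880.
print("RMSE after 1880:", np.sqrt(np.mean((pred - proxy[~calib]) ** 2)))
```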
Dikran,
Thanks, I’ll have to think about that a little.
Richard Tol wrote “If your predictions are off by a factor N, you multiply by a factor 1/N. If your predictions are early by M years, you add a lag of M years.”
Is there any indication in the paper that the scaling and lag had been taken into account in the calculation of ECS? No, not AFAICS. Is there any indication that A&M knew about an error in the scaling and alignment in order to take it into account? No, not AFAICS (if they were aware of it, why wouldn’t they just correct the data prior to the analysis, which would be a trivial step?).
By the way Richard, those were not rhetorical questions, if you want to carry on with this line of argument you need to show that the problem did not affect the calculation of ECS, so I do expect an answer.
Dikran,
Just to clarify something else. The way they’ve done this, it seems that the machine learning algorithm is used to try and replicate each dataset independently. Hence, they cannot use it for anything other than the 6 proxy sites/regions that they’ve considered in the paper. It seems completely non-general.
Willard “Sure, Dikran. In return, you need to understand that Richie’s baits need to have minimal decorum, and that if he’s moderated, so will be the responses to the moderated part.”
Indeed, the only point I was making was that there was a non-snarky part that I would prefer to stay. I ought to have more than merely minimal decorum myself.
> FWIW the NN FAQ (which is very good) is mostly talking about input scaling and output scaling for classification models, but neither applies in this case as it is a regression model and it is the output that is incorrectly scaled.
My intent was to introduce the more general concept of standardization, and to emphasize that it may not be possible to standardize without eliminating information. The rest of the quote illustrates both the convention and the fact.
Here would be AndyG’s take home on standardization:
http://andrewgelman.com/2009/07/11/when_to_standar/
Richie’s claim’s a bit over-the-top, of course.
ATTP indeed, which is why I thought the NH multiproxy (if that actually is what it is) was the only one worth looking at, and ironically it is the one with the worst out-of-sample predictions.
The real problem with this kind of approach is that someone looks at a plot of the data (including the 20th century) and thinks “mmm, looks like there is some cyclical behaviour to me”. This means that even if you fitted a model to the data up until (say) 1880, it is bound to give reasonable “predictions” for the data after 1880, because if the data after 1880 didn’t broadly follow the apparent cyclical behaviour of the data prior to that, you wouldn’t have thought to fit a cyclical model in the first place – the post-1880 observations would contradict the hypothesis. Hence out-of-sample but post-hoc means “caveat emptor” in capital letters, bold font, and underlined twice with wiggly lines.
Well, that’s interesting:
http://www.neurosolutions.com/infinity/
“What Our Customers Are Saying”
Recently, Dr. Abbot moved his research to NeuroDimensions’ next-generation predictive data analytics software called NeuroSolutions Infinity. With it, he improved his previous results in NeuroSolutions by 10% (based on the MAE and RMSE) and 24% compared to the other neural network software he tried.
These machine learning algorithms go to eleven…
Anyone,
The paper is now behind a paywall. The version that now downloads (via *******) is the “Accepted Manuscript”
Did anyone save a PDF copy of the paper that was available just yesterday?
If so, I want a copy of that version if you don’t mind (with the page 42 Figure 5/Figure 9 cut and paste graph error). I forgot to save a PDF copy myself yesterday! 😦
TIA,
gmail everettfsargent
Looking at this paper, it does not really use machine learning or modern neural nets at all.
Modern neural nets have several hidden layers that are there to extract (= infer) features from the input data in an iterative process. Back propagation is the most common method these days – changes to nodes in layer j+1 will trigger changes in layers j, j-1, etc. That process is slow, but not too slow for today's CPUs and GPUs.
However, the neural net used in this paper is a GRNN – generalized regression neural net (1991, Specht) – from the days when back propagation was not feasible for even modestly complex problems. The GRNN uses lazy evaluation and a one-pass algorithm, which in plain English means that it does not learn at all. It is best described as a multi-dimensional smoother. The only tuning parameter is the smoothing factor (called sigma). I have never seen anybody in the machine learning community even mention GRNN; we all use a gradient booster for this kind of regression (such as xgboost) or full-fledged deep learning networks.
The paper mentions that a commercial package was used for this GRNN, but both Python and R also have implementations. I tested the R version, which confirms the following.
As with all generalized regressions, it has the problem that it cannot predict outside the interval [y_min, y_max] of observed values. That means it cannot forecast non-stationary time series, cannot see higher frequencies than in the input, etc. It is unable to look outside the box.
So, GRNN does not validate its learning because it does not learn. Everything it predicts has to be evaluated in another way, or has to be taken with many grams of salt.
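For concreteness, here is my own minimal sketch of the GRNN idea (Specht 1991) as a Gaussian-kernel-weighted average of the training targets; this is not the commercial package A&M used, just an illustration of why the predictions are bounded by the observed target values.

```python
# A minimal sketch of the GRNN idea (my own illustration, not the package A&M used):
# essentially a Gaussian-kernel-weighted average of the known training targets.
import numpy as np

def grnn_predict(X_train, y_train, X_new, sigma=1.0):
    """Predict each row of X_new as a distance-weighted average of y_train."""
    # squared Euclidean distances between every new point and every training point
    d2 = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    w = np.exp(-d2 / (2.0 * sigma ** 2))          # Gaussian kernel weights
    return (w @ y_train) / w.sum(axis=1)          # weighted average: bounded by [y_min, y_max]

# Toy demonstration: it interpolates, but it cannot extrapolate a trend.
x = np.linspace(0, 10, 50)[:, None]
y = 0.5 * x.ravel()                                # a simple linear trend
x_future = np.array([[12.0], [15.0]])              # outside the training interval
print(grnn_predict(x, y, x_future, sigma=1.0))     # predictions saturate near y_max = 5.0
```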
PS Everett: https://www.dropbox.com/s/ckn7c2pmqmli4q7/1-s2.0-S2214242817300426-main.pdf?dl=0
The data are only misaligned if the temperature record shown in figures 12 and 13 is from Moberg et al 2005 (ref. 66 in Abbot & Marohasy). They are not misaligned, though, if it is from Geirsdóttir et al 2009 (ref. 33), and I think this is the case. On the one hand, the paper of Abbot & Marohasy is too ridiculous to deserve attention. On the other hand, if it is criticized, it should be done right. All the talk about Fourier transforms of the temperature reconstructions and the neural network to get an estimate of a “natural” temperature development is smoke and mirrors. The real point is that they calculate TCS from a random number and call it ECS. And that the curve fitting doesn’t start with correct data, but with scans from papers, which already introduces errors. And finally, that the 6 locations of the temperature proxies are not representative of the globe and exclude most of the anthropogenic warming, which occurred after 1970.
I do worry about software packages designed to automatically optimise powerful statistical/machine learning methods for users with little experience or, in some cases, understanding of the underlying algorithms and their pitfalls. If someone is liable to shoot themselves in the foot, it is probably a bad idea to give them a railgun. Try regularised linear (i.e. ridge) regression first, with no feature selection/construction, and use that as a baseline; you may still shoot yourself in the foot, but you are less likely to do so, and will only have made a small hole in it.
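As an illustration of that baseline suggestion (a hedged sketch with synthetic data, nothing from the paper): ridge regression with the penalty chosen by cross-validation gives a simple yardstick that any fancier model should have to beat before it is trusted.

```python
# A minimal sketch of the regularised baseline idea (my illustration, synthetic data):
# ridge regression with the penalty chosen by cross-validation.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 6))                      # e.g. six sinusoidal "features"
y = X @ np.array([0.5, -0.3, 0.0, 0.2, 0.0, 0.0]) + 0.2 * rng.normal(size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
baseline = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X_tr, y_tr)

# Any fancier model (GRNN, MLP, gradient boosting, ...) should beat this before being trusted.
print("baseline R^2 on held-out data:", baseline.score(X_te, y_te))
```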
Everett F Sargent:
Top of page:
http://www.sciencedirect.com/science/article/pii/S2214242817300426
Jorg,
Except the latter paper is simply reproducing Moberg et al. (2005), so I don’t see how this can be the case. I’ve also just looked at Geirsdottir et al. and it does still seem misaligned (their Figure 13 e).
mrooijer I think you are being a bit harsh about GRNN, there are plenty of old neural network algorithms that are perfectly usable (and indeed tend to get reinvented from time to time). Human beings do one-shot learning as well (I suspect most people would only snort wasabi once, before forming an internal representation of that being something not to be done again under any circumstances), so that sort of incremental constructive learning is still learning. Similarly kernel methods ( support vector machines and Gaussian processes etc.) are closely related to splines, so they could also be viewed as multi-dimensional smoothers that don’t really learn, per se. I’d be interested to see some applications of deep learning where there wasn’t a large amount of data available, shallow learners still have their uses for small data problems, the difficulties of which are far from solved.
Nevermind,
Go here …
http://jennifermarohasy.com/2017/08/recent-warming-natural/
Or not! But link there to open access is here …
https://authors.elsevier.com/a/1VXfK7tTUKabVA
http://www.sciencedirect.com/science/article/pii/S2214242817300426
http://ac.els-cdn.com/S2214242817300426/1-s2.0-S2214242817300426-main.pdf?_tid=dd6e6362-8903-11e7-9cfe-00000aab0f27&acdnat=1503603715_62e6d428ce8f3e93e2619e4036abb359
Like I said before … keeper. 😉
Willard that’s the Accepted Manuscript version (but will keep that one also). thx.
The Rev,
Yes, it works now, but just before my plea it went to a paywall, now it works (went to WTFUWT for JM link, then JM blog, found link at JM blog, that worked also).
Keeper! 🙂
From the OP:
“If you don’t consider the underlying physics, then you essentially know nothing about what’s causing the climate to change/vary.”
Which pretty much invalidates almost all of Steve McIntyre’s statistical gymnastics. Climateball™: it’s how you play the game.
Jörg Zimmermann says:
August 24, 2017 at 7:15 pm
” And finally, that the 6 locations of the temperature proxies are not representative for the globe and exclude most of the anthropogenic warming, which occured after 1970.”
A common theme here characterized by any number of graphs tacking on modern instrumental records.
While it is fine to add them on, the simple fact is that the data is not easily available for after 1970 and may never be available. Trees sampled in 1965 may no longer exist [particularly if doing tree rings].
Comment is best made on the facts as they are, not complaining about what cannot be.
Further, 6 locations not representative of the globe?
6 is a lot better than one. One site is the bare minimum, and if used long enough and accurately enough with modern science it could well be – what’s that word for what Zeke does? –
Adjusted to give a true picture for the world.
Well, true for that one data site.
Addendum: Nick Stokes had/has an article from a few years ago about doing a world temperature with just a few sites. Was it 3 or 20?
Mosher can give the expected temperature based on location, altitude and time of year for anywhere in the world from the data he has available. We already have potential/real available world temperatures.
For those who have not yet seen Geirsdóttir (2009) figure 13:
If we go back to Abbot and Marohasy, they describe the proxy as a temperature reconstruction, Northern Hemisphere composite (pollen, lake sediments, stalagmite, boreholes), date range 50 – 2000 AD. Geirsdóttir et al (2009) is NOT a temperature reconstruction, is from one lake in Iceland, is only lake sediment (TOC & BSi), and date range 0 – 2000 AD. Incorrect on every count.
Moberg et al (2005) would seem to be the correct citation.
Do Moberg and the Icelandic lake match up ??
Dear Richard Tol,
“Objections galore, but their out-of-sample forecasts are actually pretty good.”
Yes, we all would like to see their so-called “out-of-sample forecasts” with some p-values and R^2 even (absent autocorrelation).
Like, say, past 1950 AD: Trachsel (2010) only goes to 1945 AD; Cook (2006) only goes to 1957 AD (“Figure 6 Oroko Swamp silver pine January–March temperature reconstruction. This series show evidence for Medieval Warm Period-like epochs of above-average temperatures in the 12th and 13th centuries, followed by cooler Little Ice Age-like interval. Note that the temperatures after 1957 are scaled instrumental data”) (NZ); Figure 2 for Tasmania goes to 2001 AD (BTW, who ran the 50-year (lpf) extremely dull Sharpie(TM) through those figures? It kind of makes them hard to digitize properly). Or past 1980 AD (Moberg (2005) only goes to 1979 AD), or 1990 AD or 2000 AD (Luckman (2005) only goes to 1994 AD, Neukom (2010) only goes to 1995 AD).
Heck, how about 2017 AD or even 2100 AD??? Because, no data??? How about this: the CMIP5 AOGCMs used actual data to 2005 AD and make projections to 2100 AD.
IF A&M (2017) has any forecasting skill whatsoever, then let’s see their projections to 2100 AD and compare those to the CMIP5 2100 AD projections.
Note to self: I can find the digital data for all papers except Cook (2006), oops just found Tasmania and NZ …
https://www1.ncdc.noaa.gov/pub/data/paleo/reconstructions/pages2k/australasia2k/Mt_Read.txt
https://www1.ncdc.noaa.gov/pub/data/paleo/reconstructions/pages2k/australasia2k/Oroko.txt
I’m six for six finding the digital data!
Now to plot them up in Excel, print them out with my HP AIO, then scan them back in with my HP AIO, then use UN-SCAN-IT software to digitize all six. Obviously, that’s the only way to “reproduce” their [Snip. – Willard] 1950’s analog methodology.
hey marsup, if you like ANN I can probably get you samples when we finish
https://canaan.io/2017/05/21/canaan-raises-43-million-in-investment-round/
izen – I’m not sure about your justification for using the BSi/TOC ratio as the appropriate comparison. TOC is typically used to infer erosion, while biogenic silica accumulation is used to infer air temperature. So I think panel c is probably the better comparison to Moberg 2005.
caveat – IANAL; where ‘L’ equals Limnologist 🙂
One of the few papers:
Application of artificial neural networks in global climate change and ecological research
https://link.springer.com/article/10.1007/s11434-010-4183-3
Marohasy has also worked on regional forecasts. ANNs are being studied for regional forecasts.
Modeling Emergent Behavior for Systems-of-Systems
http://onlinelibrary.wiley.com/doi/10.1002/j.2334-5837.2007.tb02985.x/abstract
With the emergent behavior of a physical system such as the climate, ANNs may be useful.
I finally have the answer:
“Pirsig wrote “Zen and the Art of Motorcycle Maintenance” (1974) that contrasts something very Holistic, Zen Buddhism, with something very Reductionist, Motorcycle Maintenance. So the chasm between the strategies was identified a long time ago.”
https://artificial-understanding.com/two-dirty-words-6703aee8e323
The link above talks about the limits of reductionist science. Please don’t let the use of ‘holistic’ throw you. You aren’t naturally STEM; you had to learn to be that way. Perhaps that is why we call science a discipline.
Eli gives the authors too little credit in noting “They even have their own journal. ”
They have two — Quadrant pays them both handsomely as McKie Family Senior Fellows.
Ragnaar –
From your link.
Humans each solve thousands of little problems every day, and we are solving almost all these problems Holistically, using Understanding, and without a need to Reason at all. This includes fluent language use.
Two questions. First, is that statement holistic or reductionist?
Second, how do we know that fluent language use requires no reasoning and solves (language) problems holistically? Is there an assumption that reasoning must be conscious and volitional?
Holistic systems are “Model Free”. They do not use any a priori Models of any problem domain
Seems to me that fluent use of language is very much (subconsciously) model based.
“With the emergent behavior of a physical system such as the climate, ANNs may be useful.”
You can’t rule out unicorns being useful.
The test of whether or not an ANN is useful is actually specifying an important use and then demonstrating that it is useful.
For “climate”, which is INHERENTLY SPATIAL (how’s the climate in your part of the world?), an ANN model should:
A) Produce spatial results (x, y and z), because mountain climate and valley climate differ.
B) Capture the important physical states of water, not just the temperature of air.
C) Answer simple ‘what if’ questions: what does your model predict if solar forcing goes to zero?
However, if the intended “use” is just fooling yourself, then yeah, an ANN will work for long-range climate prediction.
I haven’t yet got a copy of Geirsdóttir et al 2009, but according to what I see here and in other comments elsewhere, I seem to have got it wrong – Abbot and Marohasy messed up beyond every possibility. Fig. 12 really does match Moberg et al 2005 better, and not Geirsdóttir et al 2009 at all, though misaligned and distorted. Thank you for the correction.
angech wrote “A common theme here characterized by any number of graphs tacking on modern instrumental records. While it is fine to add them on the simple fact is that the data is not easily available for after 1970 and may never be available. Trees sampled in 1965? may no longer exist [particularly if doing tree rings].”
Angech, perhaps you ought to find out a bit about how dendrochronology actually works before making such statements.
Ragnaar thanks for the links, much appreciated.
SM More into kernel methods and Gaussian processes these days, but am running a bunch of neural networks on the HPC facility here at the moment, mostly as a baseline for something else.
From Ragnaar’s link
“Holism is the Avoidance of Models”
“Neural Networks are Holistic”
err, no, neural networks are models. You don’t necessarily know how the model works, but they are still models.
Going straight from problems (inputs) to solutions (predictions) without an explicit model (that can be used to generate predictions elsewhere in the domain) is known as “transductive learning” in machine learning.
I think the article probably works better if translated into less loaded terms (e.g. “logical” for “reductionist” and “intuitive” for “holistic” forms of reasoning). The way we think (IMHO) is largely down to having a rational (logic-based) reasoning system implemented in neural network (pattern-matching) hardware (cf. Kahneman’s fast and slow thinking processes), but that has nothing to do with reductionist versus holistic views, AFAICS.
“A lot of people coming from a STEM background cannot even imagine how to solve problems without using Models. … The Holistic answer is a quick guess at the best action, based on experience with similar situations. “
Unfortunately you can’t generalise from similar situations (which are sometimes not all that similar) without some form of internal representation which encodes the key features/commonalities of the situations, and that is a model (transductive models are still models).
“Some people claim they use Reasoning while making breakfast… but they can make their breakfast while speaking to someone else on the phone and as they hang up, they find themselves suddenly sitting at the breakfast table with their coffee and hot oatmeal.”
Just because you are not conscious of reasoning doesn’t mean your subconscious is not reasoning. Note we don’t hold memories of each occasion where we made breakfast and just pick a memory to follow; we have a concept of making breakfast, which distils the key components of the related experiences, and that is a model of making breakfast.
I’ve often thought the reductionist-v-holistic thing a bit odd; they are just flip sides of the same coin. When I was at school (a very long time ago now) an external speaker came in to give a talk about existentialism, but spent most of his time talking about how the tree of life (phylogenetic trees, cladograms, etc.) is bad because it was reductionist instead of taking a holistic view of life. But that depends on which direction you traverse the tree: if it is from leaves to roots, you are looking at the connections that make the whole from the parts. I would have thought most sciences involve a bit of both as a matter of course?
What a physics based reductionist model can do
“A separate study by Stanford University and French scientists found that it snows on Mars at night. Computer modelling of the climate suggests that ‘microbursts’ of snowfall can occur on the red planet due to the cooling of cloud water ice particles during the night.
Scientists now believe that turbulent storms, which can only form at night, act to vigorously mix the atmosphere and, in some places, deposit snow on the Martian surface.
The proposed process also sheds light on the previously unexplained precipitation signatures detected by NASA’s Phoenix lander. The research was published in Nature Geoscience. “
The pushback against Mike’s trick has always bothered Eli. Since proxies are not calibrated against unicorns but against some portion of an instrumental record, they damn well better agree over the calibration and the validation regions. Extrapolating that further out makes sense.
More to the point, teh Bunny has always been bothered by the calibration region for most of the proxies being at earlier times, because as a general rule you want to calibrate where your standard is changing most rapidly, and that is the more recent side of the instrumental record (although the cut-off for the proxy will limit this). Even that would raise issues, because if the validation region is flat in changing temperatures this reduces to a single-point validation.
To continue what was said over at Rabett Run (you know the address) Eli has always wondered why papers don’t explicitly say whether each proxy was calibrated against nearby instrumental data or global data. Suspecting the latter leaves great room for skepticism of the real kind.
Since A & M apparently used data from Moberg et al. (2005) as their “Northern Hemisphere” proxy data, inspection of Moberg shows that their work combined 11 low resolution proxy series with high resolution tree ring series. In Moberg Table 1, we find that 6 of the series use 14C dating with dating errors given as between +/- 100 and +/- 200 years. Without going back to the primary references, it’s worth noting that the 14C dates for ocean sediment must include a reservoir correction, which means that a constant is added to the lab measured dates from which the date model is constructed. The date model typically assigns dates to each sample using linear interpolation along the core between samples picked for 14C measurements. This assumes a constant rate of sediment deposition, whereas temperature variation may impact the biological productivity of the waters which produce the sediment. Of course, the 14C dates after 1950 may not be valid due to contamination from the atmospheric testing of nukes.
Table 1 also provides the sample resolution for each set of data, which indicates ranges from 50 to 150 years for the Sargasso Sea core, 30 to 160 years for the NE Caribbean Sea core and 15 to 180 for the Arabian Sea core. The Shannon/Nyquist theorem says that these sample resolutions must be doubled to indicate the shortest period of a sine wave which may be fitted to these data. Combining these low resolution series with the higher ones may introduce aliasing and other distortions into the resulting series. A & M’s use of the Moberg et al. series may be extracting false frequency data as a result. Their neural network processing can not correct for dating distortions included within the original data.
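A toy illustration of the aliasing point (made-up numbers, not the actual Moberg series): a cycle shorter than twice the sample spacing masquerades as a much longer spurious “cycle”.

```python
# Toy illustration of aliasing (made-up numbers, not the Moberg data): a cycle shorter
# than twice the sample spacing shows up at a spuriously long apparent period.
import numpy as np

dt = 150.0                                  # years between sediment samples (hypothetical)
t = np.arange(0, 6000, dt)                  # coarsely sampled record
true_period = 200.0                         # a real cycle shorter than 2*dt = 300 years
y = np.sin(2 * np.pi * t / true_period)

freqs = np.fft.rfftfreq(t.size, d=dt)       # cycles per year
spec = np.abs(np.fft.rfft(y))
apparent = 1.0 / freqs[np.argmax(spec[1:]) + 1]   # skip the DC bin, find the spectral peak
print(f"true period: {true_period} yr, apparent period after aliasing: {apparent:.0f} yr")
```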
Stoopid modulz:
For the criticisms of the study, technically most of you might as well be speaking Russian, which I don’t understand. Where I was trying to go (and I often end up on a tangent) was the marrying of reductionist science with something else to deal with the emergent behavior. For some time my view of the climate has been a network. In a simple form, temperature data from measuring stations at time intervals, the more the better. The weather more or less moves from the area of one station to another as in a network. Learning the average relationship between nearby stations and specific scenario relations (such as a line of thunderstorms passing through) would not require understanding climate. Please do not take my prior remark about STEM the wrong way. My son, with little guidance from myself, has landed at the U of Maine as a TA/RA grad student and, I am told, is working on his PhD in physics. Me >> blind squirrel/acorn.
Joshua:
A child learns to walk. A machine is programmed to walk and for good measure, run the 110 meter hurdles. I agree, saying something is not a model can be difficult. Not everything is a clock which we can take apart, understand and duplicate. Some things are fuzzy.
Steven Mosher:
Assume we have two weather stations, one in a valley and one at a nearby ski resort on a mountain, each with hourly readings. The ANN would learn of their relationship. Then disconnect the mountain station for a month and predict its temperature during that month. Then see how it did? Then add two more stations nearby. Practice, refine. Agree about water, and combining different attributes. Your point C). I don’t have an answer for that.
Dikranmarsupial:
Yes, neural networks are models. A contrast approach was used. Their article could’ve been written with a softer approach. I was attempting to find something about the limits of reductionist science. There is the grid-size situation with the CMIPs and the needed computer horsepower issue.
Steven Mosher:
Mars study. Point taken. The market for Mars studies. The market for knowing the GMST and SLR. The product we sell should be…
Steven Mosher:
Assume we have two weather stations, one in a valley and one at a nearby ski resort on a mountain, each with hourly readings. The ANN would learn of their relationship. Then disconnect the mountain station for a month and predict its temperature during that month. Then see how it did? Then add two more stations nearby. Practice, refine. Agree about water, and combining different attributes. Your point C). I don’t have an answer for that.”
Yes. You could learn about the relationship between temperatures. We already do that.
The point is this: you haven’t learned WHY it’s cooler on the mountain and warmer in the valley.
And then when you have cold air drainage, that relationship gets upset. And if you try to transfer what that ANN learned about 2 sites (it didn’t learn about mountains and valleys) to other mountains and valleys with different heights and morphology, it will fail. A physics-based approach is going to give you a better and more understandable answer if you move to novel out-of-sample areas.
Steven Mosher:
Mars study. Point taken. The market for Mars studies. The market for knowing the GMST and SLR. The product we sell should be…
SLR? Well, an ANN given the current SLR will not learn about the potential for acceleration.
We expect to see acceleration in the next couple of decades because of PHYSICS. Not because of history, not because of old data, but because of physics. We can look over the past couple of hundred years and all see the roughly linear rise that an ANN would “learn”. What it won’t learn is what we need to know. The physics of heating the ocean tells us that we should expect an acceleration over time. It would be stupid to try to predict the future using a statistical model that didn’t understand this. If I feed it roughly linear data, it isn’t going to learn about physics.
For GMST: the science tells us the warming will not be uniform. To make regional plans for adaptation we need spatial prediction, not a global mean.
Reductionist and Holistic approaches are not independent, qualitatively different ways of thinking. They are complementary aspects of a unified process of how we come to understand the external world.
Reductionist and Holistic thinking are not (except as a loose metaphor) synonyms for rational and intuitive modes of thought. Or conscious and unconscious processes. I suspect we ascribe intuitive knowledge to cerebral neural nets that have ‘learnt’ a task but we are unconscious of the ‘middle layer’ rules of the net that has emerged. Often it seems to emerge where a complex, multi-factored system has to be dealt with in real-time. (fast/slow thinking?)
Holistic and Reductionist analysis are indivisible aspects of a single process involved in understanding.
To reduce some aspect of the world down to its parts, or important factors makes the implicit supposition that there is a whole that is made up of smaller parts. Reductionism only makes sense if it assumes that any system of interest is formed from smaller parts. (except quarks?)
A Holistic approach is also constrained by the same assumption that all the parts that we can understand, combine in complex ways to make larger systems of interest. (except the universe?)
Top Down, or Bottom Up, might be closer synonyms of holistic and reductionist. Both imply a continuum of explanatory levels or categories; the terms are adjectives describing in which direction you are looking at the problem.
” Eli has always wondered why papers don’t explicitly say whether each proxy was calibrated against nearby instrumental data or global data. ”
Yes. Long ago when I used to open my mouth about these things I wondered as well.
Never got a clear answer.
ANNs are an engineer’s dream: physics-free answers that can be trusted in sample. Outside… well, not so much, and like any beast the answers depend on the feed.
ER,
I’m trying to reproduce some of their results (Swiss Alpine) …
Multi-archive summer temperature reconstruction for the European Alps, AD 1053-1996
Click to access 2012-trachsel-et-al-millennium_alpine_synthesis-qsr.pdf
The early instrumental warm-bias: a solution for long central European temperature series 1760–2007
Click to access 54d4bb860cf25013d029d202.pdf
http://www.zamg.ac.at/histalp/dataset/grid/crsm.php
Some BEST data for Switzerland …
http://berkeleyearth.lbl.gov/auto/Regional/TAVG/Text/switzerland-TAVG-Trend.txt
And some CMIP5 global data …
https://climexp.knmi.nl/CMIP5/Tglobal/index.cgi?email=someone@somewhere
I need to get to 2053 AD somehow, as the NeuroSolutions Infinity trial software seems to want 1000 years’ worth. I have no idea what I’m doing with NeuroSolutions Infinity, but so far: no set of periods/frequencies, and AFAIK no way to specify/constrain periods/frequencies; the software needs all the data, though, for one proxy time series (1181-1945 AD: 1181-1830 AD training and 1831-1945 AD validation).
Garbage output has R^2~ONE though! It’s magical software.
AutoSignal is a real wiener too! Dates to circa 2009 AD: select the FFT option, get an answer; push the NL button, get a different answer; select the Lomb option, get yet another answer.
But the real question is … wait for it … Who goes into this assuming a stationary process?
https://en.wikipedia.org/wiki/Stationary_process
https://en.wikipedia.org/wiki/Cyclostationary_process
“Top Down, or Bottom Up might be closer synonyms of holistic and reductionist. Both implies a continuum of explanatory levels or categories, the terms are adjectives describing in which direction you are looking at the problem.”
Top: Bottom.
reductionist way of thinking about thinking.
see the box you are IN
you have to think OUTSIDE the box
Oops, I just thought about thinking in a reductionist way.
The metaphors about thinking are fun.
Personally I see physical models (where the objects you quantify over are posited as physical) and statistical models, where there is no necessary ontological commitment.
A physical model will depend on variables that map to the physical world.
A statistical model (say an ANN) will have entities (layers and connections) that are sufficient to explain the observables, but not necessarily mappable to physical reality.
In the end, what works, works.
There are a select few who always go around in circles. Seems to be a German specialty amongst the aged that Australian denialists are very fond of. Eli has been following this stuff too long.
Look to coral proxy data for calibration. It picks up ENSO variations very sensitively, so calibration is from 1880 to present using NINO34 or SOI. It’s also insensitive to rainfall.
Calibration of what?
Peak to peak excursions of ENSO. Compare that to trend. Standard signal processing approach.
Steven Mosher:
Can’t disagree with your reply. Your many attempts to explain how BEST does it, how others do it, has me on a new path related to ANNs.
Good Ol’ Web. Mind your sock, please.
On the topic of climate A.I.: this attempt at applying frequency analysis to proxy temperature reconstructions is not what I think of when I think of an A.I. climate system. There should be multiple expert domains in the system, like physics, biology and chemistry, with biology the most relevant to humans. I think we will have better results by building a knowledge base with neural networks (a brain), while machine learning is biased toward data acquisition and pattern recognition and should be thought of as a nervous system supplying observational and empirical information. We should expect answers to questions we didn’t ask, like pandemics (bad) and new emergent societies (good?).
Elon Musk making a move on brain/computer interface.
https://techcrunch.com/2017/08/25/elon-musks-brain-interface-startup-neuralink-files-27m-fundraise/
Question: Is reading or writing to the brain more important? If we could change a memory should we? Total recall or brain washing?
A very interesting presentation by Kevin Kelly from last month,
http://qideas.org/videos/christianity-in-1000-years-1/
Kelly might be a Greek philosopher reincarnated. Always insightful.
http://kk.org/interviews
Climate A.I. cont,
Application of Deep Convolutional Neural Networks for Detecting Extreme Weather in Climate Datasets
Click to access ABD6152.pdf
This would be an example of pattern recognition built on top of a neural network.
“This study presents an application of Deep Learning techniques as alternative methodology for climate extreme events detection. Deep neural networks are able to learn high-level representations of a broad class of patterns from labeled data. In this work, we developed deep Convolutional Neural Network (CNN) classification system and demonstrated the usefulness of Deep Learning technique for tackling climate pattern detection problems. Coupled with Bayesian based hyper-parameter optimization scheme, our deep CNN system achieves 89%-99% of accuracy in detecting extreme events (Tropical Cyclones, Atmospheric Rivers and Weather Fronts).”
Here would be my Climate Turing Test – to automagically generate a thread like this one:
https://judithcurry.com/2017/08/20/reviewing-the-climate-science-special-report/
(responses and all) and lure ClimateBall players into attributing intelligence to the collective.
A Twitter bot might be easier at first.
I like the idea of a skepbot. I’ve talked about it for years but never lifted a finger to create one. The matrix really helps toward this goal. And then you think maybe I could build an AGWbot and the two would just battle never-endingly, primarily because there is no halting mechanism.
Think about that and the halting problem.
You can’t end an argument with words. Hell, you can’t even prove a program will halt
(hence we have to use “gas” in Ethereum to halt smart contracts that run too long).
What settles arguments? Nature’s bat. Or, as Peckham argued in Explanation and Power, power puts an end to the infinite regress of argumentation. Raw state power in the end.
Bleak, I know.
Climateball needs Ctrl-Alt-Del.
Or blog wars need “gas”. Long ago on CA I toyed around with this idea, but didn’t know how you could do it. Essentially you have to create a currency system where people are rewarded for producing value in conversation and where they have to pay a transaction fee to comment.
Such discussions would be boring and fruitful, perhaps low-traffic. With Steemit we will see.
Peak-to-peak ENSO measured in what units? If you say K (or oC) you are calibrating to a temperature profile. Same issues.
Slightly off topic.
I have commented on Marohasy’s and Abbot’s competence using neural nets and the inadequacy of peer review of their work here at http://jennifermarohasy.com/2017/07/skilful-monthly-rainfall-forecasts/#comment-587036.
ENSO reduces to a modified tidal analysis, so that will be the starting point. Presenting at the AGU and will be published in a Wiley/AGU imprint book next year. Yeah!
@-geoenergymath says:
“Look to coral proxy data for calibration. It picks up ENSO variations very sensitively, so calibration is from 1880 to present using NINO34 or SOI. It’s also insensitive to rainfall.”
To some extent this is already done;
Click to access Stephans_etal_GRL_2004.pdf
Depending on the specifics of the local conditions, coral can be more sensitive to sea surface salinity than temperature. If your coral was near a river outlet it is very sensitive to rainfall.
Coral may give you the date and relative magnitude of the past ENSO cycle, but not the absolute global temperature.
http://www.sciencedirect.com/science/article/pii/S0031018213002757
Why have ANNs failed to give us the dates and magnitudes of the next ten El Nino events? We have a thousand years of proxy records to train them on.
Claims that the sequence of climate states is a product of internal cycles, or can be analysed as such with a firm prediction of future trends, might be more convincing if they also provided the future of the major cycle that does have a significant effect on climate.
Eli wrote “ANNs are an engineer’s dream: physics-free answers that can be trusted in sample.”
I don’t think many engineers even trust them in-sample ;o) Methods that give error bars that widen as you extrapolate from the data give a bit of additional confidence [sic]; this one is a Gaussian process, but you can do similar things with a neural network instead. Note that when there isn’t very much data, the in-sample (and out-of-sample) error bars are unreasonable. This is because there are two ways of “explaining” the data (a simple function with lots of noise, or a wiggly function with low noise), so whether we have confidence in-sample is conditional.
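For anyone who wants to see the widening error bars for themselves, here is a minimal sketch (my own toy example, not the model referred to above) using Gaussian process regression: the predictive standard deviation grows as you move away from the training data.

```python
# A minimal sketch (my own toy example) of error bars that widen away from the data:
# Gaussian process regression fitted to a handful of noisy points.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
X_train = np.linspace(0, 5, 8)[:, None]
y_train = np.sin(X_train).ravel() + 0.1 * rng.normal(size=8)

kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.01)
gp = GaussianProcessRegressor(kernel=kernel).fit(X_train, y_train)

X_test = np.array([[2.5], [6.0], [10.0]])           # in-sample, near the edge, far extrapolation
mean, std = gp.predict(X_test, return_std=True)
for x, m, s in zip(X_test.ravel(), mean, std):
    print(f"x = {x:4.1f}: prediction {m:+.2f} +/- {s:.2f}")   # std grows as we leave the data
```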
Again, GRNN is not a neural net. It is a simple smoothing disguised as a NN. The algorithm takes just a couple of lines of matrix algebra: it is a weighted average of the known target values, where the weights are a distance function of the observations. Nothing special, and it performs very badly compared to a modern generalized regression. For the fun of it I tested it on the Boston Housing benchmark: xgboost had an RMSE of 3, GRNN of 6, and the just-guess-the-mean algorithm has an RMSE of 9. https://mrooijer.wordpress.com/2017/08/27/what-is-a-grnn/
Jan van Rongen GRNN stands for General Regression Neural Network and was published in the IEEE Transactions on Neural Networks, which suggests it is not unreasonable to call it a neural network. Radial basis function neural networks (one of the most common types) are also essentially simple smoothers if you want to take that view (closely related to splines). It may be nothing special, and it may not work well on a particular dataset compared to more modern algorithms (it was published in 1991), but that doesn’t mean that it is not a neural network, or that it is not a reasonable thing to try. The “no free lunch” theorems suggest there is no a priori superiority of any machine learning algorithm over another, and often the only way to find out which algorithms work best is to try them and see, so it could be that this happens to be a task that matches the inductive biases of GRNN; without replicating the study we just don’t know. There are plenty of things wrong with the Abbot and Marohasy paper, but let’s keep things in perspective. Note that according to the paper, the software* also screened several more common/recent machine learning algorithms before choosing the GRNN, which include standard MLPs and the support vector machine (which is definitely a pukka machine learning algorithm if ever there were one).
* it could however be that the software is not very good and the screening procedure was flawed (e.g. using un-optimised default parameter settings)
ENSO appears to be a combination of frequency modulated and amplitude modulated factors. The one drawback of coral proxy records is that they are integrated yearly so that some of the fine detail that modern instrument-era monthly records reveal is missing.
From the Grauniad:
I think I would be happier if the editor was a little more concerned about this. He selected two independent referees who were unable to see any of the glaring flaws in this paper, and the failure of the peer review process is just as much his responsibility as it is the reviewers’.
The journal is one of those that asks the author to suggest the names of suitable referees, so I hope “independent” means “not suggested by the authors”. Decent journals really should never [have to] do that, it is a recipe for pal-review. If the action editors don’t know the field well enough to identify competent referees for a paper then either they are not experienced enough to act as action editors or the paper is outside the remit of the journal. This seems to be a common factor in journals that have published bad climate skeptic papers.
dikran,
I saw that comment from the editor. I’m tempted to email and ask why he didn’t scrutinise such a potentially paradigm shifting paper in more detail.
Indeed, I think I’d ask whether either of the referees were suggested by the authors (not who they are, just whether they are completely independent).
Judging by his Google scholar profile and his departmental web page he is perhaps at too early a stage in his career to be an action editor for a journal (IMHO) and GeoResJ seems to be his first editorial appointment so perhaps some of this is attributable to inexperience and lack of support.
I’ve never been the editor of a journal, but I have edited a special issue of a journal and finding suitable referees for papers requires a broad knowledge of the field as the paper may not be that close a match to your own particular research interests, but I suspect it gets easier the longer you work in a field and go to conferences (where you see a broader spectrum of research) etc.
Something doesn’t quite add up…
The journal editor who handled the Abbot paper, Dr Vasile Ersek, of Northumbria University in the UK, said he was “sorry to see it involved in a controversy”
Seems to me that anyone even remotely familiar with the science in play would have to have anticipated that the paper would be enormously controversial. If a journal publishes an article with conclusions that stand conventional science on its head, I would think that the editors are intentionally inviting controversy – as well they should if they have examined the veracity of the work and stand by it – and would not be “sorry” about the resulting controversy.
Of course, arguing from incredulity is fallacious…. but IMO the editor’s statement seems implausibly naive.
Readers here probably don’t follow linguistics. There is a very entertaining set of posts and threads at the Language Log blog in which linguists torture the neural network that operates Google Translate with meaningless repetitive strings of say Thai characters. The aim is to get the best Dada poetry. The net obliges.
If you’ll permit me to put my UK based nasty classist hat on, being appointed a senior lecturer when you are only around 35 suggests that either you genuinely are a whiz kid, in which case you should have enough capability to realise this is a junk paper, or else the university was desperate to fill a position, and anyway, the newer UK universities are very very variable, and some of them shouldn’t be universities at all.
For you foreigners, the University of Northumbria is based in Newcastle, which is in Northumberland, has about 316,000 people in it and lots more sheep and cattle. The metro area of Newcastle is better at around 1.6 million, but it really is somewhat isolated and this is not a well established university with good roots in the sciences etc.
guthrie,
We are currently in a phase where people are getting promoted a bit younger than used to be the case. We have a number of people who’ve become full Professors before turning 40. I would be reluctant to judge someone on the basis of the age at which they became a more senior academic.
ANNs have been widely used in tidal prediction and tsunami forecasting, where even with missing data, they can predict hourly, daily, weekly or monthly tidal level accurately.
I don’t use an ANN but a nonlinear solver with input of the yearly signal mixed with the two primary long-period lunar tides to accurately model ENSO. Using an ANN would have probably worked but I had the physical equations worked out via Laplace’s tidal equations and just plugged those into the solver.
@dikran – I can read that too, but we just disagree. On her site JM calls it “using the latest big data techniques”. As you say, this 1991 GRNN algorithm is anything but that. As she calls it (one of) the latest techniques, I feel that
(a) we should point out that it is not, and
(b) it in fact has none of the properties an ANN is thought to have in the many comments above, and
(c) GRNN in general performs very poorly compared to the actual latest big data techniques.
@guthrie IIRC I was made a senior lecturer at 35ish – but then I had been in post for a decade by then.
@Jan ‘On her site JM calls it “using the latest big data techniques”’ yes, that is laughable, on at least two counts, which is not bad for six words! ;o)
I don’t think the term “neural network” is well enough defined to exclude GRNN (at least without also excluding RBF neural networks, which are probably the second most common variety).
“grNN in general performs very poorly compared to the actual latest big data techniques.”
I’m not aware of a thorough* empirical comparison that really establishes that – the paper has 3298 citations (according to Google scholar) which suggests there are some applications where GRNN is a good fit to the requirements (good predictive performance is not the only requirement, sometimes not even the most important). Machine learning practitioners need to be open minded (except when it comes to fuzzy ;o) and have a big toolbox at their disposal.
* IMHO machine learning tends to be very weak in the empirical evaluation of methods, using too few datasets, too few replications, inadequate model selection (often introducing “operator bias” – the author of the paper being better at driving their method than the baseline methods). I suspect that is why there are hype-boom-crash cycles so frequently (the NN boom in the late 80s and early 90s being a good example).
A couple of my lecturers/ supervisors were permitted to sit down before 40. One was definitely canny enough to avoid bad politics. The other, it didn’t matter because there aren’t really any in inorganic chemistry focused on fuel cells, it’s more important that you raise grant money, which is a sort of politics but not the same as the wider climate change issues.
So anyway, my issue is not so much with youthful appointments, but more with appointing people who seem a bit naive and inexperienced. And then somehow they end up as journal editors, however that works.
“And then somehow they end up as journal editors, however that works.”
IMHO it is quite simple, there are too many journals for the supply of experienced academics to serve as reviewers, never mind editors.
We sure need more editors than we need auditors. At least until ClimateBall players get replaced by autocorrective Siris.
Bookmark this site to keep up with news on A.I. and advanced computing technology.
http://www.ciodive.com/
Sample news byte:
Microsoft’s Project Brainwave promises real-time AI in the cloud
“capable of 39.5 teraflops of machine learning tasks with less than a millisecond of latency.
This new platform is an important step to achieving AI capable of mirroring the way humans think and enabling AI and machine-based systems to complete previously impossible tasks.”
I’m not inherently opposed to the use of statistical models in place of physical models if their predictive performance can be validated in some way. That doesn’t seem to be what’s going on here.
They don’t mention any cross validation, but they do talk about training periods and test periods. How this would normally work is:
1) Fit a model on the training period
2) Use the model to generate predictions for the test period
3) Compare predictions vs actuals in the test period to assess how good your model is.
What they instead appear to do is:
1) Fit a model on the training period
2) Assume it is a perfect model
3) Ascribe any error in the test period to the influence of carbon.
I can’t locate any discussion of why we should believe (2).
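For what it’s worth, here is a minimal sketch (toy synthetic series, my own illustration) of the standard protocol in steps 1–3 above: fit on a training period, predict a held-out test period, and report the test error, rather than assuming the fit is perfect and attributing the whole test-period residual to CO2.

```python
# A minimal sketch (toy data, my own illustration) of the standard train/test protocol.
import numpy as np

rng = np.random.default_rng(0)
years = np.arange(1900, 2001)
temps = (0.01 * (years - 1900)
         + 0.1 * np.sin(2 * np.pi * years / 11.0)
         + 0.05 * rng.normal(size=years.size))       # synthetic stand-in for a temperature series

train = years < 1970                                  # 1) fit on the training period only
coeffs = np.polyfit(years[train], temps[train], deg=1)

pred = np.polyval(coeffs, years[~train])              # 2) predict the held-out test period
rmse = np.sqrt(np.mean((pred - temps[~train]) ** 2))  # 3) measure skill on the test period

print(f"test-period RMSE: {rmse:.3f}")                # this, not assumed perfection, measures skill
```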
jacob,
Indeed, and that is – I think – a key point. There may well be cases where it would be suitable to use a statistical model, especially if you are confident you can somehow train it to reproduce how you expect the system to behave. As you say, they appear to have done no such thing.
“They don’t mention any cross validation”
Lots of cross-validation on the ENSO model. The instrumental record is long enough that one can do cross-validation comparing 70 year intervals. Training on one interval will give a highly significant validation test on the other interval. And the same occurs for the reverse. No influence on AGW observed, as would be expected since ENSO has nothing to do with GHGs and energy imbalance.
This is much the same process as used in tidal prediction programs.
Note cross-validation is not a panacea, if you make too many model choices dependent on the cross-validation performance, you will end up over-fitting the cross-validation data instead of/as well as the training data (see my modest contribution). A&M make a lot of model choices!
Optimisation is the root of all evil in statistics/machine learning. If you optimise your model, you introduce the possibility of over-fitting whatever criterion you are optimising. A better approach is to marginalise (integrate over model choices) in a Bayesian manner, however that requires a prior over choices and is often difficult/expensive.
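A toy demonstration of that over-fitting-the-selection-criterion point (my own illustration, on pure noise): if you search many candidate feature combinations for the best cross-validation score, the winning score is optimistically biased relative to genuinely fresh data.

```python
# Toy illustration (my own, pure noise): over-fitting a cross-validation selection criterion.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X, y = rng.normal(size=(60, 50)), rng.normal(size=60)             # pure noise: no real signal
X_fresh, y_fresh = rng.normal(size=(60, 50)), rng.normal(size=60)

best_score, best_cols = -np.inf, None
for _ in range(500):                                              # many "model choices" tried
    cols = rng.choice(50, size=3, replace=False)
    score = cross_val_score(LinearRegression(), X[:, cols], y, cv=5).mean()
    if score > best_score:
        best_score, best_cols = score, cols

# The selected CV score is the maximum over many tries, so it is optimistically biased;
# evaluating the chosen model on fresh noise gives a much lower (typically negative) score.
model = LinearRegression().fit(X[:, best_cols], y)
print("best cross-validation R^2:", round(best_score, 2))
print("R^2 on fresh noise:       ", round(model.score(X_fresh[:, best_cols], y_fresh), 2))
```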
Focus people!
9/1/2017
Russian president Vladimir Putin spoke about the potential power of artificial intelligence Friday, saying “the one who becomes the leader in this sphere will be the ruler of the world.”
China is committed to massive investment in genetic engineering, artificial intelligence and biotechnology. By 2020, China’s AI technologies and research facilities will match other leading countries, said Li Meng, the vice minister of science and technology. Five years later, he expects “a big breakthrough,” and then China should finally become the global “innovation center for AI” by 2030. The remarks at a press conference expand on a policy statement released by China’s State Council in July, which set out goals to build a domestic artificial intelligence industry worth nearly $150 billion in the next few years. Eight out of nine members of the Politburo Standing Committee of the Communist Party of China have engineering degrees.
But we have Trump!?
Pingback: Är jag en klimatjihadist? Mörner och klimatkänsligheten, del I – Maths Nilsson, författare