Research Integrity

Stephan Lewandowsky and Dorothy Bishop (whose blog I used to read quite a lot, but haven’t for a while) have published a comment in Nature about Research Integrity, arguing that we shouldn’t let transparency damage science. It’s a complex issue, but I think they make some interesting points, and a number of the usual suspects kindly turned up in the comments to illustrate some of what they were trying to suggest. The same usual suspects were most put out when some of their comments were later deleted.

The key issue, in my view, is that everything necessary for the results of a study to be evaluated and reproduced should be made available. However, that is not the same as making every single thing associated with a particular study available to anyone who asks; it should simply be possible for someone else to check and reproduce what’s been done before.

I can’t speak for other fields, but in my own field most data is either publicly available, or will soon be publicly available. Most methods and techniques are well understood and there are often resources available so that you don’t necessarily have to write your own analysis codes. Most computational models are also publicly available, or something equivalent is publicly available. So, if someone wants to check a published result they simply have to get their hands dirty and do some work. That doesn’t mean that the authors of the original work shouldn’t answer questions, or clarify things; it simply means that they shouldn’t be expected to simply hand over everything that they’ve done simply because someone asks for it. If anything, if someone is incapable of redoing the work themselves, then they probably aren’t in a position to critique it in the first place.

I am, however, certainly not suggesting that researchers should refuse to hand over more than is strictly necessary. There’s no real reason not to be reasonable and, in many cases, the requests are themselves entirely reasonable. On the other hand, scientific understanding progresses via people actually doing research (whether it’s new, or an attempt to check another result), not via people sifting through other people’s work looking for possible errors.

Of course, this is my view based on my own experiences and what is the norm in my own field. It may well be different in other fields and may be different in other circumstances. Maybe when human subjects are involved, or when the results are particularly societally/politically relevant, we should expect more. On the other hand, if research is fundamentally about gaining understanding, maybe we should simply trust the scientific method. We shouldn’t trust scientific results simply because they’re published by people who we trust and regard as being experts in their field. We also shouldn’t distrust scientific results simply because we don’t trust those who did the work, or because we don’t like the result. We start to trust a scientific result when it has been replicated and reproduced sufficiently. That requires doing actual work, not simply checking what others have done so as to try and find mistakes.


183 Responses to Research Integrity

  1. John Mashey says:

    Note that replication does not mean merely running the same code on the same data and getting the same result 🙂

  2. Magma says:

    I thought the comment was generally quite reasonable, and the ‘red flags’ and ‘double-edged swords’ sidebars succinct and useful.

    Data confidentiality in the case of participants’ identities is long-standing with respect to studies in medicine and health, and sometimes applies with respect to data obtained by agreement with third-party organizations in the natural sciences. (The latter situations have to be judged on a case by case basis for publication.)

    However, I am strongly opposed to the implied suggestion that data release may be screened in case it might be ‘misused’:
    Even when data availability is described in papers, tension may still arise if researchers do not trust the good faith of those requesting data, and if they suspect that requestors will cherry-pick data to discredit reasonable conclusions.

    This is simply a risk that researchers have to take. Not all reanalyses are as faulty as the “no warming since last Tuesday!” ones we are all familiar with.

    Note: I was a volunteer abstract classifier in the Cook et al. study. I think it was an imperfect but useful study. I also wonder at times if Lewandowsky is running a larger meta-experiment.

    Psychologist: We’ve observed phenomenon ‘X’ in social media settings.
    [Ensuing flood of angry responses demonstrating phenomenon ‘X’.]

  3. Michael Lloyd says:

    In many countries there will be over-arching FOI legislation covering requests for information. Researchers cannot be expected to be expert in complying with such legislation and, therefore, it is imperative that they refer any requests they are leaning towards refusing, in whole or in part, to the appropriate professional staff in their institutions.

    I do not disagree with the scientific points you are making but I am concerned with sentences such as this:

    ” it simply means that they shouldn’t be expected to simply hand over everything that they’ve done simply because someone asks for it.”

    Researchers should not go it alone on this. The CRU learnt this lesson the hard way, and to quote Santayana:

    “Those who cannot remember the past are condemned to repeat it.”

  4. JM,
    Exactly.

    Magma,

    However, I am strongly opposed to the implied suggestion that data release may be screened in case it might be ‘misused’:

    Yes, I do agree. I think the default should be that the data be made available.

    Michael,

    I am concerned with sentences such as this:

    ” it simply means that they shouldn’t be expected to simply hand over everything that they’ve done simply because someone asks for it.”

    Researchers should not go it alone on this. The CRU learnt this lesson the hard way, and to quote Santayana:

    I’m certainly not suggesting that anything necessary should be withheld. I was thinking more along these kinds of lines. I mostly do simulations. Let’s say I run a suite of simulations, analyse the output, and publish a paper. Someone then contacts me because they would like to check my results. My first step would probably be to point them to where they can download the code and the documentation. Then they come back and say “I don’t have a suitable computer, can you give me some of your output?”. So, I give them the output. Then they say “I don’t know how to read that output”. You then send them some code for extracting information from the simulation output and explain how to use it. They then come back and say “Okay, I got the output from the simulations in a form that I can use, but I don’t know how to actually analyse it”. You send them some more code that allows them to actually analyse the output from the code. Etc., etc. At what point is it reasonable to simply stop and say “sorry, but I can’t help you anymore”?

    Of course, in some cases all of the above should be available up front. In other cases, it is so trivial that anyone familiar with the topic should be able to do it themselves. So, I’m simply suggesting that researchers are not responsible for ensuring that others have the requisite knowledge and ability to check their output; they’re only really responsible for ensuring that it is possible to do so.

  5. In economics, the norm is that data and code are available.

  6. Richard,
    And if you’d read my post, you would have noticed that in my field the norm is also that data and code are available.

  7. Although, maybe I should add that in my field if I take some output from a code and analyse it in some way (multiply, add, subtract, divide, integrate, …. various output variables) to produce a figure, we would typically expect anyone who wants to check the result to be able to do that calculation themselves.

  8. Michael Lloyd says:

    “At what point is it reasonable to simply stop and say “sorry, but I can’t help you anymore”?”

    When you get to this point or look like you could get to this point, you should get professional advice.

    Actually, the above is an interesting issue. I’ve, on a number of occasions, provided data for others. In all cases it’s been because they simply wanted to do something with it, so maybe it’s not directly comparable. However, in the cases where I simply hand over the data, I’ve never been, or expected to be, an author on the resulting paper. In the cases where they actually needed some extra analysis of the data, I have been. So, what happens if someone is explicitly trying to check the results in your paper and, to do so, requires a great deal of help from the original authors? In my field, that would typically mean that the original authors should be authors on the paper. It would, however, be rather odd if the resulting paper were highly critical of the original result. That might suggest that those checking the original paper should be ensuring as little influence as possible from those whose work they’re checking.

  10. When you get to this point or look like you could get to this point, you should get professional advice.

    Ahh, yes, I see what you mean. I hadn’t thought of that, but I completely agree and it is rather obvious in retrospect.

  11. Richard,
    I should probably have added, that in my field we don’t typically call calculations using Excel a code 🙂

  12. Pingback: Risparmi - Ocasapiens - Blog - Repubblica.it

  13. Hyperactive Hydrologist says:

    In some fields data is often obtained through third parties and certain permissions are therefore needed to freely release that data. I think this was the problem with some of the data the CRU used in its analysis. This is often a problem with hydrological and meteorological data, particularly in the UK and certain parts of Europe.

  14. HH,
    Yes, I realise that that is one issue. It’s a tricky one. Someone says “I’ll give you my data/code as long as you don’t pass it on to anyone else”. Do you publish a paper knowing that you can’t release everything, or do you simply do nothing? Ideally, the third party shouldn’t have imposed the condition in the first place, but it’s not a trivial issue, especially if the third party won’t budge.

  15. “I should probably have added, that in my field we don’t typically call calculations using Excel a code :-)”

    To be fair, I believe FUND is publicly available and free of charge.

  16. Hyperactive Hydrologist says:

    Or you say the data is available from a certain third party and if someone wants to check your result they would have to request the data themselves. In hydrology if you only used freely available datasets there would be no new research.

    Likewise for code this is often seen as intellectual property and I don’t believe it needs to be made freely available. If the methodology is explained in the paper then, with the right data set, it should be reproducible.

  17. To expand a little on Excel (and I realize your comment was merely an off-hand dig at RT) it is a very powerful yet easy to use tool. In my work I use it to do Monte Carlo simulations to better understand the propagation of measurement uncertainties with different distributions. Different assumptions can be tested in a few seconds to check significance.

    The benefit is that not only do I not need to learn R (or a similar statistical program), but those who need to learn from and/or review my work don’t need to learn R either. They almost always already know Excel and can quickly and easily check out the formulas being used and/or change parameter values to explore their own scenarios.

    I’ve especially found it helpful in determining sensitivity coefficients that are not intuitively obvious and would otherwise require hours of research and study.
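    A minimal sketch of the kind of Monte Carlo uncertainty propagation described above, written here in Python rather than Excel; the measurement model, distributions and parameter values are purely illustrative assumptions, not taken from the commenter’s work:

        # Hypothetical measurement model y = a * b / c; all values and
        # distributions below are illustrative assumptions only.
        import numpy as np

        rng = np.random.default_rng(1)
        n = 100_000

        a = rng.normal(10.0, 0.2, n)          # normally distributed input
        b = rng.uniform(4.8, 5.2, n)          # rectangular (uniform) input
        c = rng.triangular(1.9, 2.0, 2.1, n)  # triangular input

        y = a * b / c                          # propagate through the model

        print(f"mean = {y.mean():.3f}, standard uncertainty = {y.std(ddof=1):.3f}")
        print("95% interval =", np.percentile(y, [2.5, 97.5]))

    Changing one distribution or parameter and re-running gives the quick sensitivity checks the comment describes.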

  18. HH,

    Likewise for code this is often seen as intellectual property and I don’t believe it needs to be made freely available. If the methodology is explained in the paper then, with the right data set, it should be reproducible.

    Yes, I tend to agree. In fact, in some cases, the process of developing the code is an important part of understanding the topic.

    oneillsinwisconsin,

    To expand a little on Excel (and I realize your comment was merely an off-hand dig at RT) it is a very powerful yet easy to use tool.

    Yes, it was just a bit of a dig. To be fair, Richard does seem to make everything available and I wasn’t actually thinking of FUND. The only time I’ve downloaded anything of Richard’s it was in Excel. You might think I was being snobbish, but I use Fortran, so that would seem somewhat ironic if I was 🙂

  19. @HH
    You should check the MIT License, a clever bit of legal thinking worthy of its name.

    Papers are rarely sufficiently detailed to recreate the code. Releasing code is therefore a key part of replicability.

    @ONeill / Wotts
    The language should be suited for purpose. I’ve released code in Excel, TurboPascal, Modula2, C+, C#, Java, Julia, Matlab, Gretl, GEMPACK, and GAMS. Working on Stata.

  20. Richard,

    Papers are rarely sufficiently detailed to recreate the code. Releasing code is therefore a key part of replicability.

    Then they’re bad papers. Given that I’ve on more than one occasion developed codes by simply working through other papers, I disagree with the suggestion that “papers are rarely sufficiently detailed to recreate the code” – in my field, at least.

    The language should be suited for purpose.

    Of course, I wasn’t really suggesting otherwise.

  21. > The language should be suited for purpose. I’ve released code in Excel, TurboPascal, Modula2, C+, C#, Java, Julia, Matlab, Gretl, GEMPACK, and GAMS. Working on Stata.

    You forgot GREMLINS.

    What purpose serves that language?

    ***

    There are better ways than FOIAs:

  22. John Mashey says:

    Around climate science, a lot of people demand code and criticize mainstream researchers if they can’t get it, even if they actually couldn’t use it or even understand it. (“Simulation code is hidden!!” “Q: how well do you understand F90?” “F90??”)

    I’d say a good rule of thumb is:
    we’ll consider the request after you demand and get the UAH satellite code published, which:
    1) has not been very well documented publicly
    2) has often proven buggy
    3) and is treated by some people as the *only* correct representation of Earth’s temperature since 1998, ignoring all conflicting data.

  23. When I published my paper on benchmarking statistical homogenization methods, I got several questions about the availability of the testing data. As soon as I told them the URL they all lost interest, most instantaneously.

    A rare occasion for me to disagree somewhat with John Mashey.

    ” replication does not mean merely running the same code on the same data and getting the same result”

    This is not replication, but it is “reproducibility”, which is a very useful constraint for scientific computation.

    Running the same code on the same data and getting the same result is far less trivial than it might appear, and a capacity for doing so is enormously beneficial to a productive workflow. Yet it’s far less commonly attended to than one might expect.

    The usual person doing such replication is the original researcher or a collaborator. Legitimate competitors ought to be encouraged as well. The use case of the internet troll looking to nitpick good science to death is what we are worrying about here. It’s sort of a case of the terrorists winning. We can’t abandon legitimate advances in the scientific process because of the possibility that they can be abused.

    I agree, though, that responsibilities have to be symmetric. Critics and gadflies need to be as open with their discourse as they expect the mainstream scientists to be.

  25. Mitigation sceptics have harassed Phil Jones by a campaign where everyone would ask CRU for the data for 5 random countries and the sharing agreements for those 5 countries. That is pure harassment and it is completely reasonable not to give in to that kind of abuse.

    I think it was the Nature article that mentioned that sometimes patients were promised their data would only be used for a certain purpose. If possible we should try to refrain from such promises, but once you have made them, you cannot simply put the data on the web or share them with unethical people. Sometimes the full raw dataset allows people to be identified. That would also preclude sharing everything.

    In the case of meteorological and hydrological research, the organizations gathering the data often want to sell the data to recoup some of the investment. It is getting better and I hope this goes away. I would advocate a global climate treaty agreeing that everyone shares all environmental data but, as long as that is the case, it is unfortunately not always possible to share data.

  26. MartinM says:

    Mitigation sceptics have harassed Phil Jones by a campaign where everyone would ask CRU for the data for 5 random countries and the sharing agreements for those 5 countries. That is pure harassment and it is completely reasonable not to give in to that kind of abuse.

    Yes. And what that case established beyond doubt is that it doesn’t matter how much you do make available, a sufficiently unreasonable person can always find something to complain about. 98% of your data is publicly available? Your algorithm is described in publicly available papers, clearly enough that anybody half-way competent can code up their own version in an afternoon? Not good enough, apparently!

    Which is not to say that it’s not worth making things available, of course. It certainly is. But when doing so, trying to appease the cranks is a waste of time and effort.

  27. Andrew dodds says:

    OK, since I not only write code for a living but also have to work out what people have done with that code..

    I would not want the code itself – it’s seriously pointless upfront. I would want a very, very good description of what the code should do if it works, so to replicate I would want a description to follow. Trying to work out whether nontrivial code implementing a complex algorithm is doing it right by inspection is close to impossible to do directly. Reimplementation is faster. You only want the code when you have a definite discrepancy.

    So.. Release code to people who can demonstrate good reason to question it. Otherwise, they don’t need it.

  28. Andrew Dodds, no, I disagree.

    This whole conversation builds on the absurd idea that the only people who want to examine your work want to destroy it, not build upon it. The first step is to build the code and replicate extant results. That prevents a whole lot of wheel-reinvention.

    Finding bugs is not the normal purpose of replicating results, but it can be a useful side effect.

    The climate science world has become so bizarre that we expect people who are looking at our work to be trying to viciously, and even maliciously and falsely, denigrate it. That is not a usual use case, and we shouldn’t base our actions and policies upon it to the exclusion of an appropriately collegial approach to science, which is very much served by due attention to facilitating replication.

  29. MT,

    This is not replication, but it is “reproducibility”, which is a very useful constraint for scientific computation.

    Okay, I may need to make sure I understand your point, but as I understood John Mashey’s point it was taking exactly the same code with exactly the same initial conditions. Doing that would, I think, tell us nothing. What I think you mean is someone else developing the same code and then testing it by using the same initial conditions as the other code. That, I agree, has value and is indeed something I’ve done myself. Someone simply taking the code that I wrote (for example) and running it with exactly the same conditions is what I think is of little value. Of course, if they were then going to do something new with the code, and were simply learning how to use it, that would have value, but as a test of what I’ve done, it doesn’t really.

  30. Andrew dodds says:

    Mt –

    Yes.. But from a development level, trying to expand someone else’s code is hard – very, very hard. No matter how well written. So the question is; how do I know that what I have changed has not fundamentally broken what was there before. And the only way to answer that question is to basically reimplement it (or do the equivalent test set).

    Module reuse is fine – but you don’t need source code for that.

  31. But from a development level, trying to expand someone else’s code is hard – very, very hard.

    In my experience, this is not that unusual. Partly, that’s because quite a few of the codes available are general purpose hydro and MHD codes, and so they are partly designed so that people can implement new routines. As long as you don’t fiddle with the core part of the code (the bits that solve the hydro/MHD equations) you’re normally pretty safe, as long as you do then test it carefully.
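    A minimal sketch of that extension pattern (hypothetical names, not any real hydro/MHD code): the core solver is left untouched and user routines are registered and called by it, which is also why the added routine still needs careful testing:

        # Hypothetical plug-in pattern: users add physics without editing the core.
        extra_physics = []

        def register(routine):
            """Register a user-supplied source term; the core calls it each step."""
            extra_physics.append(routine)
            return routine

        @register
        def toy_cooling(u, dt):
            # Illustrative user routine: simple linear cooling term.
            return -0.1 * u * dt

        def core_step(u, dt):
            """Stand-in for the untouched core solver."""
            # (the core hydro/MHD update would go here)
            for routine in extra_physics:
                u += routine(u, dt)
            return u

        u = 1.0
        for _ in range(5):
            u = core_step(u, 0.1)
        print(f"u after 5 steps: {u:.4f}")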

  32. John Hartz says:

    ATTP: Kudos on your OP and to all the commenters who are engaging in a very civil discourse. I for one am learning a lot from this.

  33. John Mashey says:

    MT (hmm, “teaching grandmother to suck eggs” :-), an aphorism I never understood, having grown up on a farm with chickens, and I never saw Grandma suck eggs.)

    I don’t think MT and I really disagree …
    a) My comment of course was (perhaps too subtly) directed at the Wegman Report, of which Joe Barton (R TX) said:
    “I have never met Dr. Wegman. We asked to find some experts to try to replicate Dr. Mann’s work” Search for “replicat” in Strange Scholarship in the WR.
    Of course, as shown by Deep Climate in “Replication and due diligence, Wegman style”, Wegman and team did nothing more than rerun McIntyre’s code, and then there is “Ed Wegman Promised Data to Rep. Henry Waxman Six Years Ago – Where Is It?” Of course, it’s no wonder Wegman refused to provide it.

    b) Reproducibility of results from code+data is something different, and oddly something for which I have some experience, starting from the late 1960s. For example, people wrote Fortran programs for research, and rerunning them would (generally) produce the same results. However, it was really, really a good idea to run at least shorter runs via WATFIV, which checked for undefined variables and out-of-range subscripts. I suspect many papers were published based on runs that would have blown up under WATFIV 🙂

    More seriously, reproducibility was one of the challenges we faced ~1988 when we started the SPEC Benchmarks.
    Programs were compiled and then run with specific input, and output compared to a reference.

    a) Some kinds of programs+input data must give bit-for-bit-identical answers, easily checked.
    Those are clearly reproducible across platforms, and lack of such generally implies compiler or library bugs, hence a good Q/A check.

    b) Some kinds of (floating point) programs give slightly-different, but still useful answers, without drastic differences in operation counts, arising from differences in low-order bits. Those can be handled via fuzzy compares that allow for small differences. For example, IBM S/360, VAX and IEEE floating point are not identical. Neither are 64-bit vs 80-bit. Finally, it is generally useful to have a floating multiply-add, which is faster and preserves precision, but even on the same machine, if separate multiply and add operations are used, the results can be different.
    For many programs, the differences are irrelevant, but some can be very sensitive, and all this was pretty well understood ~25 years ago.

    c) Occasionally, this mattered. Timberwolf was a benchmark we really wanted to use, as it was substantial code and data, doing simulated annealing for chip layout. Unfortunately, it used a tiny amount of floating point, and made decisions based on low-order bits. Every platform could compile and run it, but we got at least 3 distinct (and wildly-varying) iteration counts. Amusingly, one vendor had machines with 2 different CPU types, A and B; A agreed with another vendor’s type A, and B agreed with a third vendor’s type C. All results were reasonable, all generated feasible solutions, with close agreement on total chip real estate, so in some sense the results were reproducible, but we couldn’t use it as a fair performance benchmark.
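    A minimal sketch of the “fuzzy compare” idea in (b), in Python; the tolerances and numbers are illustrative, not the actual SPEC comparison rules:

        # Results that differ only in low-order floating-point bits are treated
        # as the same answer; a bit-for-bit comparison would reject them.
        import math

        def fuzzy_equal(reference, result, rel_tol=1e-9, abs_tol=1e-12):
            """True if result matches reference to within the given tolerances."""
            return math.isclose(reference, result, rel_tol=rel_tol, abs_tol=abs_tol)

        # The same sum evaluated in two different orders differs in the last bit.
        a = (0.1 + 0.2) + 0.3     # 0.6000000000000001
        b = 0.1 + (0.2 + 0.3)     # 0.6
        print(a == b)             # False: exact comparison fails
        print(fuzzy_equal(a, b))  # True: accepted by the fuzzy compare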

    Finally, in the 1990s I used to collaborate with SAS Institute, whose (floating point) code needed to produce *identical* results across a wide variety of platforms. That is very, very hard work. They always wanted source code for all math libraries they used.

    Anyway, I think replication and reproducibility overlap, but are not the same.

  34. snarkrates says:

    A historical digression by way of illustration:

    When Edward Teller invented the H bomb, it was a huge, complex and unreliable system the size of a building. It had to be towed on a barge to the site of the test. In no way was it a workable weapon. However, because the US had gotten a lot better at finding Soviet spies, the Russians did not know this. They thought the US had developed a deliverable H bomb, and dry cleaning bills for trousers increased.
    Andrei Sakharov racked his brains for a solution. Finally, he came up with the idea of using thin layers of lithium deuteride and lithium tritide–the layer cake design. It worked, and the Russians briefly had a deliverable hydrogen bomb when the US had bupkes.

    The scientific method does not require a researcher to release absolutely everything and then hold the hand of every moron who doubts the result. Nor is that advantageous. And as the story above illustrates, the first solution presented might not be the best, and it could stifle the creativity needed to develop better solutions down the road.

    I am all for releasing raw data–at least after a certain point that allows the folks who took the data to do their best with it. I am all for releasing a thorough and clear description of the methodology that allows it to be judged and replicated and preferably improved upon. I’m all for making well written code freely available to colleagues within your research group so they can check it for errors and suggest improvements. I see little value in releasing code outside the research group unless skilled competing groups have had trouble replicating the result. It can wind up perpetuating errors rather than correcting them. I also don’t see much value in releasing processed data for the same reason. A skilled researcher will be able to replicate processed data, if not improve upon it, provided the data are correct.

    Replication is part of the scientific process. Repetition is not. Auditing is not. In 20 years, despite all his protestations and all the trouble he has caused, the [Mod : Auditor] has not advanced the understanding of Earth’s climate one iota. Maybe if he spent some time actually developing analyses of his own rather than attempting to nitpick to death the analyses of others, he might have actually made a contribution to science.

  35. John Mashey says:

    “So the question is; how do I know that what I have changed has not fundamentally broken what was there before. ”

    While it is no panacea, this is why serious software engineering product groups spend much energy creating regression tests, and sometimes elaborate software and even hardware test harnesses. For example, back at Bell Labs in the early 1970s, others in my department built UNIX-based software that could test mainframes, i.e., by simulating multiple people sitting at IBM 3270 terminals.

    The extent to which this makes sense depends on what you are trying to do, ranging from “serious consequences if the new release blows up” down to “code for this one paper”.
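    A minimal sketch of the regression-test idea, with a hypothetical analysis routine and a made-up stored reference value: after any change, the stored answer is re-checked so you know you haven’t broken what was there before:

        import math

        def column_sum(values, dz):
            """Toy analysis routine under test: integrate values over height."""
            return sum(v * dz for v in values)

        # Reference output recorded from a trusted earlier version of the code.
        REFERENCE = 15.0

        def test_column_sum_regression():
            result = column_sum([1.0, 2.0, 3.0, 4.0, 5.0], dz=1.0)
            assert math.isclose(result, REFERENCE, rel_tol=1e-12), (
                f"regression: got {result}, expected {REFERENCE}")

        test_column_sum_regression()
        print("regression test passed")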

  36. Windchasers says:

    This whole conversation builds on the absurd idea that the only people who want to examine your work want to destroy it, not build upon it. The first step is to build the code and replicate extant results. That prevents a whole lot of wheel-reinvention.

    Reinventing the wheel is part of the point. Or at least, rebuilding the wheel with your own hands and tools, to show that it’s being done correctly.

    Get in there, get your hands dirty, learn how it works and how it’s done. And with that experience, you’ll gain the expertise necessary to critique how the wheel was built, and you’ll maybe learn how you might improve the process.

  37. anoilman says:

    You need to be an oceanographer from a US ally, with Top Secret clearance, to get full access to the Levitus database. So… no.. not available, and I’d imagine that hanging out with strange men with money (GWPF, et al) wouldn’t exactly endear you to the US intelligence services. In fact the raw data-set would give you access to ships that aren’t there. So let’s share it for science?

    Oh look here’s Levitus! He must be in cahoots with Al Gore! 🙂 Its a conspiracy!
    https://www.nodc.noaa.gov/OC5/3M_HEAT_CONTENT/

    (If anyone’s interested, a long time ago I worked with the Navy with this data.. with XBTs. I would be amazed to hear information that all allied navies are faking it for greens or some such.)

    Besides what does the navy know about all this gathering and processing of data? Not much I guess. I mean they only gather it…

  38. anoilman says:

    Then there’s all that pesky private data. I mean some scientists are under various IP agreements. Early last year, merely speaking on data was a serious offense for a Canadian scientist. God forbid they should actually share it!

    Any conservative minded people in the audience should really employ a small force like PMO Science Pest Control;

  39. anoilman says:

    Richard Tol: Anders: In industry… everything is hidden and many papers are mere advertising that they can, “do it better.. don’t ask how.” (This applies double for the oil industry. We don’t share. We really don’t.)

    Many scientific endeavors have similar constraints. Such as multinational access to raw tree ring data. One part is who gathered the data and why. (Tree rings are used by the forestry industry, and are a means of evaluating regional productivity. Give that info to your competitors? For science? I think not!)

  40. Brandon Gates says:

    In cases where all data cannot be released due to its proprietary nature, I think a more reasonable … request … is for at least the data and error bounds for all the figures in the published paper to be made freely available as part of the supplemental package. I cannot count the number of times I’ve wanted to combine the results of several studies together and not been readily able to because one or more of them only give me the plots and not the data behind them.

    I however also realize that major journals are not geared toward catering to non-expert citizen blog scientists, or, as I think of myself, a hobbyist with above-average lay interest but non-professional skills, intended use … and very likely many erroneous results. That sense of place is either lost on, ignored by, or complained about by those issuing nuisance requests, whose bad faith is transparently evident to everyone who doesn’t suffer their same delusions.

    But it’s a heck of a ClimateBall move because, after all, people who don’t have anything to hide shouldn’t have anything to fear by letting the whole world in on their data, emails (personal and professional), phone numbers and records, work and home address, political affiliations, religious beliefs (or lack thereof), shopping list, banking records, next of kin, entire sexual history, medical records, birth certificate, passport number, favourite flavour of ice cream and anything else conceivably required to hold them accountable for being the liars and cheats “everyone knows” they already are even without all that information.

  41. I have to say, that I do agree with this comment from MT

    That is not a usual use case, and we shouldn’t base our actions and policies upon it to the exclusion of an appropriately collegial approach to science, which is very much served by due attention to facilitating replication.

    We should, ideally, assume that requests are – by default – reasonable, as most will be. I guess the issue is trying to determine when this is no longer the case and how to deal with situations where the requests are becoming malicious or a form of harassment.

  42. Andrew dodds says:

    aTTP –

    Yes, I have to admit that between commercial software development with multiple competitors working with the same client, and following climate ball over the years, the idea of open, collaborative, no-blame behaviour seems a bit utopian..

    John M –

    Software regression testing is one of those subjects that gets more intractable the more you think about it. After all, once your test suite/client has reached a certain level of complexity, how do you know that it actually works? Nothing is more suspicious than a set of tests that all pass..

  43. Michael Lloyd says:

    “I guess the issue is trying to determine when this is no longer the case and how to deal with situations where the requests are becoming malicious or a form of harassment.”

    And you can hand these cases over to the professional staff, whose job it is to deal with them.

  44. Michael,
    Of course I agree that passing it over to the professional staff is the way to proceed formally. I was thinking more about what you do if there are still continued claims that not everything has been released. I suspect there is really little one can do about that, other than ignore it once it’s been passed to the professionals.

  45. snarkrates says:

    Michael Lloyd, In the real world, if a result is of sufficient interest, there will be no shortage of researchers who want to build upon it and in so doing replicate and confirm the result. I, as a researcher, should be happy to render assistance in these efforts. It is in my interests as a researcher to see my research disseminate widely.
    I am, however, under no obligation to collaborate with cranks, the mentally ill or arseholes. I do not know how you have reached the mistaken impression that research institutions are chock-a-block with professionals whose job it is to interface with the public. Most research institutions are very lean organizations. We even have to empty our own trash cans. A single crank can significantly cut into the time a research group has to do actual research.

    Bottom line: If you would like cooperation or collaboration, don’t be an arsehole.

  46. If you would like cooperation or collaboration, don’t be an arsehole.

    Yes, I agree, but I think there are people at universities who can help when these kind of situations get to the point where the academic may no longer know how best to respond. So, Michael is probably right that you can – and should – hand it over to others when it gets to that stage.

  47. snarkrates says:

    Anders,
    I used to work in physics journalism–at a journal that had “Physics” right there in the title. In fact the institute where the magazine was located and even the street address had “Physics” in it.. Every other week or so, the magazine would receive a visit or a manuscript from someone who claimed to have proven Einstein wrong. Very occasionally, we’d get someone who would claim to have proven the 2nd law of Thermo wrong. When this happened, especially if said individual (always male, by the way) was persistent, I would get the call to come down and talk to them. My sole qualifications for this were a couple of undergrad classes in abnormal psych (I call them the most useful physics classes I ever had) and a somewhat soothing deep baritone voice.

    Michael Lloyd makes it sound as if research organizations “have people for this sort of thing.” In general, they don’t, just as we don’t have people anymore who do our graphics or type our articles or take out our trash. Security often can’t be bothered unless the situation escalates near violence. Legal affairs has their minds set on other issues (mainly intellectual property, as that is where the money is). PR is too busy trying to figure out how to make a headline without utterly distorting the latest research. And the last thing your supervisor wants is to have to talk down a bipolar during his manic phase. Increasingly, we’re on our own. The frauditor cost CRU hundreds of hours of research time dealing with frivolous FOI requests that never amounted to an iota of scientific progress.

    It isn’t and shouldn’t be our job to convince the “slow students” of the correctness of our research.

  48. > It isn’t and shouldn’t be our job to convince the “slow students” of the correctness of our research.

    I guess then we can add this to the list of why it’s so great to be a student:

    Unless by “slow student” you don’t even refer to a student, snarkrates?

    ***

    As Harry says in Kingsman, quoting Ernest, there is nothing noble in being superior to your fellow man; true nobility is being superior to your former self.

  49. Gator says:

    Willard, it is quite obvious who the “slow student” is, IF you actually read the post.

  50. snarkrates says:

    Indeed, I refer to those who refuse to learn because the facts don’t fit their ideology. I’m all for outreach for those who are so inclined. However, the cause of science pedagogy would be furthered if some scientists who have neither inclination nor skill at outreach did not feel the need to do so. Those who are best at furthering research should do research.

  51. > it is quite obvious who the “slow student” is

    It is even more obvious that the word “student” is contained in the expression “slow student,” Gator. It’s also quite obvious that the expression is used as an ad hominem.

    What might be less obvious is that the Einstein anecdote is a trope that creates an equivalence class using the ridiculest denominator. This RHETORICS ™ trick operates the same way as the D word. One particularity is that it begs the question of the auditors’ competence. We could pay due diligence to what appears to be a very misguided misconception. This would imply, however, that we compare it to the ones who insist in insinuating they’re idiots.

    Businessmen may not always be gentlemen, but they can live in a world without the same institutional support as academics, so I’d thread that one lightly if I were you.

    ***

    If you want people to stop acting like arseholes and to be treated by them with respect, it might not be the best idea to insinuate they’re idiots.

  52. Raymond Arritt says:

    Yes, our code and data should be freely available to the extent legally possible. But I’m also reminded of Gerald North’s comment regarding Steve McIntyre:

    “McIntyre to me, I think he is probably a well meaning guy. He’s not dumb, he’s very smart. But he can be very irritating. This guy can just wear you out. He has started it with me but I just don’t bite. But there are some guys, Ben Santer comes to mind, who if they are questioned will take a lot of time to answer. He’s sincere and he just can’t leave these things alone. If you get yourself in a back-and-forth with these guys it can be never ending, and basically they shut you down with requests. They want everything, all your computer programs. Then they send you back a comment saying, ‘I don’t understand this, can you explain it to me.’ It’s never ending. And the first thing you know you’re spending all your time dealing with these guys.”

    The question is where to draw the line between openness and collaboration versus allowing oneself to become subject to a denial of productivity attack.

  53. The question is where to draw the line between openness and collaboration versus allowing oneself to become subject to a denial of productivity attack.

    Indeed, but there’s also the issue I mentioned here. If I do a lot of work helping someone with some research they’re doing, then I end up being an author. So, even if the person doing the checking has the best of intentions, they’re surely better off trying to remain as independent of the original researchers as is possible, partly because they may then end up in the awkward position of having to ask if they’d like to be authors on a paper that claims that their original research is junk.

  54. Michael Lloyd says:

    @snarkrates

    In the UK Freedom of Information laws apply to public authorities, which include, inter alia, Universities, Research Councils, Government research establishments, NHS Trusts, Central and Local Government and Police Services.

    Part of my job at the university was to deal with Data Protection law, Freedom of Information law, Environmental Information Regulations and information security. There were counterparts to me and my staff in most if not all other public authorities.

    You can see the structure for Edinburgh University here,

    http://www.ed.ac.uk/records-management/freedom-of-information/our-approach/our-srtucture

    Data Protection law and the Environmental Information Regulations are derived from EU Directive and are implemented across the EU. I would expect similar roles to mine dealing with this legislation exist across all 28 member countries of the EU.

    The default position is that there is a strong likelihood that professional staff are available.

    Oh! And I live in the real world too and some of it has been pretty murky.

  55. snarkrates says:

    Michael Lloyd,
    I live in the US and work for the ebil gummint. Given that we are expected to empty our own trash cans, how likely do you think it is that we will have support staff to handle FOI requests?

    And it is beyond question that the [Mod: Auditor’s] requests and those of his polyps cost hundreds of hours at CRU that could have been devoted to research. Not one single publication has resulted. Not one iota of understanding of Earth’s climate has been gained from these efforts. The time of skilled researchers is a finite and critical resource when it comes to understanding how deep we are in the swamp. Do you seriously propose that we should squander that resource on nuisance FOI requests?
    I am all for making raw data and a thorough description of methods available. There are very good reasons NOT to make intellectual property available–and this includes code and processed data. These reasons go beyond financial considerations. There is also the possibility that the processed data and code could contain errors that the [Mod: Auditor] and others don’t understand and that could then creep into the work of others. There is also the possibility that in replicating the results, other researchers will improve upon the original.

    Science works. Let it.

  56. Michael Lloyd says:

    snarkrates,

    A couple of points.

    One, I am not responsible for passing these laws. If you are not happy complain to your senator or congressman.

    Two, if CRU had passed those requests on to UEA’s professional staff a lot sooner, they would have wasted far less of their valuable research time.

  57. RT: Papers are rarely sufficiently detailed to recreate the code. Releasing code is therefore a key part of replicability.

    ATTP: Then they’re bad papers.

    By this measure, most economics papers are bad:
    http://www.federalreserve.gov/econresdata/feds/2015/files/2015083pap.pdf

    Code that takes you all the way from raw data to results should be routinely available as an appendix to a published paper. One can’t make such a simple statement for data because the provenance is more varied, but proceed in the same spirit.

  58. Frederick,
    Maybe that’s one of the key issues then. In my experience in physics/astrophysics, it is typically possible to replicate someone else’s code by working from their paper. If that’s not true in all disciplines, then maybe those for which it isn’t true should indeed be making other things available so that replication is possible. Also, maybe people in different disciplines should be careful of telling people in other disciplines how to behave, or judging them on the standards that apply to their discipline but not to all.

    Another factor, possibly, is that typically in physics/astrophysics the actual equations that underpin a code/model are well established. Different codes then use different methods to solve the same equations. So, if you make a code available, you’re typically not making available a code that solves a different set of equations to other codes; you’re simply making available one way of solving those equations. If you don’t make it available, that would typically not mean that others don’t know what equations you were solving.

  59. Code that takes you all the way from raw data to results should be routinely available as an appendix to a published paper. One can’t make such a simple statement for data because the provenance is more varied, but proceed in the same spirit.

    Maybe we’re also thinking of different types of codes (in some cases, at least). I’m often referring to big simulations that generate output (that maybe we’d call data). In some cases, I think people are referring to the code that one uses to analyse some data.

    Certainly in my experience, big simulation codes are often available. Various routines for analysing data from telescopes are often available. People do, often, have to write extra bits of code to do the extra things that they want to do. This may not always be available, but is typically sufficiently explained (or standard) so that others can replicate without having the exact piece of code that the other researchers used. My own experience is that you would typically simply use your own, or write your own, bits of code. In my view, this is quite an important part of understanding what was actually done and what it is you’re doing; essentially, people are discouraged from using codes as if they’re a black box whose inner workings you don’t really need to understand. It might be different in other disciplines, though.

  60. dikranmarsupial says:

    Even if papers don’t provide code, they should at least specify the particular algorithm used for, e.g., piecewise linear regression or computing the confidence intervals. However, one of the difficulties is that we may not be aware of how much information would be required by others to re-implement the method and/or that there may be more than one way of doing something (and that the “standard” method might not be the same in all fields). We should be (a) understanding when papers don’t provide exactly what we want, (b) happy to answer questions to fill in the gaps in the information and (c) careful not to assume that every apparent gap in provision or inconsistency is evidence of malpractice.
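    A minimal sketch of the “more than one way of doing something” point, using made-up numbers: two common recipes for a 95% confidence interval on a mean give slightly different answers, which is exactly why a paper should say which one was used:

        import math
        import statistics

        data = [9.8, 10.1, 10.4, 9.7, 10.0, 10.3, 9.9, 10.2]  # illustrative data
        mean = statistics.mean(data)
        sem = statistics.stdev(data) / math.sqrt(len(data))

        # Recipe 1: normal approximation, +/- 1.96 standard errors.
        normal_ci = (mean - 1.96 * sem, mean + 1.96 * sem)

        # Recipe 2: Student's t with n-1 = 7 degrees of freedom (critical value ~2.365).
        t_ci = (mean - 2.365 * sem, mean + 2.365 * sem)

        print("normal-approximation CI:", normal_ci)
        print("t-distribution CI:      ", t_ci)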

  61. ATTP: maybe people in different disciplines should be careful of telling people in other disciplines how to behave, or judging them on the standards that apply to their discipline but not to all.

    Maybe. The starting point of this discussion (Lewandowsky & Bishop’s intervention) does present the problem as a general one across disciplines. And I think that many scientific disciplines do have an openness problem in common, whether because people want to maintain their advantage, or because they want to hide the processes of model and data selection that led to achieving statistical significance (the problem of an implausible proportion of reported results coming just below the 5% significance threshold has been well documented in biology, economics and psychology, at least), or because their commercial sponsors, their own commercial interests or their deep-seated beliefs need a certain research result. So it makes some sense to approach it as a common problem across disciplines. On the other hand, Lewandowsky & Bishop are using the harassment that has visited scientists in a few fields, such as climate, as a general excuse for refusing to share with people who are regarded as hostile. Lewandowsky has made clear, in subsequent interventions (e.g., http://www.shapingtomorrowsworld.org/lewandowskyNatureOD.html) that he thinks people requesting data should be screened on the basis both of motive and of analytical competency. He clearly fears that if people who are wrongheaded or incompetent get the data, they will do something BAD with it. Two points about that: first, what he’s advocating is not open data, not at all, despite his continued declarations on that point. Second, people can, and do, make bad arguments with all sorts of data that are already publicly available. Is the spectre of more bad arguments reason to keep data hidden? I loved his “conspiratorial ideation” work on climate change deniers, but here he’s barking up the wrong tree.

    I’ll have to think about your comment a bit more, because you make some good points. A quick comment now, though, is that – IMO – it’s about replication. If the data is completely inaccessible then it should obviously be made available. If, however, it is possible to reproduce the data (or equivalent data), then it’s a more complex issue. Then, if someone thinks it’s very important to do that work again, you don’t need the original data; you just need to be able to produce a data set from which you can do the same analysis. To be clear, I’m not suggesting that this means that the original data should not be provided, but simply that serious researchers could do the replication even if the original researchers are reluctant to make their data available.

  63. izen says:

    Hopelessly idealistic to think that data, processing code and all results can ever be freely transparent. Science might benefit from such a Utopian environment, but it operates in the real market.
    Any data, code or results that have a cost or benefit in economic terms, or even the potential to enable profit or cause loss, will not be free. I find Victor’s idea that governments might agree to make climate and hydrological data freely available when a profit could be made, sadly naive.

    As the oilman has indicated, industry shares very little. Willard highlighted the situation in medical research, where commercial sensitivity and proprietary information mean that open sharing is rare, and distorted presentation of results occurs much more commonly than the distortions that occasionally emerge in climate science.

    For other researchers to investigate the result found by one group does require some minimum level of disclosure of material, methods and results. But the enterprise that the McAuditor and others of that ilk are engaged in is significantly different from the attempt to expand knowledge that motivates most research. Like the anti-vaxxer crowd, the focus is not on the findings of the research, but on the methodology of the paper. There is no apparent interest in discovering a more accurate understanding of the issue, only in discrediting the specifics of a targeted piece of research.

    It is instructive to observe that bad climate papers (Soon, Idso, Monckton) were discredited without every jot and tittle of the data and code used being available, as were the errors in the UAH results in past years. Basic principles are often sufficient to identify bad science, not every email sent by the participants during the conception, gestation and maturation of the work.

  64. Frederick,
    I’ve just read Stephan Lewandowsky’s STW post that you highlight. I agree that he is clearly not saying “be open and make everything available”. On the other hand, he’s also not saying that we shouldn’t. He seems to be suggesting that we should be discussing the complexities of this situation. I largely agree, with the proviso that my default position would be that codes and data should be available. My additional proviso is that this may not mean quite the same thing in all disciplines (as I mentioned in my earlier response to you).

  65. > [U]sing the harassment that has visited scientists in a few fields, such as climate, as a general excuse for refusing to share with people who are regarded as hostile.

    I’d rather say an argument. Saying it’s an excuse may presume it’s not a good argument. I don’t see any argument to that effect, which means the word “excuse” replaces an argument.

    ***

    > Lewandowsky has made clear, in subsequent interventions […] that he thinks […]

    If it’s not in this current intervention, it may not pertain to the current argument.

    ***

    Look. There’s a real possibility of scientists being the target of obnoxious behaviour. As a scientist once said when refuting RT’s analysis, search for “tol ackerman.”

    So to me the question is not “should scientists yadiyada” – it’s what can the institutions do to prevent that kind of harassment and make science more public.

    Nobody can afford to be against the word “open.”

  66. JCH says:

    In the US the FOIA was passed in 1967. Many skeptics seem to be saying science was purer before then. To replicate a 97%, all they had to do was get on the internet and… well, all they had to do was to actually replicate. I guess the first step was to decide whether or not it was worth their valuable time.

  67. @ willard
    1. No, it’s an excuse, not an argument, because it’s applied to cases where people simply doubt the results or disagree with the premises, and ask to see the data. Part of the context is a case, mentioned by Lewandowsky and Bishop, involving publicly funded research on Chronic Fatigue Syndrome that was published in PLOS1 (a journal which requires an agreement to share data), but the investigators turned simple data requests into FOI matters which were then denied as vexatious. So, I say, excuse.
    2. It’s a blog post by Lewandowsky, following up on the Nature piece a couple of days later, and elaborating his position there.
    3. Certainly, harassment can be real and can be a problem. But sharing data and statistical code is not the same as having your emails hacked or being slandered. There are a lot of people who hide the details of their analysis simply because they’re afraid that their work won’t survive in the light of day (sometimes also because they’re lazy, or selfish, or their code is not very well organized). A lot of garbage ‘science’ is published and cited as a result. It creates a terrible incentive system for young researchers in many fields. I don’t like seeing the travails of Phil Jones and Michael Mann used as a shield for this sort of thing.

  68. Frederick,
    I completely agree that hiding details because they won’t stand scrutiny is utterly unacceptable, and that there is a lot that is published (in all fields) that is not very good and that gains citations in a way that doesn’t reflect the quality of the work. On the other hand, I’m struggling to think of a scenario in my field where a strong claim could not be tested/replicated without gaining data/codes from those who make the claim. I may be missing something, but in almost all cases I think it would be perfectly possible to simply redo the study using completely independent material and check if it stands up to scrutiny. Of course, if the raw data came from a space telescope, or if the results were based on super-computer simulations taking many months of computer time, you may need that, but that would probably be all.

    So, maybe that biases my view a bit, in that I don’t think that whether or not you have access to all of the material from another study makes any difference if you really think that it is something worth checking. Of course, if your result is inconsistent with what the others claim you may have a lengthy battle on your hands, but if that were the case I’m not sure how having access to all their material would make much difference.

  69. dikranmarsupial says:

    I don’t think publication of code and data will have much effect on the amount of garbage science that gets published – that is a separate problem (publish/perish, not enough reliable reviewers for the number of journals etc.). Not many of us have time to spend on post-publication reviewing, and it only tends to happen on papers that may have “impact”, most just gets ignored and not cited. BTW in my experience, replicating work and showing it is wrong doesn’t have that great an effect on how often it gets cited either!

  70. ATTP: I can see that most of what I’m describing would likely not be a problem in astrophysics.

  71. John Mashey says:

    So, we need to codify rules that encourage research to be reasonably open, but shut off the nonsense, and make efficient use of tax dollars. Talking about differences in fields may illuminate that, but also, fields are different. For example, studies with human subjects are not the same as astrophysics or economics.

    In the US, David Schnare, a lawyer paid through murky funding, files endless FOIAs against Mann in VA, Andrew Dessler (Texas A&M), Katharine Hayhoe (Texas Tech), Jon Overpeck and Malcolm Hughes (U of Arizona), among others … including tons of emails. Does anyone think he wants to replicate results, which in some cases were for decade-old work where better methods and more data have since evolved?

    Since the researchers are generally supported by public funding, that means they are forced to waste tax money, which means they do less real research. Let us recall that at least one Canadian citizen has imposed serious costs on both US and UK taxpayers, for *zero* improvement in science, and huge wastes of time.

    There is always a tradeoff between doing research/publishing results … and the amount of effort spent making software actually useful for someone else.
    Back at Bell Labs in the 1970s, there was a huge range of software engineering effort, from near-zero (individual researchers) through multi-hundred person teams doing software that would evolve over decades. We wanted science researchers to do research, not any more software engineering than needed. Of course, it helped that the internal peer review system was more ferocious than external.

  72. John Mashey: For the most part, I think we agree. An increasing number of journals require data & code as conditions of publication. Compliance is a problem, partly because the expectation isn’t yet well established. But in principle, this is what I’m talking about, not FOIs or endless emails.
    As for the effort required to make code usable by somebody else: perhaps I’m just dim, but I find that when I clean it up and comment it properly, it’s a lot more useful not only to other users but to me when I want to use part of it again. So this is not a great cost. At the other end, I’ve been in situations where one co-author couldn’t tell what the other was doing because of messy code. I’m not talking about software engineering, just high level code for statistical applications.
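
    (A minimal sketch, in Python, of the kind of short, cleaned-up, commented "high level code for statistical applications" being described; the variables and numbers are invented for illustration and are not from anyone's actual analysis.)

    import numpy as np

    rng = np.random.default_rng(42)
    x = rng.normal(size=200)                        # predictor
    y = 2.0 * x + rng.normal(scale=0.5, size=200)   # response, with noise

    # ordinary least squares with an explicit intercept column
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(f"intercept = {beta[0]:.3f}, slope = {beta[1]:.3f}")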

  73. I can see that most of what I’m describing would likely not be a problem in astrophysics.

    It’s not that there aren’t potentially problems in astrophysics, but I get the feeling that they’re different to what might be the case in other disciplines.

  74. Steven Mosher says:

    “Likewise for code this is often seen as intellectual property and I don’t believe it needs to be made freely available. If the methodology is explained in the paper then, with the right data set, it should be reproducible.”

    I seriously can't believe we are still discussing this.

    1. It doesn't matter if the code is seen as IP. The questions are these:

    A) is it in fact IP
    B) whose IP is it?
    C) who has rights to this IP.
    D) am I rationally obligated to accept the results of computations I cannot check
    E) can we make public policy based on IP that folks cannot inspect.
    F) how can you release your IP and still protect it
    G) if the method (the code) is described with sufficient accuracy in the paper, then you have already given up your IP. If it is not explained with this kind of rigor, then you really haven't done science.

    2. What data is the right data?
    It's best to do this with an example. I went through a paper of Ross McKitrick's. He cited URLs for the data. Cool. Problem? Dead links. Solution? Ross shares his data AS USED. Nicely, it allowed me to find errors, which I sent to him.
    There are of course personal data exemptions, but we know how "some" people abused even that.

    3. Method explanations: Don't make me bring up the Climategate example where Jones admitted there were steps in the actual processing that were not revealed in the paper, and admitted that this was the reason McIntyre could not reproduce the research. The problem is that no written explanation comes close to being the best explanation of the method – the code itself is. Further, the argument typically goes like this:

    R: I described it in the paper; anybody can reproduce it.
    A: Actually, points X, Y and Z are unclear.
    R: Read harder. I should not have to explain.
    A: On point X, am I to understand it this way?
    R: Sorry, my code is IP.
    A: Wait, you said it was easy to reproduce; why do you want to protect that?
    R: You'll just have to figure it out; the words speak for themselves. You must be incompetent.

    Actually, the skeptic Scafetta tried to play this game with Gavin Schmidt, refusing to give code.

    She gets it

    https://web.stanford.edu/~vcs/talks/VictoriaStoddenUCLAstats2011.pdf

    A simple solution is for universities and other institutions to have document custodians.
    You turn your stuff over to them and they handle inquiries and requests

  75. snarkrates says:

    Replication does not involve “recreating the code”. You don’t want to recreate the code. You want to write code that does what the methodology section of the paper says. That’s it. A frigging monkey can press a button and run a computer program. A scientist needs to have sufficient skill and understanding to replicate the work from the raw materials.

    I am all for submitting the code with the publication. That is sensible. It freezes the code at the point it was at when the data in the publication were developed. The data, too, should be archived. That doesn't mean, though, that the code needs to be made public.

    Replication needs to be independent to avoid common errors in the analyses.

  76. snarkrates says:

    Mosher, you asked nicely. The data were provided. Where's the problem?

  77. Steven Mosher says:

    “And it is beyond question that the [Mod: Auditor’s] requests and those of his polyps cost hundreds of hours at CRU that could have been devoted to research. Not one single publication has resulted. Not one iota of understanding of Earth’s climate has been gained from these efforts. The time of skilled researchers is a finite and critical resource when it comes to understanding how deep we are in the swamp. Do you seriously propose that we should squander that resource on nuisance FOI requests?”

    #################

    Stupid.

    CRU complied with the FOIA requests in nearly EVERY case. And the amount of time taken was BELOW the 18 hours or so prescribed as the maximum time they can spend.

  78. Steven Mosher says:

    “Mosher, you asked nicely. The data were provided. Where’s the problem.”

    you do not want me to go over the history again.

    Do you know what was requested? It wasn't even data.

  79. Steven Mosher says:

    “Replication does not involve “recreating the code”.

    Let's start with terminology.

    1. REPRODUCING
    2. REPLICATION

    Both are important.
    Each serves a different purpose

  80. The Very Reverend Jebediah Hypotenuse says:


    Don't make me bring up the Climategate example where Jones admitted there were steps in the actual processing that were not revealed in the paper, and admitted that this was the reason McIntyre could not reproduce the research.

    Too late.

    McIntyre could not reproduce the research? What a strangely singular metric.


    A simple solution is for universities and other institutions to have document custodians.

    I believe these are commonly referred to as “librarians”.

  81. Steven,
    I agree that they're different and that both can be important. My personal view is that our confidence in a scientific result increases the more the general result is replicated (independently), rather than simply reproduced using someone else's code/data. That's not to say that the latter doesn't have value, but it seems that much of this topic is based around reproducing, rather than replicating.

    I also get the impression that many don’t seem to want to recognise that, in many cases, one could simply generate the data oneself. If someone does some kind of analysis of public blog comments, then you don’t need their actual data to do a similar study, you simply need to know how they selected the blog comments to analyse.

  82. anoilman says:

    All this essentially presupposes that there is some sort of conspiracy. But because we have multiple lines of evidence and data sets, this is a very silly argument. That's probably why this argument shows up with the flavor du jour: "I can't access data set X… Cry for the babies!"

    It also presupposes that no one qualified is looking at the data. I'm under the impression that open means accessible if you meet the requirements. Complete ocean data requires secret clearance. All allied navies have this, and have scientists on staff looking at it.

    I recall that with Briffa's tree rings of hockey stick fame, the data was not publicly available. However, multiple dendropaleoclimatologists examined the squiffy tree rings (the ones removed in his paper). In this case they couldn't share the data, but they could tell you where to get it, and you could jump through hoops to look at it. So many qualified specialists got access, not newb bloggers and wannabes.

    In the end Briffa published a paper with all the tree rings in it, just to shut up the opposition. Have all the dumpy book writers apologized yet? I doubt it. It's not the complaint du jour, I guess.

  83. John Mashey says:

    Frederick Guy:
    Let me try again.
    1) At one extreme, a person writes a program for one paper and never uses it again.
    2) At the other extreme, people are writing code that will be used over decades, with many people involved in adding and changing it.

    Anything beyond 1) involves software engineering…
    Of course documentation is good. So is reusing existing code. So is modular structuring to make it easier to reuse code. So is writing in the highest available/practical language.
    Of course, sometimes one has to have written several specific examples before one can figure out the general code that covers them, in which case over-engineering/documenting the earlier ones may have been wasted effort. Often it is a good idea to throw away the earlier ones… and some of us had regular processes for that.

    Really good practice would include regular code reviews, and if one really expects code to be runnable by others, it will have Makefiles or equivalents, test inputs and expected results and perhaps some attention to portability issues.
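
    (A minimal sketch, in Python, of the "test inputs and expected results" idea; the numbers and the analysis function are invented for illustration, not taken from anyone's actual code.)

    import math

    def analysis(data):
        # stand-in for the real computation being published
        n = len(data)
        mean = sum(data) / n
        var = sum((x - mean) ** 2 for x in data) / n
        return {"mean": mean, "std": math.sqrt(var)}

    test_input = [0.1, 0.4, 0.35, 0.8]           # frozen test input shipped with the code
    expected = {"mean": 0.4125, "std": 0.2509}   # frozen expected results

    result = analysis(test_input)
    for key, ref in expected.items():
        # compare within a tolerance, since bit-for-bit identity is platform-dependent
        assert abs(result[key] - ref) < 1e-3, f"{key} drifted: {result[key]} vs {ref}"
    print("regression test passed")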

    Of course, “here’s the code, as is” can still be useful, as it was for Deep Climate’s discovery of the 100:1 cherrypick in McIntyre’s code.

    But still, the goal ought to be: gather examples into classes of good, marginal and counterproductive behavior and then codify rules to handle them properly.

  84. Hyperactive Hydrologist says:

    The CRU results have been replicated by NASA and NOAA and, more recently, by BEST, in addition to independent studies looking at the cryosphere, sea level rise, growing seasons, etc., which confirm that the globe was warming. So I don't really see what the fuss is about.

    I agree that for certain types of studies code should be released, particularly if complex models are built, for example a physical hydrological model. However, if the paper is doing statistical tests on data, then as long as the methods are described the code for doing the tests does not necessarily need to be made available. My wife has a colleague who is a statistician (and engineer); apparently, when he reviews a paper he will normally write his own code to test the results of the paper, which I guess in an ideal world is how all papers should be reviewed.

    You also have to remember that most scientists may never have had formal training in computer programming and most probably learn it during their PhDs. Therefore, their code may not make much sense to someone else.

  85. snarkrates says:

    Mosher, can you point to a single result that has been significantly changed as a result of all the “auditing”?

  86. John Mashey says:

    ” most scientists may never have had formal training in computer programming ”
    This is rather age- and field-dependent. Many younger scientists would have taken at least one formal programming course.
    However, programming, computer science and software engineering are not the same, even if they overlap, and it is far less likely that scientists would have taken formal software engineering classes, although some scientists actually get fairly good at it with experience.
    Long ago (1970s), when I was teaching CMPSC, I used to jam software engineering issues into OS and language courses as hints for the real world.

    I would do things like give them specs for a project, then warn them I’d make some changes a few days before it was due. An extended version was to have 3-person teams write some code together, then pass it to a different team, who had to modify it to do something slightly different, and rate how hard it was.

    Of course, humans learn well by example. In the early days of UNIX, all one got was:
    a) A paper on UNIX
    b) A 20-page C language description
    c) Man pages
    d) All the source code … probably the most valuable, since it was very good.

    > No, it's an excuse, not an argument, because it's applied to cases where people simply doubt the results or disagree with the premises, and ask to see the data.

    The argument applies to all the cases: it’s a counterfactual.

    What if I told you the current setting has been shown to lead to harassment?

    ***

    > I don’t like seeing the travails of Phil Jones and Michael Mann used as a shield for this sort of thing.

    I understand the sentiment. However, your dislike of it looks like an excuse for refusing to consider the authors' argument at face value. The argument only leads to what you already concede: "Certainly, harassment can be real and can be a problem." We're all into violent agreement on that point, just like we're all into violent agreement regarding making science public.

    The thorny detail is how. We need more specific guidelines as to which products should come with published research. Until then, we'll get the same rounds of trolling in the dark.

    To paraphrase the Auditor, I’d turn this to lawyers.

    ***

    > There are a lot of people who hide the details of their analysis simply because they’re afraid that their work won’t survive in the light of day (sometimes also because they’re lazy, or selfish, or their code is not very well organized).

    The inverse is also true. There are people whose research products are all in the open, and nobody cares to look until it's years too late. I have the Rogoff episode in mind. How come the economics community, your community, Frederick, failed for so long to find the Gremlins in an Excel spreadsheet from a top economist? You can tell me in confidence: I won't dismiss it as an excuse.

    From greater auditing powers comes greater responsibility.

    Speaking of which:

  88. > [C]an you point to a single result that has been significantly changed as a result of all the “auditing”?

    This begs for the thread to become a hockey stick food fight, snarkrates.

    Try Tiljander:

    http://amac1.blogspot.com/

    I predict a round of what “single result” means.

    Maybe it’s a vocabulary thing.

  89. snarkrates says:

    Willard, I was thinking of a peer-reviewed journal article that didn’t contain a 100:1 cherrypick, for instance. Or the retraction of an important peer-reviewed journal article that was cited by the IPCC?

    The inverse is also true. There are people whose research products are all in the open, and nobody cares to look until it's years too late.

    I think there's also another inverse to this. There are some who appear to suggest that, because they've made all their research products public, any criticism of their work requires that you find a mistake in one of those research products.

  91. > I was thinking of a peer-reviewed journal article that didn’t contain a 100:1 cherrypick […]

    That cherrypick mostly affects the suggested presentation, snarkrates. Here:

    How much of the HS shape of the PC’s that they showed was due to the MBH selection process (and there is some), and how much to the artificial selection from the top 1% of sorted HS shapes? To this end, I tried running the same algorithm with the same red noise, but using correct centering.

    It’s a fairly long post, but you can peek at the conclusion.

    http://moyhu.blogspot.com/2011/06/effect-of-selection-in-wegman-report.html

    Moreover, this thought does not indicate anything about any "single result," and is unresponsive both to the case I submitted to you (Tiljander) and to the request I made. To repeat, my request is that this thread does not turn into a hockey stick food fight.

    Besides, your righteousness is misguided, for even if we assume audits never changed a single result, that doesn't mean the response to them was optimal, or that all the principles on which they rest should be invalidated. I have in mind the principles of making science public and of releasing due-diligence products.

    The main question of this thread is to decide which products to make public, and the authors' point is to remind us that institutions should work to ensure that scientists can concentrate on improving scientific crap.

    All science is wrong, but some of it is still useful.

  92. snarkrates says:

    Willard,
    I’m well aware that all science is wrong. Indeed, that is its chief selling point. It will be improved upon by subsequent researchers. And it will be published in the peer-reviewed literature. That is precisely why it is useful.

    The AUDITOR thinks he's found a solution, but it is a solution to a problem that doesn't exist. Climate science is working just fine. It doesn't need to be audited – particularly by someone who can't seem to publish his findings. Was the response of the community to the gadfly optimal? Of course not. However, that is the great thing about science – it also works when people don't behave as angels.

    Yes, all science is wrong, but some of it is still useful.

    However, all audits are also wrong, and I’ve yet to see one that proved useful.
    The whole thing reminds me of the story about Roy Schwitters when he was lab director for the Superconducting Supercollider project. Schwitters was continually plagued by “auditors” from the offices of Congress critters. He referred to them as “the revenge of the C students.”

  93. Nice comment, Snarkrates.

    Here would be an audit that proved useful:

    In one of life’s little ironies, last Friday’s disappointing G.D.P. figures, which reflected a sharp fall in government spending, appeared on the same day that the economists Carmen Reinhart and Kenneth Rogoff published an Op-Ed in the Times defending their famous (now infamous) research that conservative politicians around the world had seized upon to justify penny-pinching policies. Addressing a new paper by three lesser lights of their profession from the University of Massachusetts, Amherst, which uncovered data omissions, questionable methods of weighting, and elementary coding errors in Reinhart and Rogoff’s original work, and which went around the world like a viral video, the Harvard duo dismissed the entire brouhaha as “academic kerfuffle” that hadn’t vitiated their main points.

    Really? Even somebody living in a bubble stretching over Harvard Yard would have difficulty believing that. For all of the illuminating work Reinhart and Rogoff have done on the history of financial crises and their aftermaths, including their popular 2011 book “This Time Is Different: Eight Centuries of Financial Folly,” their most influential claim was that rising levels of government debt are associated with much weaker rates of economic growth, indeed negative ones. In undermining this claim, the attack from Amherst has done enormous damage to Reinhart and Rogoff’s credibility, and to the intellectual underpinnings of the austerity policies with which they are associated. In addition, it has created another huge embarrassment for an economics profession that was still suffering from the fallout of the financial crisis and the laissez-faire policies that preceded it. After this new fiasco, how seriously should we take any economist’s policy prescriptions, especially ones that are seized upon by politicians with agendas of their own?

    http://www.newyorker.com/news/john-cassidy/the-reinhart-and-rogoff-controversy-a-summing-up

    These Gremlins are rather important, don’t you think?

    A comment from Frederick on this might be nice too.

    ***

    There are many cases like that, in just about all the scientific fields.

  94. angech says:

    Research Integrity.
    Stephan Lewandowsky and Dorothy Bishop have published a comment in Nature about Research Integrity, arguing that we shouldn’t let transparency damage science.
    It’s a complex issue?
    Defending this comment and commentator is a complex ethical issue for some.
    Good to see some people trying to “defend the indefensible”.
    It is walking on eggshells though.
    Arguments sidetrack to
    1. some people abusing freedom of information legislation
    This ignores the fact that the legislation is there in the first place. Fix the legislation.
    Most of the time all of us agree it is a good thing exactly because transparency is seen as an extremely important principle.
    2. Scientists are subject to threats and abuse.
    Not really relevant to the question of transparency of scientific data.
    Negative points for using this one.
    3. Code/Data can be intellectual property.
    One can keep all the code and data one wants secret.
    By a Government act.
    By not engaging with people.
    Once you engage and communicate, people are entitled to ask for your data and code where relevant. Volkswagen, Vioxx, Thalidomide, tree rings.
    You can refuse to give it.
    You can be annoyed at cranks asking things like "what are the risks of this operation" or "what are the side effects of this drug".
    After all, you "know" how the code works and they don't.
    4. Stephan is a good guy, he makes McIntyre look like a bad guy, let's try to support his comment.
    Is this called cognitive dissonance?

    Nature Magazine made a decision.
    ” a number of the usual suspects were most put out when some of their comments were later deleted”.
    So they voted against transparency.
    It is nice to note that almost all commentators here acknowledge that transparency is vital for science and ethics to progress.
    Tol, Mosher, Snarkrates, ATTP et al., well done.

  95. ” a number of the usual suspects were most put out when some of their comments were later deleted”.
    So they voted against transparency.

    Or they decided that it might be better to delete comments that were potentially libelous? Plus, I was really just having a dig at those who were whining. Just seems a pretty standard tactic. Write something stupid. Get it deleted. Whine.

  96. Marco says:

    I’ll add my 2 cents worth here on the issue of ‘data’ sharing, an issue that I know can be quite tricky in my field and associated fields. People will know it is not a black-and-white picture, as much as we often like it to be. Anyway, here goes, using a few examples.

    1. Some data on human subjects is sensitive in many ways. Laws and regulations can differ by country, but in general the data holder cannot just release all the data he holds, unless the data cannot be used to do analyses that were not covered by the informed consent forms signed by the subjects. I personally know of a case where a research group wanted to get two datasets from several companies that were published, just not linked together (that is, dataset 1 and dataset 2 with identification of the individual subjects in datasets 1 and 2). Linking the datasets would have allowed the research group to identify whether certain genetic markers (dataset 1) were predictive of a certain outcome (dataset 2). Access denied, because that type of analysis was not approved in the informed consent form.
    Note that such situations occur quite often: in various clinical studies, data is obtained and analysed where the original data holder may in principle be capable of linking information, but is not allowed to. It then gets pretty tricky if someone else demands the data, and you need to find a way to give all the data while preventing the other party from linking information (subject 1, age, height, weight, tests 1/2/3/4, etc.).

    A real-life example of the issues that may arise when laws and regulations clash was the Gothenburg Study, where researchers were forced to share data, despite the subjects not allowing them to do so – international regulations said they could not share the data either, but Swedish law said they should (see https://en.wikipedia.org/wiki/The_Gothenburg_Study_of_Children_with_DAMP)

    2. If you work with industry, as I sometimes do, it is possible they provide you with data/information, which are governed by various agreements. What do we do if others demand that data? If I handed it over, I would violate the agreement and thus possibly get sued, but not giving the data may mean that I get accused of scientific misconduct. One option is then not to use that data… which does not help science either (and can also get me accused of scientific misconduct). This is not unlike what CRU had to deal with when data was demanded, and they in fact had to release data despite the original data owner not allowing them to do so (Trinidad & Tobago – Poland also refused but was not covered by the FOIA and therefore its data was not released). I am certain that the companies *I* work with would have sued me, and would also have won the case.

    3. A cdesign proponentsist (sorry, can’t resist), Schlafly, demanded loads of data and even bacteria from an evolutionary biologist (Lenski). Most journals now demand scientists make certain materials freely available, so in principle Lenski should have given the material. But these are bacteria: the receiver must handle those properly. Whose task is it to police this? Lenski refused to send Schlafly the bacteria, in part because Schlafly could not show he had the appropriate skills to handle such material. Is this refusal of ‘data’ sharing good or bad? And Lenski had different strains of bacteria at different stages of evolutionary development. Should he share them all with competent scientists?

    4. Not so long ago, a Dutch virologist wanted to publish a paper about the H5N1 virus and aerosolic delivery. The Dutch government required him to get an export license to publish his papers (that is, just the information), because it was a potential biological weapon. Even though that was ultimately granted, I doubt he would be allowed to send out any *materials*, even if the journal demanded it, or at best to a very selected few.

  97. Marco,
    Thanks. I think there are similar examples we could all provide.

    People will know it is not a black-and-white picture, as much as we often like it to be.

    My impression is that some people don’t know this, or don’t want to accept this. The goal of open data is good. Actually achieving it is not so simple, and it may well be that there are different views as to what this actually means. My own view is that the basic position is that others should be able to redo the analysis in some way so as to establish whether or not the result is reasonable (I’m trying to avoid using “right”, “wrong”, “correct”, etc). This doesn’t necessarily require having all of the other party’s data/codes. This isn’t, however, an argument against people making more than is necessary available, but is an argument against it being some kind of absolute requirement.

  98. Roger Jones says:

    This whole discussion about IP and code completely fails to confront the disjunct between commercial IP and its protection under proposals like the Trans-Pacific Partnership, and what is produced in academia.

    What is the point in spending years developing a theory and proposition only to let it all go straight away to whoever can exploit it for their gain, when commercial knowledge has all the protection you like? Both models also treat the originators like crap, whereas the moral rights of the creator outside science are something quite different. The Mona Lisa was created by one artist and everyone knows it. The gun for hire (the commercial scientist) and the public scientist both lose out.

    As a scientific reviewer, I don’t need code. I do need transparency about method and an idea of whether it is reproducible by other researchers, otherwise it cannot be assessed scientifically. I desperately need theory, which is absent from many papers these days – assumptions seem to be fine.

    Audits are crap and the whole science model is moving towards minute steps forward of no consequence. Discovery will not be tolerated.

    Btw, it is still possible to curb big pharma and other sharks within this broad morass, but the issue needs to be tackled in toto, not in bits.

  99. As a scientific reviewer, I don’t need code. I do need transparency about method and an idea of whether it is reproducible by other researchers, otherwise it cannot be assessed scientifically. I desperately need theory, which is absent from many papers these days – assumptions seem to be fine.

    Yes, I agree, and that's not unlike what I was trying to get at here. What's needed is enough information to actually go about checking their results (method, assumptions, initial conditions, …). I don't necessarily need their actual code or data.

    I made a similar point on Climate Etc. It didn’t really go well. I wasn’t expecting it to 🙂

  100. snarkrates says:

    Willard,
    I do not think you can equate the situation in economics to that in climate science. Even economists acknowledge that their discipline is just barely a science sometimes (see the article in this weekend's The Economist). Sharing of data has traditionally been a matter of professional courtesy – frankly, it's in the interests of the original author to do so. It is not, however, reasonable to expect, for example, the LIGO collaboration to share its results with every imbecile who wants to disprove relativity because it throws a damper on their dreams of faster-than-light travel.

    I still draw the line at sharing code. Code should be mercilessly “audited” internal to a research group prior to publication–after all, that is who has the greatest interest in the correctness of the publication. I’m all for freezing the code and archiving it, and even doing so with the journal. Sharing code outside–unless the code has been thoroughly vetted and standardized–is a recipe for error propagation. Replication should be independent to the greatest extent possible.

  101. tlsmith says:

    In Mathematics research, a critical part of the process is exactly for lots of people to “simply check what others have done so as to try and find mistakes”.
    It can take many months of painstaking work. Looking for errors is hard work and not very glamorous at all.

    It is a useful and necessary process in pure mathematics, for example, but I would imagine that in statistical analysis of climate data it is also necessary. If there is an important mistake then the result is likely to be wrong, or at least uncertain, and more research needs to be done. Using the same method, containing the error, merely reproduces the error. Trivial errors that don't have any effect on the results or conclusions are still nice to know about for the next time a similar study is performed.

    Sorry, but I don't see how merely reproducing other people's mistakes can improve our scientific understanding, as opposed to looking deeply enough into the method to understand if there are any errors present. I guess what I am trying to say is that replication ultimately is about simply checking what others have done so as to try and find mistakes.

    Sorry, but I don't see how merely reproducing other people's mistakes can improve our scientific understanding

    Because if we’re trying to understand a physical system, the evolution of which is largely described by a set of equations that are well understood and accepted, then reproducing results provides confidence that the understanding is correct. You don’t need to necessarily have another person’s code or data to be able to assess if what they’ve presented is credible.

    as opposed to looking deeply enough into the method to understand if there are any errors present.

    Except, if others can show that alternative (or the same) analyses produce a completely inconsistent answer, then you don’t necessarily need to know what the mistake was to be fairly sure that it’s wrong. You don’t need to convince the original authors that they’ve made a mistake; you simply need to present an analysis that suggests that their result really can’t be right.

    I guess what I am trying to say is that replication ultimately is about simply checking what others have done so as to try and find mistakes.

    And I'm suggesting that this isn't necessarily a particularly good use of resources. If many people carry out research independently (or semi-independently) to try and understand a system, and get results that are consistent, that provides confidence that our overall understanding is reasonable. If someone's results are an outlier, you could either check their code/data carefully to find an error, or other groups could simply illustrate that it is inconsistent with their analysis of the same system.

    I’m, however, not specifically arguing against careful replication of specific results (using their data/code) but suggesting that it isn’t always required and may not – in some cases – be a particularly efficient way to develop understanding.

  103. In Mathematics research, a critical part of the process is exactly for lots of people to “simply check what others have done so as to try and find mistakes”.
    It can take many months of painstaking work. Looking for errors is hard work and not very glamorous at all.

    I should add that I'm certainly not suggesting that there is never a scenario where carefully checking what others have done is the right way to proceed. However, that this might be true in some cases does not make it true in all.

  104. MartinM says:

    In Mathematics research, a critical part of the process is exactly for lots of people to “simply check what others have done so as to try and find mistakes”.

    I don’t see how that’s comparable, really. Maths is deductive; a proof is either right or wrong. Most scientific research will inevitably involve assumptions, judgements, and methodological choices that aren’t objectively right or wrong, or at least not demonstrably so. That’s precisely why it’s so important for others to approach the same problem in a different way, with their own set of assumptions, judgements, and choices. You can always point at any study and identify places where the authors’ method could lead them astray. That’s McIntyre’s favourite (only?) trick. But do the issues identified actually matter? That’s another question entirely, and one which is ultimately best addressed by showing that the results are robust over a wide range of methodological choices.

  105. Raff says:

    I read above about the harassment of Jones using FOI requests for 5 random sites. Can someone point me to a good description of the overall harassment of Jones/CRU? Thanks.

  106. lerpo says:

    The problem with replicating results rather than auditing (if your goal is to ensure that "we don't know nuffin") is that you end up contributing to our understanding. Take the case of the O'Donnell et al. (2011) response to Steig et al. McIntyre and O'Donnell were furious to find that they'd been tricked into presenting their most likely result.

    The narrative changed from "O'Donnell has a different method, therefore Steig is wrong" to "O'Donnell et al. confirm the findings that the entire continent was warming, on average, prior to early 1980s, that areas with little sea ice are always areas of surface warming in the Antarctic, etc, etc. The only point of disagreement is in winter, in the earlier part of the record. Further research is warranted where disagreement exists." – http://www.realclimate.org/index.php/archives/2011/02/west-antarctica-still-warming-2/

  107. tlsmith says:

    Ideally, of course, the reviewers will have done the checking for mistakes and decided whether the issues identified actually matter.

    Btw I followed the link to Dorothy Bishop’s blog, it is well worth a read imo, as well as their Nature paper.

    Actually, her most recent blog post talks about statistical errors that appear to be very prevalent in medical research. I think someone with statistical knowledge (like McIntyre???) probably does have something to offer in pointing out where a particular statistical methodology could lead people astray, because none of us knows everything, and some of us less than others 🙂

    I think someone with statistical knowledge (like McIntyre???) probably does have something to offer in pointing out where a particular statistical methodology could lead people astray, because none of us knows everything, and some of us less than others

    I'm sure that there are many who could do this, including some who are probably already doing so.

  109. Michael Lloyd says:

    @Raff

    See the Sir Muir Russell report, p 86 onwards.

    http://www.cce-review.org/pdf/FINAL%20REPORT.pdf

  110. dikranmarsupial says:

    I think one point is often overlooked: there is a cost associated with making code available, because it needs to be engineered sufficiently that the researcher won't be unduly taxed by support requests from users (I speak from experience), and that takes time and effort. Also, research is not well funded, such that excellent projects don't get funding simply because the money runs out before you get far enough down the ranked list of projects at each panel. This means there is ever increasing pressure on researchers to be more and more productive with less and less funding, and spending time making code available is not generally rewarded. At the end of the day, if society wants source code, then society needs to meet the costs involved. I really want to provide code for my work, because I want other researchers to actually use my methods to solve problems, and I want to make it as easy as possible for them to do so. However, it is not a cost-free exercise (unless you are willing to give pretty much a total cold shoulder to anyone asking for support).

    For the project I am working on at the moment (not climate related), the code is all pretty straightforward (using existing machine learning libraries), but it takes a few hundred processor years to generate the results, and the intermediate data takes up about 3 TB of disc space. So I could release all the software for the project, but what would be the point, given that nobody is likely to run it and check my results? As ATTP suggests, someone checking it by reproducing the results using their own machine learning toolbox would be a far more useful exercise. I suspect the same goes for GCMs, which are also computationally very expensive and generate a lot of data; where are you going to find the resources to re-run experiments that aren't needed for new research?

  111. > I do not think you can equate the situation in economics to that in climate science.

    You asked for a counterexample, Snarkrates, and I don't think you can special plead your way out of it. There are legions of counterexamples, and "but it's not science" won't cut it either.

    This is a better set of points, which applies not to replication, but to reproducibility:

    I still draw the line at sharing code. Code should be mercilessly “audited” internal to a research group prior to publication–after all, that is who has the greatest interest in the correctness of the publication. I’m all for freezing the code and archiving it, and even doing so with the journal. Sharing code outside–unless the code has been thoroughly vetted and standardized–is a recipe for error propagation.

    There would be a case for internal audits. Once the code (and data) gets to the journal, however, authors should not have any more say over it. In return, the problem gets shifted to the journals, and authors should be freed from having to deal personally with any of this.

    It could, in principle, be possible to audit code without having its source; we do that all the time with checksums. The more scientists use standardized libraries, the less we'll need their own code.
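
    (A minimal sketch, in Python, of the checksum idea: hashing an archived artefact so anyone can verify they hold exactly the bytes the authors deposited. The file name and the placeholder hash are hypothetical.)

    import hashlib

    def sha256_of(path, chunk_size=1 << 20):
        # hash a file in chunks, so large archives need not fit in memory
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    # the hash below stands in for the value printed in the paper or deposited with the journal
    published_hash = "0000000000000000000000000000000000000000000000000000000000000000"
    if sha256_of("analysis_code_v1.tar.gz") == published_hash:
        print("Archive matches the published checksum")
    else:
        print("Archive differs from what was deposited")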

    Error propagation is indeed a problem. This is why God created versioning systems. Which leads us to a fork: either scientists run packages and plugins and create turnkey code, or scientists go rogue and face their programming amateurism. If they are ashamed of their programming styles, let them hire editors.

    ***

    None of this disputes the principles of reproducibility and openness. But as Marco's examples show, the Devil lies in the details. So if scientists want auditors out of their sight, they had better start pressuring the institutions that are supposed to protect them.

    The first institution that needs to change is the one that is responsible for the lichurchur.

  112. snarkrates says:

    Willard,
    OK, let me get this straight. In your eyes, saying that economics and physics have different degrees of objectivity and politics is special pleading? Really?

    You do know that the U. of Chicago Economics Department’s motto is “That’s fine in practice, but how does it work in theory?”

  113. > In your eyes, saying that economics and physics have different degrees of objectivity and politics is special pleading? Really?

    As far as it follows a “show me an audit” challenge, you bet it is.

    Since you picked physics:

    When science writer Vito Tartamella noticed a physics paper co-authored by Stronzo Bestiale (which means “total asshole” in Italian) he did what anyone who’s written a book on surnames would do: He looked it up in the phonebook.

    What he found was a lot more complicated than a funny name.

    It turns out Stronzo Bestiale doesn’t exist.

    In 1987, Lawrence Livermore National Lab physicist William G. Hoover had a paper on molecular dynamics rejected by two journals: Physical Review Letters and the Journal of Statistical Physics. So he added Stronzo Bestiale to the list of co-authors, changed the name, and resubmitted the paper. The Journal of Statistical Physics accepted it.

    http://retractionwatch.com/2014/10/09/should-papers-be-retracted-if-one-of-the-authors-is-a-total-asshole/

    27 years – does anyone have more?

  114. John Mashey says:

    dikranmarsupial is quite right.

    Again, back in the era when Bell Labs was 25,000 people in R&D:
    1) We used to say "monopoly money is nice".

    2) We were in one company, which minimized barriers to sharing, although there were divisions that necessarily had more restrictive rules.
    People wrote masses of internal Technical Memoranda, each with a To: list and set of topic codes, managed through the (awesomely good) internal library. Each person had a profile, and if subject codes matched or if you were on the To list, you got a cover sheet with abstract. If you wanted to see the full memo, you circled your name/office code and threw it in your mailbox, and a few days later the memo would appear there.

    3) Code was traded around internally all the time.
    People may have heard of UNIX. 🙂 In the early 1970s, a “release” was that you drove to Murray Hill, got on to Ken and Dennis’ PDP-11/45, and copied current state of source and binaries, which came without warranties, support (these guys were paid to do research), version control or any of that … It just happened to be really, really good code by a group of people who were not only fine computer scientists, but also great programmers (which is not always true).

    Of course, they *wanted* to see their code get used, just as John Chambers wanted S to get used, but such programs are different from programs written to do research papers.
    Physics or chemistry researchers at MH didn't care so much, and even with monopoly money, nobody was going to pay for serious software engineering. If one of those groups got a higher budget, they'd hire another PhD researcher.

    If it turned out some code was of broader use, then maybe it would get taken over by some support group, which of course also happened with UNIX, since some people needed to use it in production, hence regression tests, upward compatibility, etc … not of interest to computing research.

    Many of the "genes" of modern software engineering tools and methods got developed there during that time. (For example, if you trace the ancestry of modern version control systems, you end up back at Marc Rochkind's/Al Glasser's Source Code Control System, with a lot of discussions between my office and the one next door. The projects within our thousand-programmer organization ended up using SCCS. Individual researchers didn't.)

    4) One more time: the appropriate level of software engineering varies from minimal to vast, and even really rich organizations allocate different amounts of resources to it.

  115. snarkrates says:

    Willard,
    Was Hoover’s paper incorrect? I really don’t care if one of the authors is a “total asshole”. I care whether what said asshole publishes is correct and useful. They can name their genitalia and put it on as a co-author for all I care.

    You have yet to show where an audit would have made a difference. Perhaps in medicine, but most of the data are considered proprietary in that field in any case, and so would not be available for an auditor to peruse.

    Look, if a result is of sufficient interest for others to replicate it, or even to use it for their own publications, it will be found out if it is wrong. If it isn't of sufficient interest for others to use, why do we care?

  116. > I really don’t care if one of the authors is a “total asshole”.

    In return, I don't really care whether you care or not, snarkrates. The more you drag out your special pleading, the more examples I can add to this thread. The more examples I can add to this thread, the more I can substantiate the idea that the concept of auditing is rock solid, and applies to just about any endeavour that relies on intersubjectivity:

    Intersubjectivity is a term used in philosophy, psychology, sociology, and anthropology to represent the psychological relation between people. It is usually used in contrast to solipsistic individual experience, emphasizing our inherently social being.

    https://en.wikipedia.org/wiki/Intersubjectivity

    In that case, the audit simply consisted of picking up a phonebook. In other cases, it consists of checking citations. Here's a recent one:

    > The phrase is by Prof Katzav, from whom I got the idea.

    This shows you can only pretend to have read Katzav's article, Editor.

    His very first mention of severe testing cites pages 20 and 62. Do you really think Kat[z]av would have written “only if systematic attempts to falsify or severely test the system are being carried out [Popper, 2005, pp. 20, 62]” if it wasn’t in Popper?

    The first page is a dud, even if it announces something promising in section 20. The second page is one of that section 20, and is only relevant because of this:

    How degrees of falsifiability are to be estimated will be explained in sections 31 to 40.

    You’ll never guess to which chapter these sections refer [i.e. the one I mentioned earlier to the Editor].

    There’s the word “degrees” in its title.

    While Kat[z]av has been mostly unhelpful, there are more than 30 occurrences of "severe" in that book. You can search yourself: I gave you a PDF of the same edition Kat[z]av cites.

    You can’t even pretend having read Popper anymore.

    ***

    Do constant failures to pay due diligence to one’s own citations exemplify why the policy debate has gridlocked?

    https://judithcurry.com/2016/01/28/insights-from-karl-popper-how-to-open-the-deadlocked-climate-debate/#comment-761337

    This simple audit shows that Katzav failed to pay due diligence to his lip service, and that the Editor misled his readers in pretending to have read both Katzav and Popper. Editors do audits like that all the time, contrary to what appears to be the norm in the lichurchur.

    ***

    Please assure me you never play poker with real money, Snarkrates.

  117. snarkrates says:

    Checking references is part of peer review. One does not need an "audit". It is already part of the scientific method. That some reviewers fail to do a thorough review is a known issue, and yes, crap does get published. However, it will be short-lived unless it is of no use to the community, because some researcher somewhere will figure out that there is an issue.

    Science has been working like this for 300 years, Willard, and the results are pretty spectacular, would you not agree?

    The publication record of the auditors… not so much.

  118. dikranmarsupial says:

    Just for clarification, what exactly is the difference between “auditing” and the sort of post-publication peer review that already goes on in science (e.g. comments papers, which usually require reproducing the work in the paper being commented on)?

  119. mt says:

    I – KEEP YOUR LAWYERS OUT OF MY NOTEBOOKS

    The question (as a publicly funded scientist, which these days I'm not, but please bear with my scenario) is this: at what point does my monkeying around in Matlab at State U become something that a Mr J Random Harasser is entitled to look at?

    The FOIA principle seems to be that if I did it on state equipment, it is everyone’s to examine and mock the instant I did it. Certainly the instant I use email or a telephone to discuss it.

    People who can’t imagine the chilling effect that has on scientific exploration seem to lack any imagination whatsoever. Making every little mistake potentially career-threatening may be how politics works these days but that’s no reason to export it to science, where it’s not the bad moves but the good ones that matter.

    ===

    II – EXPECT SCIENTISTS WRITING CODE TO LEARN HOW TO TEST AND DEMAND PLATFORMS THAT ALLOW IT

    On the other hand, if my computational experiment holds water and leads to me making an assertion in a formal publication, it costs me nothing to make my Matlab routine public, and indeed the entire review process (or whatever we replace it with) ought to insist upon it.

    People here have been joking that it shows nothing to run the same code on the same data and get the same answer.

    Here I tediously insist upon an important technical point that is somewhat tangential to the point of this discussion. It does show something, because in scientific coding this anticipated triviality is not the usual case.

    People using modern test-driven software development techniques in commercial enterprises take reproducibility at this level so much for granted that it’s not really explicit. But test-driven development in scientific software is dramatically harder. This is because the same code will not pass the same tests every time, even if it is logically correct.

    This is because science is constrained to using floating point numbers, and so minor compiler optimizations can change results without changing whether the results are valid.
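
    (A minimal illustration, in Python, of the underlying issue: floating-point addition is not associative, so merely changing the order of a sum – as an optimizing compiler or a parallel reduction might – can change the last bits of the answer without making it any less valid. The numbers are made up.)

    import random

    random.seed(0)
    values = [random.uniform(-1.0, 1.0) for _ in range(100000)]

    forward = sum(values)             # accumulate in the original order
    backward = sum(reversed(values))  # the same numbers, summed in reverse order

    print(forward == backward)        # very often False
    print(abs(forward - backward))    # a tiny, but nonzero, difference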

    I can go on at length about how and when this sort of thing bites the practitioner in practice.

    This is quite unnecessary. Compiler writers are rewarded for making things go fast; if they can't change the order of operations, they don't get to do their thing. But here they're doing more harm than good.

    I’d rather pay twice as much for a computation I can repeat. But that’s because I’m the rare person who knows something about software engineering AND something about high performance scientific software. Most scientists just think computers are unreliable and difficult.

    Let me put it this way – if you run the same code on the same data and you FAIL to get the same answer, does this cause you concern? I suggest that it should.

    III – PUBLISH WHEN YOU PUBLISH

    Which brings us back to the question of what is the public’s business and what is my own. Once I publish a figure, it costs me nothing to make the data and a script which generates the figure available for download. It costs me nothing, that is, except possibly the exclusive access to my data.

    The data should be mine and mine alone until I publish research based upon it. At that point it should be public, and everything between the raw data and the final figure should be public as well, with a pointer to the FTP site right in the paper.

    I realize there are cases where data is partly private or proprietary or even classified. The rule has to be more complex in such cases, admittedly.

    Those cases are not common in climate science, though, nor in many other disciplines.

    Where this doesn’t apply the rule I’d propose is simple. My correspondence (except regarding funding, which is a legitimate FOIA target) is my own. My notebooks, my scratch disk, my whiteboard, my chats with colleagues, my clever witticisms, and that stupid thing I thought was funny at the time that I wish I hadn’t said, these are all mine. The data I collect are mine until I publish.

    When I publish, every graphic or table that I publish should be reproducible to the last pixel. And the scripts to do that ought to be public.
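
    (A minimal sketch, in Python, of the kind of figure-regeneration script being described: read the archived data, redraw the published figure, write it to disk. The file and column names are hypothetical.)

    import csv
    import matplotlib
    matplotlib.use("Agg")            # render without a display, so the run is repeatable anywhere
    import matplotlib.pyplot as plt

    years, anomalies = [], []
    with open("archived_data.csv") as f:          # the data deposited alongside the paper
        for row in csv.DictReader(f):
            years.append(float(row["year"]))
            anomalies.append(float(row["anomaly"]))

    fig, ax = plt.subplots()
    ax.plot(years, anomalies, linewidth=1)
    ax.set_xlabel("Year")
    ax.set_ylabel("Anomaly")
    fig.savefig("figure1.png", dpi=300)           # same script + same data -> same figure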

    IV – FINAL GRUMBLE

    Compilers and platforms which don’t support backward bit-for-bit compatibility are unsuitable for this proposed standard and thus unsuitable for the progress of computational science. Unfortunately, those are the ones we’ve got. So even though you may support my rule, you should not expect me to comply with it overnight.

  120. MT,

    This is because science is constrained to using floating point numbers, and so minor compiler optimizations can change results without changing whether the results are valid.

    I did wonder if this is what you meant. So, yes, I agree that this is a valid issue.

    When I publish, every graphic or table that I publish should be reproducible to the last pixel. And the scripts to do that ought to be public.

    Interesting. As you can probably tell, I tend to disagree. Not because I think it would be wrong, but more because I can’t really see the point. Maybe it is a discipline dependent issue. I can’t see my colleagues really seeing this as all that crucial. On the other hand, there are various code and routines that do include all of this functionality, so maybe it is slowly happening. My own personal issue is that I dislike the idea of running some simulation and then simply running the output through some routine that I didn’t write. So, I tend to write my own analysis routines. They tend, however, to be pretty short and so I expect anyone moderately competent could redo it easily.

    I’ve also never had trouble redoing figures from other papers that I’m trying to reproduce, so maybe I just don’t see it as much of an issue. That may not be true for all disciplines.

  121. Kevin O'Neill says:

    There probably needs to be a distinction drawn between using observations and making observations.

    The first is typically data analysis and might allow for exact replication. The second is rarely going to allow for exact replication; one can only expect to reproduce the results within the combined uncertainties of the initial test and any subsequent test – regardless of whether this is done by the same or a different set of researchers using the same or different equipment.

    Statistical analysis can typically be replicated; measurements can typically only be reproduced consistent with previous experiments.
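
    (A minimal sketch, in Python, of the "within the combined uncertainties" test: two independent measurements are taken to agree if they differ by less than k combined standard uncertainties. The numbers are invented for illustration.)

    import math

    def consistent(x1, u1, x2, u2, k=2.0):
        # True if |x1 - x2| is within k combined standard uncertainties
        return abs(x1 - x2) <= k * math.sqrt(u1 ** 2 + u2 ** 2)

    print(consistent(9.81, 0.02, 9.78, 0.03))   # True: the results agree within 2 sigma
    print(consistent(9.81, 0.02, 9.60, 0.03))   # False: a real discrepancy to investigate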

    ‘Auditing’ in the general sense is obviously necessary. It’s already part of the scientific process. In a more limited sense, ‘auditing’ often degenerates into an ideologically motivated challenge to a particular result based on subjective choices and/or the highlighting of minor or inconsequential errors.

    If McIntyre and McKitrick had produced an alternative to MBH98 using different methods and justified those methods, McIntyre wouldn’t be referred to as ‘the auditor.’ The Nitpicker might have been more appropriate. There is a distinction between nitpicking and auditing; the line was most readily drawn by Justice Potter Stewart in Jacobellis v. Ohio.

  122. Raff says:

    Michael Lloyd, thanks for the link to the Sir Muir Russell report. I think I asked the wrong question. The harassment mentioned in the report is from 2009ish and I had the impression that CRU and Jones and maybe other researchers had been (or considered they had been) harassed for many years before that. Is this documented somewhere? Thanks again.

  123. > Checking references is part of peer review. One does not need an “audit”.

    Peer review is a kind of audit. People who are supposed to know what you’re doing try to check, validate or assess what you did. This includes all kinds of acts, among them checking citations. At least in principle. As you can see in the Kazlav case, checking citations is not something peers always do. In fact, it’s quite obvious they seldom do it. So you can’t even argue that peer review is enough. I could argue it’s not even necessary anymore, but I don’t need to for what follows.

    This point counters whatever card you might still have in your hand besides this True Scotsman, snarkrates.

    To return for a moment to your appeal to physics envy:

    There is strong evidence that the BIPM’s kilogram, the one mass in the world that cannot be allowed to vary, is varying.

    Philosophically speaking, such a statement is nonsense: it is impossible for the kilogram to be anything other than a kilogram. As Dr Richard Davis, who has worked in the laboratory’s mass department for much of his career, puts it, “Technically, if you cut it in half, it would still be a kilo.” In those circumstances everything else in the world would instantly weigh twice as much—at least when measured in kilograms.

    No one is cutting the kilogram in half. Only a handful of people have been allowed to see it, and fewer still to touch it. Even so, as experimental methods get more precise, and physicists work at ever smaller scales, the vanishingly slight variations in the kilogram have become increasingly inconvenient.

    At a meeting in Versailles in November, scientists will discuss whether the technology now exists to switch to a mass standard that cannot ever change. Is it time to redefine the kilogram?

    http://www.intelligentlifemagazine.com/content/ideas/tom-whipple/weight-almost-over

    I picked this example to show an important point: when a bunch of people have some skin in a common business, loose cannons get out of the way and the caravan moves on.

    Just like language, science is a social art.

  124. > Just for clarification, what exactly is the difference between “auditing” and the sort of post-publication peer review that already goes on in science […]?

    You’re doing one of the two right as we speak.

  125. snarkrates says:

    MT, I agree that the results of the analysis should be reproducible both on the same platform and across platforms, but it would be surprising to me if this wasn’t something a researcher checked long before submitting the publication. If the researcher has done this, would you agree that there is no value in having an “auditor” do it?

    Also, it may cost you nothing to make your code freely available, and indeed, sometimes this should be done. Efforts to replicate the results of one’s research should, however, be done independently, and that includes the replicators writing their own code. Further, in making code available to the general public you run the risk of someone who isn’t familiar with the code and its limitations applying it well beyond its validity and getting “confusing” results.

    Dikran Marsupial, I have defined “auditing” as it has occurred to (not in) climate science – a novice with no understanding of the science, but a reasonable understanding of the statistics, demands you turn over everything that had any connection with the research while accusing you of scientific fraud. Said auditor then proceeds to try to “reproduce” the result, again with copious accusations of fraud and incompetence, blames you for his failures, ultimately gets close enough, pronounces your research inferior and never publishes any of his efforts in the peer-reviewed literature.

    Science, on the other hand, involves multiple stages of review of methods and results–internal to one’s research group, then one’s research institution, a peer-review by colleagues selected by the journal one has submitted the paper to and finally by the entire community of scientists doing similar research who decide whether your research is replicable, valuable and worth stealing.

    Science works and has done so for 400 years. Auditing…?

  126. > I think I asked the wrong question.

    I think you’re about to ask more questions:

  127. The auditor asked his fan to make these FOIA requests

    Dear Mr Palmer,

    I hereby make an EIR/FOI request in respect to any confidentiality agreements restricting transmission of CRUTEM data to non-academics involving the following countries: [insert 5 or so countries that are different from ones already requested]

    1. the date of any applicable confidentiality agreements;
    2. the parties to such confidentiality agreement, including the full name of any organization;
    3. a copy of the section of the confidentiality agreement that “prevents further transmission to non-academics”.
    4. a copy of the entire confidentiality agreement,

    I am requesting this information for the purposes of academic research.

    Thank you for your attention.

    Yours truly,

    yourname

  128. > Science works and has done so for 400 years. Auditing…?

    Mesopotamia:

    http://www.lse.ac.uk/accounting/pdf/swag2012plenary6.pdf

  129. Ken Fabian says:

    I’d like to put in a good word for deferring to experts and appealing to authorities. Even if it’s highly desirable for those within a field to be personally confident they are building upon sound foundations and make the effort to apply skepticism and review the relevant work of prior others, responsible decision making requires a default position of acceptance of the validity of the work of other (credentialled) experts. With some caveats – such as that the expert advice conforms to professional standards and that, where a conclusion is reached that is not in line with the mainstream/consensus, this is made clear.

    I don’t think it’s naive or irresponsible to take the expert advice to be valid as a default position, or to expect the boards of review, standards committees (or whatever they get named) and the processes and procedures (rather than individuals reviewing/replicating/reproducing) to do most of the checking, as admirable as individuals doing such review can be. I expect the professional application of skepticism by individuals to be directed more to the specific – to the review of newly published work, or of the work of others they intend to build upon, for example. I think the prior reviews by others collectively add up to a review of the whole and under most circumstances that can be counted as trustworthy.

    I think it’s unreasonable to expect any individual to be competent to thoroughly review a large, multifaceted body of knowledge like climate science; are you going to redo every prior experiment, test the accuracy of every instrument used, check that no typos or undocumented revisions happened with every text, paper or letter? Trust does come into it and I think our institutions of science, especially the long-running ones like the UK’s Royal Society and the US National Academy of Sciences, have earned it. The procedures and processes – and the institutions as well – go a long way as grounds for trust. Having records of prior works and prior reviews leaves the door perpetually open to revisit and review, and that puts limits on malpractice and slipping standards, but even if professionals choose to personally review relevant prior works it’s a mistake to withhold acceptance of the validity of the work of others and treat it provisionally as wrong whilst doing so; it constrains the ability to make responsible decisions.

    To misrepresent the work of others, or to fail to make clear that an expert opinion or conclusion is not in keeping with an overwhelming majority of peers, looks like a breach of professional standards to me. Laypeople can believe and say what they like, but when you act in a professional capacity you no longer have any such right. Those holding positions of trust and responsibility also have no such right.

    Withholding acceptance of newly published work during an effort to review it seems like good practice, but to withhold acceptance of the existing, broader body of knowledge – the work of prior others – is not; on the contrary, for those in positions of trust and responsibility there’s a lot of legal precedent that failure to take on board expert advice can – should harms arise – lead to charges of negligence. Specific legal requirements, like there are around engineering or medicine, have yet to emerge around climate expertise – the Dutch court decision is an early sign that they will come – but I’m sure that people who do hold positions of responsibility and for whatever reasons are seeking to avoid it are aware that ignoring expert advice can have legal consequences.

    Isn’t the real political and commercial importance of credentialled contrary viewpoints in providing such people with grounds to dodge responsibility and justification for failing to heed the mainstream expert advice? “The science is uncertain and in dispute” makes a credible justification – as long as some credentialled experts are willing to lend their names to it – for ignoring climate responsibility. What also aids this responsibility-dodging is that, when it comes to climate change, taking action and refraining from action are inverted: continuing unconstrained GHG emissions, whilst actually the continuation of strong, planet-altering action, has come to be seen as refraining from action, and efforts to constrain emissions have come to be seen as the taking of strong action. “Let’s not be hasty” is not a position that serves well when a strong action, one with cumulative and irreversible consequences, is ongoing.

  130. mt says:

    Snark, I am an open source guy. If someone thinks my code sucks and he wants to tell the world why, he should have as much right to my code as my favorite admiring student does. The key principle of open source is that one doesn’t ask “why”.

    Does the “auditor” add value?

    Well, in the case of him whom we usually call The Auditor, who shall not be named, on net, I’d say he doesn’t. But if it were clear what he could have of mine (my code and data leading to publication) and what he can’t (my failures and blind alleys, my tasteless jokes, my expressions to friends of a heartfelt desire to punch a certain fellow Wisconsin alumnus in the face, etc.) he would do a lot less harm.

    As for this “analysis should be reproducible both on the same platform and across platforms, but it would be surprising to me if this wasn’t something a researcher checked”: it is very much the case that this would surprise many nonscientists, but such checking very rarely happens. Scientists are naive and backward about software development, compiler vendors are deluded about what scientists need, and reproducibility is still a fringe movement in science, even more fringe than the Open Science movement, perhaps because its underlying principles are so mundane.

    But the fact that reproducibility is mundane strikes me as no excuse for not doing it.

  131. angech says:

    “Which brings us back to the question of what is the public’s business and what is my own.”
    Everything defined by law as being the public’s business is the public’s business.
    A second division is basically between thinking and acting.
    As long as you do not act physically by downloading information, uploading material, blogging or reading blogs, typing, writing or photographing stuff [I am sure there are more] you are free.
    Anything in your head is your business.
    Different countries have different laws, but somewhere they all have a law to obtain any documents you have if there is a possibility your work is the public’s business.
    I don’t like it either, but there it is.
    Free thought but not free speech, in public and private.
    Communicating scientific ideas is an act.

  132. The notion that reproducibility is lax is only true in a very narrow sense. At the basic level of measurement it has never been of greater interest or received as much attention. Interlaboratory Comparisons, ISO Accreditation, and Proficiency Testing are at the core of any laboratory worth the name. Over the past 30 years ‘measurement uncertainty’ has gone from an esoteric idea to become a topic that is commonplace in both industry and government.

  133. dikranmarsupial says:

    MT wrote “it costs me nothing to make my Matlab routine public”

    I can tell you from experience that this is not true, at least if your MATLAB routine is useful for something other than merely reproducing the particular set of results in your paper. You will get requests from users for help and advice, bug/feature fixes (for stuff that affects their application, but not yours), etc. Unless you are willing to ignore such requests, there is a cost. It is also advisable to write some user documentation for the software (which you personally don’t need), which is another cost. This is not dependent on how good you are at software engineering (all software is constructed to a budget anyway).

  134. dikranmarsupial says:

    snarkrates, the competence of the auditor ought to be irrelevant to its acceptability. As scientists we should be open to technical criticism from any source, as we ought to be engaged in a search for scientific truth and all forms of technical scrutiny potentially reveal flaws that could be usefully corrected.

    The accusations of fraud etc. are bad behaviour, but that is independent of the auditing, and I suspect those that make them would do so anyway. It is the bad behaviour that should be criticised, rather than the auditing.

  135. dikranmarsupial says:

    Regarding “As for this ‘analysis should be reproducible both on the same platform and across platforms, but it would be surprising to me if this wasn’t something a researcher checked’”:

    The project I am working on at the moment has taken a couple of hundred processor-years to run so far. Just how do you think I should check its cross-platform correctness (please could someone provide me with second and third high-performance computing facilities, one based on Windows machines and the other on Macs, with different versions of MATLAB and Octave, oh and the power to run them)? I suspect similar problems crop up in other branches of science, but this demonstrates that even for a branch of computer science, this is an unreasonable expectation.

  136. snarkrates says:

    Willard, now you are just being silly. Science does not need auditing. Auditing adds nothing–much as you are doing with this discussion.

  137. snarkrates says:

    MT,
    I am also for open science, but in order for it to be science, you have to control for errors, especially systematic errors. Sharing code, especially when the code makes its way into the hands of people (skilled or not) who apply it far beyond anything its authors had in mind, is a recipe for propagating systematic errors.

    Raw data–anyone can and should be able to access it.
    Thorough description of methodology–should be available to anyone.
    Processed data and code–should be repeatedly checked by the research group involved and frozen and archived when the research is published. Sharing said data should be at the discretion of the authors.
    If code is to be released to the public (and yes, I favor this), it should be validated and the documentation should reflect its limitations and range of applicability.
    The reason I think this is that the above precautions are more likely to control errors and result in good science.

    When it comes to anti-science or fake science, we owe the practitioners thereof nothing–not even courtesy. If someone has demonstrated a history of hostility to science–like the Discovery Institute or the Auditor–then nothing good can come of interacting with them.

  138. Sam Schwarzkopf says:

    Thanks for your post, Physics (may I call you Physics?). This actually covers more or less what I’d been thinking of writing a post about as well, so maybe I won’t now ;)…

    This all sounds pretty reasonable to me. As I keep saying, but various parties are hell-bent on misapprehending, all data needed to reproduce a finding should be made publicly available at publication (or before) – provided that it’s ethically acceptable. For human work, participant confidentiality, especially if they are patients, must always trump scientific openness. But usually the data needed to reproduce a finding can be completely anonymised. The question of data access is therefore completely irrelevant and that whole discussion is a red herring.

    …a number of the usual suspect kindly turned up in the comments to illustrate some of what they were trying to suggest.

    Yes the past week had my irony detector running at full steam. Bizarre how certain people don’t get how they are proving their critics right by their behaviour.

    We start to trust a scientific result when it has been replicated and reproduced sufficiently.

    I also like that (I think) you make a distinction between replicating and reproducing. In these discussions the two are often conflated (say in the Reproducibility Project for psychology). While I admit this is a semantic choice, we do need to distinguish between reproducing a result using the same data and methods and replicating it by doing the same (or a similar) experiment. In the long run, the latter is much more important.

  139. Sam,
    Thanks for the comment.

    Physics (may I call you Physics?).

    It’s a good deal better than what many others call me 🙂

    I also like that (I think) you make a distinction between replicating and reproducing.

    I must admit that I’ve managed to get slightly confused as to which term means which, but at the end of the day, I do think that we need to distinguish between a detailed reproduction of a result using the same code/data and an attempt to get the same result using something similar, but not necessarily the same data/code.

    I have slightly softened my stance on the former. Partly because I think there are many different ways to contribute to our improved understanding, and even though some kind of detailed audit may not be how I would expend research resources, it isn’t without value. Also, as MT points out, it can be important to show that – for example – a code produces the same result irrespective of the compiler options/platform.

    Yes the past week had my irony detector running at full steam.

    Mine’s been running full steam ever since I started this blog. It’s almost got to the point, sadly, where it’s pretty much what I expect.

  140. dikranmarsupial says:

    Mine keeps getting sent back to the factory for repair/recalibration

  141. Vinny Burgoo says:

    dikranmarsupial, I’d send it to a science museum instead. It appears to have broken the laws of physics.

  142. dikranmarsupial says:

    :o) Perhaps they should stop making the pointers out of lead?

  143. MartinM says:

    I think it’s unreasonable to expect any individuals to be competent to thoroughly review a large, multifaceted body of knowledge like climate science; are you going to redo every prior experiment, test the accuracy of every instrument used, check that no typos or undocumented revisions happened with every text, paper or letter?

    Funny you should mention that, since that does appear to be pretty much what Judith Curry’s advocating now. Forget everything climate scientists have ever discovered and start over. 19th Century, here we come!

  144. Martin,
    It’s a “we don’t like the answer, let’s start again” argument.

  145. dikranmarsupial says:

    As if that would fix anything. As soon as we got back to the point where the consensus would be that anthropogenic climate change is likely to have substantial negative consequences, and that *is* what would happen, it would be “let’s start again”. More cunctatory rhetoric.

  146. BBD says:

    Groundhog Physics…

  147. anoilman says:

    MT: Just ’cause it’s reproducible doesn’t mean you have the whole picture;

    But it is the first step, especially with science and physics.

    If something is off, I try to understand why. Actually, even if it is reproducible… I still check. Maybe poke and prod it a bit to make sure it squirms just right.

    Over years I’ve developed a bad case of bracket-itis, and I’m very careful. Often I worry about making my stuff work on incompatible platforms. Most people don’t.

    [Some guy gave me an algorithm that had an 8 bit signed char, type cast to a 16 bit unsigned. (What kind of numpty does that?) Seriously.. that just isn’t defined in most compilers, let alone documented in any specs. I wound up faking out what his compiler did.]
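
    For the curious, here is roughly what that sort of reinterpretation does to a value, sketched in Python (since no single language is in use in this thread) and with the number invented for illustration:

        # Rough illustration of why reinterpreting a signed 8-bit value as a
        # wider unsigned one is hazardous: the numeric value silently changes.
        value = -42                 # what the algorithm "meant"
        as_uint16 = value & 0xFFFF  # what a cast to a 16-bit unsigned yields
        print(as_uint16)            # 65494, nothing like -42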

  148. > Auditing adds nothing–much as you are doing with this discussion.

    That’s too generous, for I mostly remove things from discussions.

    In fairness, I do admit that I added GREMLINS as a programming language and a link to the T account of the Data Parasite. But then I defused your F word, your “slow student” inconsiderate remark, and Frederick’s “excuse.”

    I also added the concept of tolling, but then I refuted your “single result,” your “but a journal,” your “but it’s not science,” and your “but it’s not physics,” while defusing your hockey stick provocations

    In fairness, I also added versioning systems, checksums, the concept of intersubjectivity, and examples of auditing.

    As you can see, your pleading became more and more special. The problem now is that either you remain within your physics envy, or you go for the purest of all:

    Of course you can go do other things, like going full ad hom and hammer the table with proofs by assertion. Don’t worry, Snarky, I’m about to remove these last two.

    And then there’s history of science.

  149. John Mashey says:

    “I have slightly softened my stance on the former. Partly because I think there are many different ways to contribute to our improved understanding, and even though some kind of detailed audit may not be how I would expend research resources, it isn’t without value. Also, as MT points out, it can be important to show that – for example – a code produces the same result irrespective of the compiler options/platform.”

    “Same” is a useful adjective for some kinds of codes, like digital logic simulators.
    But “same” often means “within tolerance” or “significant digits same”.
    As I mentioned in an earlier example, some codes can be spectacularly dependent on low-order bits, give different results… and still be useful.
    Some protein-folding codes were very sensitive, and a poor decision early in the process could generate results far from reality. Other codes are relatively safe against modest differences of FP handling.

    Integers are real things, but floating point values in computers are not real numbers. 🙂
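
    To make “within tolerance” concrete, here is a small Python sketch (the numbers and tolerances are invented for illustration) of the kind of comparison a replication check might use instead of exact equality:

        # "Same" as "within tolerance": compare two runs of a calculation
        # using relative and absolute tolerances rather than exact equality.
        import math

        run_a = [1.0000000000000002, 2.9999999999999996, 1.0e-12]
        run_b = [1.0, 3.0, 0.0]

        def same_within_tolerance(xs, ys, rel=1e-9, abs_tol=1e-9):
            """True if every pair agrees to the given tolerances."""
            return all(math.isclose(x, y, rel_tol=rel, abs_tol=abs_tol)
                       for x, y in zip(xs, ys))

        print(same_within_tolerance(run_a, run_b))  # True: equivalent results
        print(run_a == run_b)                       # False: not bit-for-bit identical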

  150. The Very Reverend Jebediah Hypotenuse says:

    Willard:

    That’s too generous, for I mostly remove things from discussions.

    Mostly you discuss discussions.

    You never meta-discussion you didn’t want to.

  151. The Very Reverend Jebediah Hypotenuse says:

    John Mashey:

    Integers are real things, but floating point values in computers are not real numbers.

    Numbers may be equal.
    Things cannot be equal.
    Therefore, numbers are not things. Even when the numbers are real. 🙂

  152. Michael 2 says:

    I contribute this idea:

    When I submit a pie chart and/or report up the chain of command, derived entirely or in part from a database using a SQL (Structured Query Language) query, I usually include the text of the query.

    The people that receive the report almost certainly do not understand SQL. I include the query for two reasons: (1) so that I can discuss the report and remember what query was actually used and (2) to show the chain of command that *I* understand SQL queries.

    If I do not include the query, the resulting pie chart or line graph might as well be a random number since it lacks foundation. The SQL query, understood or not, provides the *link* between raw data and the resulting graph: How did I get from *this* to *that*.

    It is up to the reader to learn SQL sufficient to understand and interpret the query. I provide it as a buttress of confidence that the graph is a representation of the data.
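
    As a sketch of that practice (table, column names and data invented for illustration; Python and SQLite used only to keep the example self-contained), the report generator can simply carry the query text along with the numbers it produces:

        # Sketch of "ship the query with the chart": the SQL text that produced
        # the summary is printed alongside the result, so anyone can see how
        # the figures were derived from the raw table.
        import sqlite3

        # A tiny in-memory example table; names and data are invented.
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE tickets (category TEXT, closed_date TEXT)")
        conn.executemany(
            "INSERT INTO tickets VALUES (?, ?)",
            [("network", "2016-02-01"), ("network", "2016-02-03"),
             ("printer", "2016-01-15"), ("accounts", "2016-02-10")],
        )

        QUERY = """
            SELECT category, COUNT(*) AS n
            FROM tickets
            WHERE closed_date >= '2016-01-01'
            GROUP BY category
            ORDER BY n DESC
        """

        print("Data behind the chart:")
        for category, n in conn.execute(QUERY):
            print(f"  {category}: {n}")

        print("Query used to produce it:")
        print(QUERY)
        conn.close()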

    It is possible that I make a mistake. If so, and it is noticed, what happens next depends upon the culture where you work. In highly competitive realms your enemies will leap on that error and your employment could end. Their own errors may be more egregious but better hidden. This kind of culture inhibits any kind of transparency and the organization will suffer inefficiencies because of it.

    Thank you for the kind words, Reverend. You’re assuming a thing called meta-discussion, whence the very idea of a discussion presumes these kinds of cues, at least for humans. While going meta all the way up helped Bertrand solve paradoxes and might have inspired type theory, the walls of separation are never quite clear in human affairs.

    There could be ways to substantiate Snarky’s claim, e.g.:

    At various machine learning conferences, at various times, there have been discussions arising from the inability to replicate the experimental results published in a paper. There seems to be a wide spread view that we need to do something to address this problem, as it is essential to the advancement of our field. The most compelling argument would seem to be that reproducibility of experimental results is the hallmark of science. Therefore, given that most of us regard machine learning as a scientific discipline, being able to replicate experiments is paramount. I want to challenge this view by separating the notion of reproducibility, a generally desirable property, from replicability, its poor cousin. I claim there are important differences between the two. Reproducibility requires changes; replicability avoids them. Although reproducibility is desirable, I contend that the impoverished version, replicability, is one not worth having.

    http://cogprints.org/7691/

    I think the author goes a bridge too far: if the journal to which he submits his paper can’t replicate his experiment, then I duly submit that the research (at least insofar as machine learning is concerned) may not be worth being declared peer-reviewed.

    If you don’t want anyone to look at your code, you have little choice but either to offer turnkey code that rests on public libraries over which you have no authority, or to develop binaries that could be checked by independent authorities, say with what I called checksums.
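
    For what it’s worth, the checksum idea needs nothing exotic. A minimal Python sketch (file names invented for illustration) of fingerprinting the artefacts behind a published figure:

        # Record a SHA-256 fingerprint for each artefact (code, data, output)
        # so anyone can later verify they are looking at exactly the files
        # the paper used. File names here are invented.
        import hashlib
        from pathlib import Path

        def sha256_of(path, chunk_size=1 << 20):
            """Return the SHA-256 hex digest of a file, read in chunks."""
            digest = hashlib.sha256()
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(chunk_size), b""):
                    digest.update(chunk)
            return digest.hexdigest()

        for name in ["analysis.py", "raw_data.csv", "figure1.png"]:
            if Path(name).exists():
                print(f"{sha256_of(name)}  {name}")
            else:
                print(f"(missing)  {name}")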

    It goes without saying that reproducibility is better than replicability. Science and solipsism don’t mix. It’s quite obvious that there are many issues regarding the extent to which replicability is possible according to the various research fields. To repeat my main point for the nth time, these are issues that belong to institutions, starting with those responsible for the lichurchur, which is arguably crap.

    Lichurchur created a mess and scientists are paying the price. We need more editors than we need auditors. When auditors are involved, companies usually aren’t doing very well.

    Oh, and here would be the earliest auditable document I could find:

    https://en.wikipedia.org/wiki/Euclid%27s_Elements

    I doubt anyone here really researched the epistemological issues of the concepts we’re discussing, so I really doubt I can be tossed around as the meta-guy.

  154. Pingback: Transparency | …and Then There's Physics

  155. verytallguy says:

    I really doubt I can be tossed around as the meta-guy.

    I sure never meta guy quite like you before!

    (I’ll get my coat…)

  156. Nice, Very Tall, but not so fast:

    Despite valid concerns ranging from the true reproducibility of experimental science to the logical inconsistencies identified by philosophers of science, experimental reproducibility remains a standard and accepted criterion for publication. Hence, investigators must strive to obtain information with regard to the reproducibility of their results. That, in turn, raises the question of the number of replications needed for acceptance by the scientific community. The number of times that an experiment is performed should be clearly stated in a manuscript. A new finding should be reproduced at least once and preferably more times. However, even here there is some room for judgment under exceptional circumstances. Consider a trial of a new therapeutic molecule that is expected to produce a certain result in a primate experiment based on known cellular processes. If one were to obtain precisely the predicted result, one might present a compelling argument for accepting the results of the single experiment on moral grounds regarding animal experimentation, especially in situations in which the experiment results in injury or death to the animal. At the other extreme, when an experiment is easily and inexpensively carried out without ethical considerations, then it behooves the investigator to ascertain the replicability and reproducibility of a result as fully as possible. However, there are no hard and fast rules for the number of times that an experiment should be replicated before a manuscript is considered acceptable for publication. In general, the importance of reproducibility increases in proportion to the importance of a result, and experiments that challenge existing beliefs and assumptions will be subjected to greater scrutiny than those fitting within established paradigms.

    http://iai.asm.org/content/78/12/4972.short

    There are no hard and fast rules regarding replicability, at least in practice, simply because it’s a judgment call most of the time.

    The only thing I’m quite confident to say is that one does not simply pretend to vet results he has neither replicated nor reproduced.

    The nexus of the two concepts we’re exploring should be specifications.

  157. jsam says:

    Tol getting all uppity about transparency is quite funny. I wonder when he’ll divulge the Global Warmers’ Propaganda S Funds donors?

  158. Andrew Dodds says:

    Aom –

    Indeed.

    Don’t trust operator precedence [nb should be able to, but code also has to be readable by someone less qualified in 5 years time]
    Don’t trust floating point implementation/precision.
    Don’t trust anything that relies on the manner of the underlying implementation.
    Don’t trust library code.
    Definitely don’t trust non-library code.
    Don’t trust unit tests. Especially if they pass. Even more especially if a whole set passes first time.
    Don’t trust manual tests.
    Don’t trust code that you wrote and tested as working last Tuesday.

    And remember that any bug released into the wild for more than two release cycles will become a feature to half your user base.

    Personally, I’m amazed that this computer has worked long enough for me to type this post.

  159. Michael 2 says:

    Andrew Dodds wrote “Personally, I’m amazed that this computer has worked long enough for me to type this post.”

    That was an extended chuckle out loud moment.

  160. snarkrates says:

    Willard, my bad. Let’s just say you’ve contributed nothing of value to the thread, ‘kay?

    Honestly, your posts make me wonder whether you’ve ever contributed a paper to a scientific journal.

  161. Michael 2 says:

    snarkrates writes “Willard, my bad. Let’s just say you’ve contributed nothing of value to the thread”

    You seriously don’t get it. Willard contributes something powerful that no one else contributes: how to argue. Y’all can feel as smug as a bug in a rug, but if you want to change someone else’s mind you must “up your game” a bit, and Willard shows the way. If you believe change is impossible, you might as well have a bit of fun, and Willard shows the way to that, too.

  162. anoilman says:

    Andrew Dodds: Deniers will say we’re not being skeptical yet we constantly distrust everything.

  163. anoilman says:

    snarkrates: Willard is utterly concerned with ‘how’ we argue, which is often BS on BS.

    While he’s quite crisp in his understanding and support for what we understand of Global Warming, he’s very even-handed about cutting apart BS arguments with a good dose of moderation.

    I would recommend that you figure out what you are doing wrong.

    ClimateBall more often than not ends like a scene from History of The World Part 1;

  164. Thank you for the kind words, Snarky.

    In return, here’s another project you might like, since it addresses one of MT’s concerns:

    Replication of scientific experiments is critical to the advance of science. Unfortunately, the discipline of Computer Science has never treated replication seriously, even though computers are very good at doing the same thing over and over again. Not only are experiments rarely replicated, they are rarely even replicable in a meaningful way. Scientists are being encouraged to make their source code available, but this is only a small step. Even in the happy event that source code can be built and run successfully, running code is a long way away from being able to replicate the experiment that code was used for. I propose that the discipline of Computer Science must embrace replication of experiments as standard practice. I propose that the only credible technique to make experiments truly replicable is to provide copies of virtual machines in which the experiments are validated to run. I propose that tools and repositories should be made available to make this happen. I propose to be one of those who makes it happen.

    http://arxiv.org/abs/1304.3674

    Or not, for it’s not really Science according to what you hammer so far, and it doesn’t appear in the lichurchur. You can switch to ad hominem mode if you please. It’ll make your new nick stick.

  165. snarkrates says:

    I’d support that effort, Willard. OTOH, if your result depends significantly on the platform you run it on, you probably aren’t ready for prime time.

  166. snarkrates says:

    An Oilman, I have little patience for people arguing about science when they clearly don’t understand how it is actually done. This goes for Feyerabend (who was a fricking moron), and it goes for Willard (who is not).
    However, what I have the least patience for is folks like the “Auditor,” who produce mainly anti-science while trying to make it impossible for the scientists to do their job.

    Willard doesn’t even give any indication of having actually read anything I’ve posted. I will say it in a couple of sentences.

    Independent replication means independent–starting with the same data, applying the same method as described and getting the same results. If you aren’t doing that–particularly if you can’t apply the method without having the code spoonfed to you–then you aren’t doing science.

  167. anoilman says:

    Willard appears obtuse largely because he’s commenting on the techniques employed most of the time, not so much the content.

    So… when he says you’re using an Ad Hominem..
    https://andthentheresphysics.wordpress.com/2016/01/30/research-integrity/#comment-72029

    Perhaps you could collaborate elsewhere because;
    “Bottom line: If you would like cooperation or collaboration, don’t be an arsehole.”
    https://andthentheresphysics.wordpress.com/2016/01/30/research-integrity/#comment-72021

    Or you could learn from someone in a different field about what they are on about. Tone your ego down, and try to talk constructively. I know. It’s hard sometimes.

  168. John Hartz says:

    When I first read the OP, I thought to myself, “This one is rather innocuous and will not attract a lot of chatter.” Boy, was I wrong.

  169. dikranmarsupial says:

    “I propose that the only credible technique to make experiments truly replicable is to provide copies of virtual machines in which the experiments are validated to run.”

    One wonders how we can be sure the emulator software for the virtual machine runs in a replicable manner on all platforms for all programs that might be executed on it (and that “software rot” never happens and the compilers will indefinitely continue to compile the VM {em,sim}ulator in a functionally identical manner etc.)? ;o)

    In principle this would be a good idea. The problem is that it will have a computational overhead, which would mean that the computational expense ceiling for our research is lowered and we then can’t do the experiments in quite as thorough manner as we would like, or we can only attack problems that are a little smaller (unlike our competitors that don’t use a VM).

  170. > Independent replication means independent–starting with the same data, applying the same method as described and getting the same results. If you aren’t doing that–particularly if you can’t apply the method without having the code spoonfed to you–then you aren’t doing science.

    This claim, besides not “doing science” and thereby self-refuting any authority claim it might carry regarding “doing science,” conflates replication with reproduction.

    Let’s illustrate the distinction with the warrant a proof may bring. Replicating a proof would be like going through every step of a demonstration to see if we understand the author’s sketch. Reproducing a proof would be more like getting the same theorem but using a (slightly, somewhat, very, etc.) different route.

    The two activities might be intertwined, but they carry opposite objectives. The point of replication is to get the “same” (i.e. not identical, but equivalent) results using the same means. The point of reproduction is to get similar results using similar but different means.

    Yet, the criteria for establishing that either one of them is met vary from one field to the next. Proofs are usually sketchy. Authors convey the main ideas well enough so others can fill in the gaps themselves. Sketchy proofs might one day disappear: we now have theorem provers like COQ (to name one name). However, proof theory does not even encompass all of logic, which means mathematics might always require human intervention.

    That said, Alphabet is beating master go players as we speak:

    http://bits.blogs.nytimes.com/2016/01/27/alphabet-program-beats-the-european-human-go-champion/

    Replication and reproduction reinforce one another. Sometimes, you can only get one. The best is reproduction, like Snarky implies, but peer-review is (or rather should be) about replication, contrary to Snarky’s proof by assertion.

    ***

    The very distinction between data and methods is itself quite messy: sometimes, the novelty of a research comes with the data (think archeology); sometimes it comes from the methods (think metrology); sometimes data are methods (think computer science).

  171. > The problem is that it will have a computational overhead, which would mean that the computational expense ceiling for our research is lowered and we then can’t do the experiments in quite as thorough manner as we would like, or we can only attack problems that are a little smaller (unlike our competitors that don’t use a VM).

    Indeed. We’re far from the victory chants we could hear years ago at the Auditor’s by MrPete. It still is something that would be worthwhile.

    A similar argument also applies to peer-reviewed lichurchur in general. The cost of journals is a great barrier for lots of less privileged institutions. Some may argue that the lichurchur still implements the colonial establishments from which it sprang.

  172. dikranmarsupial says:

    “The cost of journals is a great barrier for lots of less privileged institutions.”

    Given there are (prestigious) journals that are open access and free for both author and reader (e.g. jmlr.org), it is not clear why this should be true (as it is). Most of the (real) cost is the time of the reviewers and editors, which we provide for free already. Of course this does require the involvement of the top scientists in the field and good reviewing standards to make it work; you can’t just set up a website, call it a journal and expect it to work.

  173. mt says:

    “You will get requests from users for help and advice, bug/feature fixes (for stuff that affects their application, but not yours), etc. Unless you are willing to ignore such requests, there is a cost. It is also advisable to write some user documentation for the software (which you personally don’t need), which is another cost. This is not dependent on how good you are at software engineering (all software is constructed to a budget anyway).”

    Global warming is not the only part of modern life where things are more expensive than they appear to be on a budget. “I refuse to restrain carbon emissions on the grounds that my descendants might be able to afford the vastly higher costs of adaptation better than I can afford the costs of mitigation” is a way of wasting resources. “I refuse to release my code on the grounds that it might be useful to somebody else” is as well.

    It’s absolutely the case that people will misuse scientific code if it is released. Deniers abusing MODTRAN is a familiar example. It’s also the case that reasonable people may make reasonable demands on your time and attention that are not helpful to your own ambitions, which is something of a minor but perennial ethical dilemma. (This problem is well-known in the open source world.)

    None of it means that it costs you anything to put your code on an FTP site, and none of it justifies a refusal to do so on the grounds that somebody else might take the trouble to look at it!

    One wonders, sometimes, what the purpose of science is.

  174. MT,

    None of it means that it costs you anything to put your code on an FTP site, and none of it justifies a refusal to do so on the grounds that somebody else might take the trouble to look at it!

    I agree. I don’t think that one should refuse to release something because it might be misused. Has anyone actually made that argument on this thread?

  175. > It’s absolutely the case that people will misuse scientific code if it is released.

    People abuse it even when it’s not.

    ***

    > One wonders, sometimes, what the purpose of science is.

    To wonder about its purpose, at least sometimes.

  176. anoilman says:

    In general I agree with MT/Anders.

    The only caveats in my eyes are whether there are IP agreements limiting sharing/ownership. (Yeah, there are scientists/institutions like that out there.) I’ve already mentioned limits on data sharing.

    My experience in Industry is that closed access = poor quality and substandard learning in the long run. Even sharing bits and pieces can result in a much better understanding of what you’re doing.

    And no… you don’t have to listen to anyone who comments on what you’re doing. I’ve often contacted original open source developers and I’ve been surprised at just how supportive they are. Most people can’t contain themselves if someone is looking at their code and talking about it.

    The only other caveat I can think of would be instances where you’re just not doing anything that complicated, so someone else should be able to replicate it with the simple verbiage you provide.

  177. dikranmarsupial says:

    “I agree. I don’t think that one should refuse to release something because it might be misused. Has anyone actually made that argument on this thread?”

    Not me, I was just disagreeing that it has no cost, which it clearly does. Personally I’d view not responding to requests for support for the software I make available as a contravention of the “golden rule” and try to be helpful.

    MT wrote “‘I refuse to release my code on the grounds that it might be useful to somebody else’ is as well.” This is a pretty unfair characterization of what I actually wrote. Nowhere did I say anything about refusing to release code on *any* grounds. What I was arguing for is resources to release it in a form that is most useful to its likely users, in a way that doesn’t create undue additional workload for the author (perhaps MT didn’t read my initial comment on this). This is because I am not primarily interested in making code available for replication of my results, but precisely because I hope it will be useful to somebody else (who might find e.g. some user documentation helpful).

  178. dikranmarsupial says:

    “One wonders, sometimes, what the purpose of science is.”

    Partly to provide shoulders for other scientists to step on so they can see a little further (although sadly we can’t all be giants, though that is not a requirement to contribute to progress). Perhaps by providing software in a form that is useful to them, but without it being unduly at the expense of your own time for shoulder-stepping and further-seeing. One thing is certain, which is that replication is not the purpose of science.

  179. snarkrates says:

    How about this for a purpose for science: to generate reliable understanding of natural phenomena.
    Also, let me make a couple of things clear:
    1) I am not against sharing code. I am against the presumption that any moron has the right to demand my code and data.
    2) I do believe that code and data should be frozen at the time of publication and archived.
    3) I am against anything that prevents scientists from doing science.

  180. > [R]eplication is not the purpose of science.

    Breathing is not the purpose of porpoises either.

  181. Pingback: On transparency | …and Then There's Physics
