Without accessing the code, how can a reviewer confirm that the simulation results are correct?












31















I have been reviewing articles for various top-ranked journals and conferences for the last few years. After all this experience, I can tell you there is no way to confirm the correctness of simulation results. Therefore, I usually comment on the design, the procedure, and the mathematical and analytical parts.



In the results section, I can ask questions about why this or that is so, but how can I judge whether the simulation was really performed or whether these are just fabricated graphs?



This question came to mind because, on a few occasions during the review process, I have observed a reviewer ask for new results that, in my opinion, would require a lot of coding and effort to implement, yet the author responded within 7-10 days with the new results and an improved article.






























  • 1





    This question came to mind because, on a few occasions during the review process, I have observed a reviewer ask for new results that, in my opinion, would require a lot of coding and effort to implement, yet the author responded within 7-10 days with the new results and an improved article.

    – MBK
    2 days ago








  • 30





    Why would this be different from an author stating the result of an experiment? The experimental protocol should be given, but there would be no expectation that the reviewer would perform the experiment again to verify that it came out as the authors said. In either case, the author could be mistaken or dishonest, but at least in the experiment case that wouldn't be something the reviewer would know.

    – David Thornley
    2 days ago






  • 6





    @MBK: Regarding the case you mention in comments, it is surely very possible that the authors weren’t starting from scratch in implementing the reviewers’ suggestions, but had already independently considered those suggestions or something related, and so had a significant amount of the necessary code already written?

    – PLL
    2 days ago






  • 2





    @MBK Regarding the case you mention in comments, I suspect you're simply not as good a programmer as those you reviewed. I can see how it is hard to accept that others are faster programmers, even more so when these others are people you review (hold control over). It is human, but there is absolutely no reason to suspect foul play based on what you described.

    – Andrei
    yesterday






  • 9





    Even with access to the code, how can a reviewer confirm that the simulation results are correct? Unearthing subtle bugs (the sort that tend to produce incorrect rather than bizarre results) is a skill, and one you can't assume reviewers will have. I recall an off-by-one error in some of my code that cancelled in the final result (the difference between list indices was what mattered) until we repurposed the code; it was still a while before we caught the bug. Your last paragraph proposes fraud rather than error, and that's a very different situation.

    – Chris H
    yesterday



















10 Answers


















26














Do you have reason to doubt their veracity or good faith? Are their claims somehow not believable or highly questionable based on your knowledge of the field? If it isn't standard in your field to release code, I don't think a reviewer should necessarily demand it, regardless of your feelings about making code public.



The authors should describe their methodology sufficiently for someone else to replicate it; in that way, they are putting their reputations at risk that, were someone to duplicate their approach, they would find the same results. Fabricating results is a very serious accusation. There are some statistical approaches to test whether data are likely to be fabricated, but their efficacy depends on the sophistication of the fabrication, and that question is better suited to CrossValidated.
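To make that concrete, one common screen of this kind is a Benford's-law check on leading digits (just one example of such an approach, picked here for illustration; it can only raise or lower suspicion, never prove fabrication, and it is only meaningful for data spanning several orders of magnitude). A rough sketch:

    # Illustrative sketch only: a Benford's-law screen on leading digits.
    # The data below are synthetic, generated here just so the example runs.
    import numpy as np
    from scipy import stats

    def leading_digits(values: np.ndarray) -> np.ndarray:
        """First significant digit of each nonzero value."""
        x = np.abs(values[values != 0])
        return (x / 10 ** np.floor(np.log10(x))).astype(int)

    def benford_screen(values: np.ndarray):
        """Chi-square test of observed leading digits against Benford's law."""
        digits = leading_digits(values)
        observed = np.bincount(digits, minlength=10)[1:10]
        expected = np.log10(1 + 1 / np.arange(1, 10)) * observed.sum()
        return stats.chisquare(observed, expected)

    rng = np.random.default_rng(0)
    print(benford_screen(rng.lognormal(mean=0.0, sigma=3.0, size=5000)))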



If their work is meaningful in the field, at some point someone will implement their approach again. There is necessarily a bit of trust in science that people do what they say they've done. A reviewer should check that a paper's conclusions follow logically from its results, and that its methodology is described and sound, but they need not verify or replicate the results. As David Thornley points out in a comment on another answer, even simply running code tells you very little about those key factors; it only tells you whether there is a blatant lie or mistake.



























  • 22





    Do you have reason to doubt their claims? Surely that's the job of any researcher?

    – user2768
    2 days ago






  • 20





    @user2768 Of course I mean beyond basic skepticism. You should doubt their approach as presented: if they say we tested X by doing Y and you know Y is not the correct way to test X, that's a different kind of doubt than them saying we tested X by doing Y and you wondering if they actually just made it up instead of ever doing Y. The standard is to provide enough information for someone else to replicate; if someone was making a truly remarkable claim, there is more reason to demand to see their code than if they are showing an incremental improvement.

    – Bryan Krause
    2 days ago








  • 9





    Sometimes, even if the code is made available and the reviewer is willing to check it for correctness, it may be impractical to rerun the simulation. In my field, it's not uncommon for papers to report millions of CPU hours worth of simulation results. Surely a reviewer cannot be expected to spend a significant fraction of their yearly CPU time quota every time they review a paper. So I agree that a reasonable degree of trust is necessary.

    – Miguel
    2 days ago








  • 4





    @Miguel (Going OT) With such CPU-intensive computations, is it possible to generate a proof that the computations were performed correctly? (Bryan Krause and Phil Miller, nice revision.)

    – user2768
    yesterday






  • 3





    @user2768 It depends. The best benchmark is experiment, but it may not be possible to do it. The worst case scenario is that the simulation is "correct" but the underlying physical model is not. If the problem is important/interesting enough someone will try to reproduce the results using their own implementation or physical model (as others have mentioned here already). Generally, one may test e.g. a new interatomic potential by reproducing known experimental properties (density, self-diffusion coefficient) and then use said potential to predict unknown properties (e.g. reaction barriers).

    – Miguel
    yesterday



















20














You can't really judge if the simulation was really performed. That's why we've had things such as the Schön scandal - the reviewers of those manuscripts didn't detect the fraud either.



What you can do is apply the "smell test". Is this approach feasible? Are the results reasonable? Were there any glaring omissions? If you can't see any obvious problems with the simulation, that's good enough: the real peer review happens after publication.





































    14















    After all this experience, I can tell you there is no way to confirm the correctness of simulation results.




    That is not necessarily true. In some cases, it is very easy to discern that a graph cannot possibly be correct or at the least has been badly misconstrued or misinterpreted. I had such a mistake caught in one of my early papers and have caught them in several papers I have reviewed.



    It is not easy to prove that the simulations have actually been performed. However, the Open Science framework is designed to make it easier to verify results of both computational and experimental work.



























    • 4





      Your experience is about confirming incorrectness. It is still hard to confirm correctness.

      – norio
      yesterday






    • 3





      This looks more like a comment on one sentence of the question, rather than an answer to the main question.

      – D.W.
      yesterday



















    7















    there is no way to confirm the correctness of simulation results.




    Simulations should be repeatable; hence, correctness can be checked by re-running the simulation. Of course, the authors might not provide the necessary code, but then you can request it as part of the review process.
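    As a minimal sketch of what "repeatable" means here (the computation and the seed below are invented purely for illustration, not taken from any paper): if the paper reports the code together with the pseudorandom seed and sample count, a reviewer can re-run it and expect to reproduce the numbers.

        # Minimal sketch of a repeatable Monte Carlo run. The computation and the
        # seed are hypothetical; the point is only that reporting the seed (and
        # code) lets a reviewer re-run the simulation and compare numbers.
        import numpy as np

        def estimate_pi(n_samples: int, seed: int) -> float:
            """Estimate pi by sampling points uniformly in the unit square."""
            rng = np.random.default_rng(seed)       # seed reported alongside the paper
            xy = rng.random((n_samples, 2))         # points in [0, 1)^2
            inside = (xy ** 2).sum(axis=1) <= 1.0   # falls inside the quarter circle
            return 4.0 * inside.mean()

        # Re-running with the same code, seed, and sample count should reproduce
        # this value (up to library-version differences).
        print(estimate_pi(n_samples=1_000_000, seed=12345))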

























    • 4





      @MBK I don't recommend implementing; I recommend repeating. If the authors won't let you repeat (by denying access to code), then I'd be inclined to reject, but I'd consult with the editor.

      – user2768
      2 days ago








    • 10





      @user2768 So, if I were to run some simulations using software I didn't have a license to redistribute, I shouldn't be able to publish my results?

      – David Thornley
      2 days ago






    • 14





      @MBK Just running the same code on the same data tells you very little without actually examining the code to make sure it implements the algorithm(s) of the paper. It tells you that the author(s) didn't outright lie about the results, and that's all.

      – David Thornley
      2 days ago






    • 7





      @SylvainRibault In many areas of research it isn't possible, for various reasons, to share all the raw data involved. Should none of that research be published either? Should we save and distribute blood samples to anyone that wants to verify the results of a study of inflammatory biomarkers? What if the process of analysis destroys the sample? Trust is an integral part of academic research.

      – Bryan Krause
      2 days ago






    • 4





      @SylvainRibault Understood, but although you can argue in favor of open source software there can be good reasons to not distribute code freely; I think it's short-sighted to make that a hard "no" for publishing them, and pointing out that that level of replication is impossible in other fields that have no problem publishing them.

      – Bryan Krause
      2 days ago



















    4














    If I were reviewing a paper that relied heavily on some computational analysis but didn't supply the code, I would reject that paper unless the authors could give a good reason: even if they'd used proprietary libraries, I'd want to see the calls they made to those libraries. This is required by many journals in my field (genomics/bioinformatics).



    That said, if the simulations ran for 2 months on 10,000 cores (or even a week on 200 cores), there is not much chance of me reproducing them even with the code. And I almost certainly have neither the time, money, nor expertise to repeat the lab experiments in any paper I read.



    I don't think that providing code, although a good practice, is a protection against dishonesty. In the end there is very little protection against outright fraud, and the review process is not primarily there for that purpose.






























    • There might not be a chance of reproducing it now, but consider that the paper may still be of interest in a decade or two.

      – Trusly
      23 hours ago






    • 1





      The chance of any underlying infrastructure in two decades being compatible with code written now is fairly minimal, unless it's packaged as a virtual machine. I'd be skeptical that even a Docker image built now will run the same in 20 years' time. This is a real and ongoing problem. See software.ac.uk for an example of people working hard to try to solve this non-trivial issue.

      – Ian Sudbery
      20 hours ago



















    4














    I end up reviewing data science type papers where the underlying code is critical. I've started being that guy during reviews, and this is what I ask for:




    1. Code must be available to me as a reviewer. Full stop.

    2. The code needs to have tests that I as a reviewer can run. I can't go through the code to check and make sure it works exactly as advertised, but I can see if the tests are appropriate and that they pass when I run it.

    3. The code should have reasonably good test coverage around the scientific parts, and regression tests should be reasonably justified (i.e. We expect to see this result because of reason X).

    4. If you've re-invented a wheel you have to explain why (even if that explanation is that you didn't know someone else already made it)


    These all seem pretty reasonable to me and address a lot of the concerns about code quality. Most people with well designed code will hit them all just by virtue of having continuous integration tests on a GitHub repository, and I think it's no longer necessary to coddle people with poorly designed code (which is too many people).
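    To make points 2 and 3 concrete, here is a rough sketch of the kind of reviewer-runnable test I mean (the routine and its expected values are invented for illustration, not taken from any real submission); running `pytest` on the file is all a reviewer needs to do.

        # Hypothetical illustration only: simulate_decay and its expected behaviour
        # are invented here; they are not from any real submission.
        import numpy as np

        def simulate_decay(n0: float, rate: float, t: np.ndarray) -> np.ndarray:
            """Toy 'scientific' routine: exponential decay n(t) = n0 * exp(-rate * t)."""
            return n0 * np.exp(-rate * t)

        def test_initial_condition():
            # At t = 0 the routine must return the initial population exactly.
            assert simulate_decay(100.0, 0.5, np.array([0.0]))[0] == 100.0

        def test_half_life():
            # Regression test with a stated justification: after one half-life
            # (t = ln 2 / rate) we expect half the initial population, because that
            # is what the underlying model predicts analytically.
            rate = 0.5
            t_half = np.log(2) / rate
            assert np.isclose(simulate_decay(100.0, rate, np.array([t_half]))[0], 50.0)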

























    • 1





      I don't see how this answers the question. The question was, how can we confirm if the simulation results are correct without the code, and your answer is "I require the code".

      – D.W.
      yesterday






    • 1





      If you're a reviewer, "I require the code" is absolutely an answer that you can give. It's a reasonable request, even if you can't use that code to fully reproduce the simulation or analysis (which isn't your job; but making sure that the paper has a baseline level of plausibility is). It's no different from asking for better experimental controls from a biochemist.

      – CJ59
      yesterday











    • @CJ59 how are you defining "poorly designed code"?

      – Underminer
      yesterday











    • If you can write tests that cover most of the code base and that cover most inputs that are reasonable for your problem, your code is probably well-designed. If your tests are tortured and illogical, then your code is probably poorly designed. If you're living in the year 2019 and not bothering to write tests for the code that you're submitting along with your scientific paper, then you are a poorly designed person.

      – CJ59
      yesterday











    • Yes, that's something you can demand as a reviewer -- but it doesn't make it an answer to this question.

      – D.W.
      yesterday



















    4














    In order for simulation work to be precisely reproducible, it would be necessary to have (a) the code for the simulation and (b) the seed for the pseudorandom generator used to run the code. Unless the code is proprietary, there is no good reason for authors to withhold this information from reviewers, even if the code will not be published as part of the paper. However, publishable simulation studies may be so extensive that it is not feasible even for an energetic, skeptical reviewer to repeat the simulations.



    To a degree, many simulation studies can be self-verifying. When this is possible, reviewers should insist on feasible inherent verification. In non-technical language, here are a couple of examples of what I mean.



    (1) Often a simulation will produce several results, of which some are novel and some are easily obtained or generally known without resorting to simulation. Then at the very least a reviewer can confirm that the latter results are valid. Somewhat similarly, simulations may refine results that can only be approximated by probabilistic or other mathematical computations. Then the reviewer can confirm that the results are at least approximately correct.



    (2) Very frequently, an important part of a simulation study may be to obtain approximate bounds within which the simulated results are likely (perhaps 95% likely) to lie. If it seems feasible to have obtained such bounds and the paper under review lacks them, then the reviewer should ask for them or for an explanation of their absence.



    Addendum: This is a trivial example illustrating some concepts in (1) and (2). Suppose five dice are loaded so that faces 1 through 6 have respective probabilities (1/12, 1/6, 1/6, 1/6, 1/6, 1/4) of occurring. If all five are rolled, what is the probability that the total is at least 25? A simulation in R statistical software of a million such 5-die experiments shows that the fraction of outcomes with totals 25 or more was 0.092903. Is this result believable? The answer is Yes, to about three places.



    The simulated 95% margin of simulation error is "within 0.0006." It is easy to see that the average total is 19.583, and the corresponding simulated result is 19.580. A reasonable 2-place normal approximation is 0.0922. This particular example is rich in corroborative possibilities; these are just a few.



    Note: Another issue is that, using various kinds of mathematical software, this problem could be solved exactly by advanced combinatorial methods. One exact method is based on this page, except that our dice are biased and outcomes are not equally likely. It is questionable whether simulations should be published if there is a tractable exact solution. One job of a reviewer is to identify papers that should not be published for this reason.
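    For concreteness, here is a rough sketch of the same toy calculation in Python (the original simulation was done in R; the seed here is arbitrary). It combines the simulated estimate, the two self-checks described above, and a brute-force exact answer, which is feasible here because there are only 6^5 = 7776 outcomes and is simpler than, though equivalent in purpose to, the combinatorial method the note mentions.

        # Python sketch of the loaded-dice example; the probabilities come from the
        # answer above, the seed is an arbitrary choice made for this sketch.
        import itertools
        import numpy as np

        faces = np.arange(1, 7)
        probs = np.array([1/12, 1/6, 1/6, 1/6, 1/6, 1/4])   # P(face 1) ... P(face 6)
        rng = np.random.default_rng(20190101)

        # (Simulation) one million rolls of five loaded dice.
        n = 1_000_000
        totals = rng.choice(faces, size=(n, 5), p=probs).sum(axis=1)
        p_hat = np.mean(totals >= 25)

        # (Self-check 1) the exact mean total is easy to compute by hand.
        exact_mean = 5 * np.sum(faces * probs)               # = 19.583...

        # (Self-check 2) the 95% margin of simulation error for p_hat.
        margin = 1.96 * np.sqrt(p_hat * (1 - p_hat) / n)     # roughly 0.0006

        # (Exact check) brute force over all 6**5 = 7776 equally structured outcomes.
        p_exact = sum(
            np.prod([probs[f - 1] for f in outcome])
            for outcome in itertools.product(faces, repeat=5)
            if sum(outcome) >= 25
        )

        print(f"simulated P(total >= 25) = {p_hat:.4f} +/- {margin:.4f}")
        print(f"exact     P(total >= 25) = {p_exact:.4f}")
        print(f"simulated mean = {totals.mean():.3f}, exact mean = {exact_mean:.3f}")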







































      4














      It is an unfortunate artifact of history and culture, but academia is still in the dark ages with regard to sharing source code. Serious computational researchers will often provide at least a detailed algorithm, and often the code as well. But dabblers who had to learn how to code "on the job", or people very set in their ways and traditions, will often neglect to do so, and publication standards rarely require it, even though you would think that anyone in their right mind would agree that all scientific code should be open source (or at least shared with reviewers) if it is to be worth discussion.



      Unfortunately, the established culture does not expect sharing of code, even though in the analogous case of physical experiments there is an expectation of sharing the exact process, down to every detail of method and material, such that it may be exactly reproduced by other researchers. I would suspect that the reason for this is that, in the grand scheme of things, computers are a relatively recent tool of science, and the ability to easily share code is more recent still. That said, we've had ubiquitous internet and zero-effort code hosting like GitHub for over a decade now, so if you ask me, it's about damn time. But it looks like there is still quite a bit of inertia.




      I have been reviewing articles for various top-ranked journals and conferences for the last few years. After all this experience, I can tell you there is no way to confirm the correctness of simulation results. Therefore, I usually comment on the design, the procedure, and the mathematical and analytical parts.




      That's about the best you can do. You can also try to judge intuitively, based on the rough description (if any) of the computational approach, whether the results achieved are credible or not. But ultimately it is impossible to know for sure.



      I try to add a little nag at the end of my reviews about releasing the source code, although I don't think it gets taken seriously very often.




      In the results section, I can ask questions about why this or that is so, but how can I judge whether the simulation was really performed or whether these are just fabricated graphs?




      Well, the way you phrase it, you can't really know if any graph or result is fabricated, unless maybe you were personally present while the research was done. There is inevitably an element of trust. But without source code, even if you do trust, you cannot offer meaningful critique about some computational parts of the paper. Obviously you can still comment on initial assumptions and the approach chosen. You can comment on how the results are interpreted. But the implementation itself is out of reach until you can see the code. Actually, even providing a detailed algorithm would not be sufficient: The authors' implementation may not necessarily be an exact match for the algorithm they intended.




      This question came to mind because, on a few occasions during the review process, I have observed a reviewer ask for new results that, in my opinion, would require a lot of coding and effort to implement, yet the author responded within 7-10 days with the new results and an improved article.




      I don't think it's fair to be suspicious just because they did it a little too quickly. They may just be very good at coding. Personally, my development rate is very variable: sometimes things just click and I can write code really fast; sometimes simple things take forever. They may know of easier ways to implement the change than you are aware of. They may have already coded something similar in separate work and been able to repurpose it quickly.



      If someone were to falsify results, I think they would either respond right away because they don't care, or wait "long enough" to avoid suspicion. If they bothered to wait at all, I don't think they would jeopardize the whole enterprise by waiting too little.





































        3














        Usually, known examples are reproduced to gain confidence that the simulation is doing what it claims. Then further simulations show the new results, which might not be reproduced by another group because they did not use the method or approach described in the paper/talk.



        For brevity, the first part is often omitted in journal articles and conference talks; PhD theses include it more often.



        One of my first publications, which does not contain any breakthrough, is cited by others only to show that their results match mine.





































          -4














          I refer everyone to
          https://www.cs.auckland.ac.nz/compsci742s2c/resources/p50-kurkowski.pdf



          This is a well-known problem in computer science (CS). Consequently, most papers in CS cannot be trusted.

























          • 10





            Can you please summarise the paper in question, explain why it is relevant to this question, and give a robust reference (e.g., a DOI)? As it stands, this answer becomes completely useless once the link rots.

            – Wrzlprmft
            yesterday






          • 3





            "Most papers in CS" use simulation?

            – David Richerby
            yesterday











          • Your claim is a bold exaggeration whose apparent purpose is to create controversy rather than answer the question.

            – Dmitry Grigoryev
            16 hours ago











          • Looks more like a troll to me, particularly given the account name and history.

            – Lightness Races in Orbit
            13 hours ago










          protected by Alexandros 11 hours ago



          Thank you for your interest in this question.
          Because it has attracted low-quality or spam answers that had to be removed, posting an answer now requires 10 reputation on this site (the association bonus does not count).



          Would you like to answer one of these unanswered questions instead?














          10 Answers
          10






          active

          oldest

          votes








          10 Answers
          10






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          26














          Do you have reason to doubt their veracity or good faith? Are their claims somehow not believable or highly questionable based on your knowledge of the field? If it isn't standard in your field to release code I don't think a reviewer should necessarily demand it, regardless of your feelings about making code public.



          The authors should describe their methodology sufficiently for someone else to replicate it; in that way, they are putting their reputations at risk that were someone to duplicate their approach they would find the same results. Fabricating results is a very serious accusation. There are some statistical approaches to test whether data are likely to be fabricated, but the efficacy of this approach depends on the sophistication of the fabrication, and that question is better suited to CrossValidated.



          If their work is meaningful in the field, at some point someone will implement their approach again. There is necessarily a bit of trust in science that people do what they say they've done. A reviewer should check that a paper's conclusions follow logically from their results, and that their methodology is described and sound, but they need not verify or replicate the results. As David Thornley points out in a comment on another answer, even simply running code tells you very little about those key factors, it only tells you if there is a blatant lie or mistake.






          share|improve this answer





















          • 22





            Do you have reason to doubt their claims? Surely that's the job of any researcher?

            – user2768
            2 days ago






          • 20





            @user2768 Of course I mean beyond basic skepticism. You should doubt their approach as presented: if they say we tested X by doing Y and you know Y is not the correct way to test X, that's a different kind of doubt than them saying we tested X by doing Y and you wondering if they actually just made it up instead of ever doing Y. The standard is to provide enough information for someone else to replicate; if someone was making a truly remarkable claim, there is more reason to demand to see their code than if they are showing an incremental improvement.

            – Bryan Krause
            2 days ago








          • 9





            Sometimes, even if the code is made available and the reviewer is willing to check it for correctness, it may be impractical to rerun the simulation. In my field, it's not uncommon for papers to report millions of CPU hours worth of simulation results. Surely a reviewer cannot be expected to spend a significant fraction of their yearly CPU time quota every time they review a paper. So I agree that a reasonable degree of trust is necessary.

            – Miguel
            2 days ago








          • 4





            @Miguel (Going OT) With such CPU-intensive computations, is it possible to generate a proof that the computations were performed correctly? (Bryan Krause and Phil Miller, nice revision.)

            – user2768
            yesterday






          • 3





            @user2768 It depends. The best benchmark is experiment, but it may not be possible to do it. The worst case scenario is that the simulation is "correct" but the underlying physical model is not. If the problem is important/interesting enough someone will try to reproduce the results using their own implementation or physical model (as others have mentioned here already). Generally, one may test e.g. a new interatomic potential by reproducing known experimental properties (density, self-diffusion coefficient) and then use said potential to predict unknown properties (e.g. reaction barriers).

            – Miguel
            yesterday
















          26














          Do you have reason to doubt their veracity or good faith? Are their claims somehow not believable or highly questionable based on your knowledge of the field? If it isn't standard in your field to release code I don't think a reviewer should necessarily demand it, regardless of your feelings about making code public.



          The authors should describe their methodology sufficiently for someone else to replicate it; in that way, they are putting their reputations at risk that were someone to duplicate their approach they would find the same results. Fabricating results is a very serious accusation. There are some statistical approaches to test whether data are likely to be fabricated, but the efficacy of this approach depends on the sophistication of the fabrication, and that question is better suited to CrossValidated.



          If their work is meaningful in the field, at some point someone will implement their approach again. There is necessarily a bit of trust in science that people do what they say they've done. A reviewer should check that a paper's conclusions follow logically from their results, and that their methodology is described and sound, but they need not verify or replicate the results. As David Thornley points out in a comment on another answer, even simply running code tells you very little about those key factors, it only tells you if there is a blatant lie or mistake.






          share|improve this answer





















          • 22





            Do you have reason to doubt their claims? Surely that's the job of any researcher?

            – user2768
            2 days ago






          • 20





            @user2768 Of course I mean beyond basic skepticism. You should doubt their approach as presented: if they say we tested X by doing Y and you know Y is not the correct way to test X, that's a different kind of doubt than them saying we tested X by doing Y and you wondering if they actually just made it up instead of ever doing Y. The standard is to provide enough information for someone else to replicate; if someone was making a truly remarkable claim, there is more reason to demand to see their code than if they are showing an incremental improvement.

            – Bryan Krause
            2 days ago








          • 9





            Sometimes, even if the code is made available and the reviewer is willing to check it for correctness, it may be impractical to rerun the simulation. In my field, it's not uncommon for papers to report millions of CPU hours worth of simulation results. Surely a reviewer cannot be expected to spend a significant fraction of their yearly CPU time quota every time they review a paper. So I agree that a reasonable degree of trust is necessary.

            – Miguel
            2 days ago








          • 4





            @Miguel (Going OT) With such CPU-intensive computations, is it possible to generate a proof that the computations were performed correctly? (Bryan Krause and Phil Miller, nice revision.)

            – user2768
            yesterday






          • 3





            @user2768 It depends. The best benchmark is experiment, but it may not be possible to do it. The worst case scenario is that the simulation is "correct" but the underlying physical model is not. If the problem is important/interesting enough someone will try to reproduce the results using their own implementation or physical model (as others have mentioned here already). Generally, one may test e.g. a new interatomic potential by reproducing known experimental properties (density, self-diffusion coefficient) and then use said potential to predict unknown properties (e.g. reaction barriers).

            – Miguel
            yesterday














          26












          26








          26







          Do you have reason to doubt their veracity or good faith? Are their claims somehow not believable or highly questionable based on your knowledge of the field? If it isn't standard in your field to release code I don't think a reviewer should necessarily demand it, regardless of your feelings about making code public.



          The authors should describe their methodology sufficiently for someone else to replicate it; in that way, they are putting their reputations at risk that were someone to duplicate their approach they would find the same results. Fabricating results is a very serious accusation. There are some statistical approaches to test whether data are likely to be fabricated, but the efficacy of this approach depends on the sophistication of the fabrication, and that question is better suited to CrossValidated.



          If their work is meaningful in the field, at some point someone will implement their approach again. There is necessarily a bit of trust in science that people do what they say they've done. A reviewer should check that a paper's conclusions follow logically from their results, and that their methodology is described and sound, but they need not verify or replicate the results. As David Thornley points out in a comment on another answer, even simply running code tells you very little about those key factors, it only tells you if there is a blatant lie or mistake.






          share|improve this answer















          Do you have reason to doubt their veracity or good faith? Are their claims somehow not believable or highly questionable based on your knowledge of the field? If it isn't standard in your field to release code I don't think a reviewer should necessarily demand it, regardless of your feelings about making code public.



          The authors should describe their methodology sufficiently for someone else to replicate it; in that way, they are putting their reputations at risk that were someone to duplicate their approach they would find the same results. Fabricating results is a very serious accusation. There are some statistical approaches to test whether data are likely to be fabricated, but the efficacy of this approach depends on the sophistication of the fabrication, and that question is better suited to CrossValidated.



          If their work is meaningful in the field, at some point someone will implement their approach again. There is necessarily a bit of trust in science that people do what they say they've done. A reviewer should check that a paper's conclusions follow logically from their results, and that their methodology is described and sound, but they need not verify or replicate the results. As David Thornley points out in a comment on another answer, even simply running code tells you very little about those key factors, it only tells you if there is a blatant lie or mistake.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited 2 days ago

























          answered 2 days ago









          Bryan KrauseBryan Krause

          12.2k13759




          12.2k13759








          • 22





            Do you have reason to doubt their claims? Surely that's the job of any researcher?

            – user2768
            2 days ago






          • 20





            @user2768 Of course I mean beyond basic skepticism. You should doubt their approach as presented: if they say we tested X by doing Y and you know Y is not the correct way to test X, that's a different kind of doubt than them saying we tested X by doing Y and you wondering if they actually just made it up instead of ever doing Y. The standard is to provide enough information for someone else to replicate; if someone was making a truly remarkable claim, there is more reason to demand to see their code than if they are showing an incremental improvement.

            – Bryan Krause
            2 days ago








          • 9





            Sometimes, even if the code is made available and the reviewer is willing to check it for correctness, it may be impractical to rerun the simulation. In my field, it's not uncommon for papers to report millions of CPU hours worth of simulation results. Surely a reviewer cannot be expected to spend a significant fraction of their yearly CPU time quota every time they review a paper. So I agree that a reasonable degree of trust is necessary.

            – Miguel
            2 days ago








          • 4





            @Miguel (Going OT) With such CPU-intensive computations, is it possible to generate a proof that the computations were performed correctly? (Bryan Krause and Phil Miller, nice revision.)

            – user2768
            yesterday






          • 3





            @user2768 It depends. The best benchmark is experiment, but it may not be possible to do it. The worst case scenario is that the simulation is "correct" but the underlying physical model is not. If the problem is important/interesting enough someone will try to reproduce the results using their own implementation or physical model (as others have mentioned here already). Generally, one may test e.g. a new interatomic potential by reproducing known experimental properties (density, self-diffusion coefficient) and then use said potential to predict unknown properties (e.g. reaction barriers).

            – Miguel
            yesterday














          • 22





            Do you have reason to doubt their claims? Surely that's the job of any researcher?

            – user2768
            2 days ago






          • 20





            @user2768 Of course I mean beyond basic skepticism. You should doubt their approach as presented: if they say we tested X by doing Y and you know Y is not the correct way to test X, that's a different kind of doubt than them saying we tested X by doing Y and you wondering if they actually just made it up instead of ever doing Y. The standard is to provide enough information for someone else to replicate; if someone was making a truly remarkable claim, there is more reason to demand to see their code than if they are showing an incremental improvement.

            – Bryan Krause
            2 days ago








          • 9





            Sometimes, even if the code is made available and the reviewer is willing to check it for correctness, it may be impractical to rerun the simulation. In my field, it's not uncommon for papers to report millions of CPU hours worth of simulation results. Surely a reviewer cannot be expected to spend a significant fraction of their yearly CPU time quota every time they review a paper. So I agree that a reasonable degree of trust is necessary.

            – Miguel
            2 days ago








          • 4





            @Miguel (Going OT) With such CPU-intensive computations, is it possible to generate a proof that the computations were performed correctly? (Bryan Krause and Phil Miller, nice revision.)

            – user2768
            yesterday






          • 3





            @user2768 It depends. The best benchmark is experiment, but it may not be possible to do it. The worst case scenario is that the simulation is "correct" but the underlying physical model is not. If the problem is important/interesting enough someone will try to reproduce the results using their own implementation or physical model (as others have mentioned here already). Generally, one may test e.g. a new interatomic potential by reproducing known experimental properties (density, self-diffusion coefficient) and then use said potential to predict unknown properties (e.g. reaction barriers).

            – Miguel
            yesterday








          22




          22





          Do you have reason to doubt their claims? Surely that's the job of any researcher?

          – user2768
          2 days ago





          Do you have reason to doubt their claims? Surely that's the job of any researcher?

          – user2768
          2 days ago




          20




          20





          @user2768 Of course I mean beyond basic skepticism. You should doubt their approach as presented: if they say we tested X by doing Y and you know Y is not the correct way to test X, that's a different kind of doubt than them saying we tested X by doing Y and you wondering if they actually just made it up instead of ever doing Y. The standard is to provide enough information for someone else to replicate; if someone was making a truly remarkable claim, there is more reason to demand to see their code than if they are showing an incremental improvement.

          – Bryan Krause
          2 days ago







          @user2768 Of course I mean beyond basic skepticism. You should doubt their approach as presented: if they say we tested X by doing Y and you know Y is not the correct way to test X, that's a different kind of doubt than them saying we tested X by doing Y and you wondering if they actually just made it up instead of ever doing Y. The standard is to provide enough information for someone else to replicate; if someone was making a truly remarkable claim, there is more reason to demand to see their code than if they are showing an incremental improvement.

          – Bryan Krause
          2 days ago






          9




          9





          Sometimes, even if the code is made available and the reviewer is willing to check it for correctness, it may be impractical to rerun the simulation. In my field, it's not uncommon for papers to report millions of CPU hours worth of simulation results. Surely a reviewer cannot be expected to spend a significant fraction of their yearly CPU time quota every time they review a paper. So I agree that a reasonable degree of trust is necessary.

          – Miguel
          2 days ago







          Sometimes, even if the code is made available and the reviewer is willing to check it for correctness, it may be impractical to rerun the simulation. In my field, it's not uncommon for papers to report millions of CPU hours worth of simulation results. Surely a reviewer cannot be expected to spend a significant fraction of their yearly CPU time quota every time they review a paper. So I agree that a reasonable degree of trust is necessary.

          – Miguel
          2 days ago






          4




          4





          @Miguel (Going OT) With such CPU-intensive computations, is it possible to generate a proof that the computations were performed correctly? (Bryan Krause and Phil Miller, nice revision.)

          – user2768
          yesterday





          @Miguel (Going OT) With such CPU-intensive computations, is it possible to generate a proof that the computations were performed correctly? (Bryan Krause and Phil Miller, nice revision.)

          – user2768
          yesterday




          3




          3





          @user2768 It depends. The best benchmark is experiment, but it may not be possible to do it. The worst case scenario is that the simulation is "correct" but the underlying physical model is not. If the problem is important/interesting enough someone will try to reproduce the results using their own implementation or physical model (as others have mentioned here already). Generally, one may test e.g. a new interatomic potential by reproducing known experimental properties (density, self-diffusion coefficient) and then use said potential to predict unknown properties (e.g. reaction barriers).

          – Miguel
          yesterday





          20
          You can't really judge if the simulation was really performed. That's why we've had things such as the Schön scandal - the reviewers of those manuscripts didn't detect the fraud either.



          What you can do is apply the "smell test". Is this approach feasible? Are the results reasonable? Were there any glaring omissions? If you can't see any obvious problems with the simulation, that's good enough: the real peer review happens after publication.






          – Allure
          answered 2 days ago
                  14















                  After all these experiences, I can tell you there is no way to confirm the correctness of simulation results.




                  That is not necessarily true. In some cases, it is very easy to discern that a graph cannot possibly be correct or at the least has been badly misconstrued or misinterpreted. I had such a mistake caught in one of my early papers and have caught them in several papers I have reviewed.



                  It is not easy to prove that the simulations have actually been performed. However, the Open Science framework is designed to make it easier to verify results of both computational and experimental work.






                  – aeismail
                  answered 2 days ago (edited 2 days ago)
                  • 4





                    Your experience is about confirming incorrectness. It is still hard to confirm correctness.

                    – norio
                    yesterday






                  • 3





                    This looks more like a comment on one sentence of the question, rather than an answer to the main question.

                    – D.W.
                    yesterday



























                  7















                  there is no way to confirm the correctness of simulation results.




                  Simulations should be repeatable, hence, correctness can be checked by re-running the simulation. Of course, the authors might not provide the necessary code, but then you can request the code as a part of the review process.
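
                  As a minimal illustration of what "repeatable" means here (a sketch in Python with NumPy; the toy simulation and its parameters are hypothetical, not taken from any paper), providing the code together with the random seed lets a reviewer rerun the simulation and obtain exactly the same numbers:

                      import numpy as np

                      def run_simulation(n_trials, seed):
                          # Hypothetical stand-in for the paper's simulation: a Monte Carlo
                          # estimate of the mean of an exponential distribution.
                          rng = np.random.default_rng(seed)   # fixed seed => repeatable stream
                          samples = rng.exponential(scale=2.0, size=n_trials)
                          return samples.mean()

                      # Authors report the seed alongside the result; a reviewer rerunning this
                      # gets the identical estimate, which is the sense of "repeatable" above.
                      print(run_simulation(n_trials=1_000_000, seed=20190101))
                      print(run_simulation(n_trials=1_000_000, seed=20190101))  # same output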






                  – user2768
                  answered 2 days ago
                  • 4





                    @MBK I don't recommend implementing; I recommend repeating. If the authors won't let you repeat (by denying access to code), then I'd be inclined to reject, but I'd consult with the editor.

                    – user2768
                    2 days ago








                  • 10





                    @user2768 So, if I were to run some simulations using software I didn't have a license to redistribute, I shouldn't be able to publish my results?

                    – David Thornley
                    2 days ago






                  • 14





                    @MBK Just running the same code on the same data tells you very little without actually examining the code to make sure it implements the algorithm(s) of the paper. It tells you that the author(s) didn't outright lie about the results, and that's all.

                    – David Thornley
                    2 days ago






                  • 7





                    @SylvainRibault In many areas of research it isn't possible, for various reasons, to share all the raw data involved. Should none of that research be published either? Should we save and distribute blood samples to anyone that wants to verify the results of a study of inflammatory biomarkers? What if the process of analysis destroys the sample? Trust is an integral part of academic research.

                    – Bryan Krause
                    2 days ago






                  • 4





                    @SylvainRibault Understood, but although you can argue in favor of open-source software, there can be good reasons not to distribute code freely; I think it's short-sighted to make that a hard "no" for publishing, and I was pointing out that that level of replication is impossible in other fields that have no problem publishing.

                    – Bryan Krause
                    2 days ago



























                  4














                  If I were reviewing a paper which relied to a heavy degree on some computational analysis, but didn't supply the code, I would reject that paper unless the authors could give a good reason: even if they'd used proprietary libraries, I'd want to see the calls they made to those libraries. This is required by many journals in my field (genomics/bioinformatics).



                  That said, if the simulations ran for 2 months on 10,000 cores (or even a week on 200 cores), there is not much chance of me reproducing them even with the code. And I almost certainly have neither the time, money nor expertise to repeat lab experiments in any paper I read.



                  I don't think that providing code, although a good practice, is a protection against dishonesty. In the end there is very little protection against outright fraud, and the review process is not primarily there for that purpose.






                  – Ian Sudbery
                  answered yesterday
                  • There might not be a chance of reproducing it now, but consider that the paper may still be of interest in a decade or two.

                    – Trusly
                    23 hours ago






                  • 1





                    The chance of any underlying infrastructure in two decades being compatible with code written now is fairly minimal, unless it's packaged as a virtual machine. I'd be skeptical that even a docker image built now will run the same in 20 years' time. This is a real and ongoing problem. See software.ac.uk for an example of people working hard to try to solve this non-trivial issue.

                    – Ian Sudbery
                    20 hours ago



























                  4














                  I end up reviewing data science type papers where the underlying code is critical. I've started being that guy during reviews, and this is what I ask for:




                  1. Code must be available to me as a reviewer. Full stop.

                  2. The code needs to have tests that I as a reviewer can run. I can't go through the code to check and make sure it works exactly as advertised, but I can see if the tests are appropriate and that they pass when I run it.

                  3. The code should have reasonably good test coverage around the scientific parts, and regression tests should be reasonably justified (i.e., "we expect to see this result because of reason X").

                  4. If you've re-invented a wheel you have to explain why (even if that explanation is that you didn't know someone else already made it).


                  These all seem pretty reasonable to me and address a lot of the concerns about code quality. Most people with well-designed code will hit them all just by virtue of having continuous integration tests on a GitHub repository, and I think it's no longer necessary to coddle people with poorly designed code (which is too many people).
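
                  For concreteness, here is a rough sketch of the kind of test points 2 and 3 ask for (Python with pytest assumed; the toy random-walk simulator and its numbers are hypothetical, not from any particular paper). The regression expectation is justified by theory ("reason X") rather than by whatever the code happened to produce last time:

                      # test_msd.py -- run with pytest. For a 1-D unbiased random walk the mean
                      # squared displacement grows as MSD(t) = 2*D*t, so the fitted slope should
                      # recover the diffusion coefficient we put in.
                      import numpy as np

                      def simulate_msd(n_walkers, n_steps, diffusion_coeff, seed=0):
                          """Toy stand-in for the simulation code under review (hypothetical)."""
                          rng = np.random.default_rng(seed)
                          steps = rng.normal(0.0, np.sqrt(2.0 * diffusion_coeff),
                                             size=(n_walkers, n_steps))
                          positions = np.cumsum(steps, axis=1)
                          return (positions ** 2).mean(axis=0)   # MSD at t = 1..n_steps

                      def test_msd_slope_recovers_diffusion_coefficient():
                          d_true = 0.5
                          msd = simulate_msd(n_walkers=20_000, n_steps=200, diffusion_coeff=d_true)
                          t = np.arange(1, msd.size + 1)
                          slope = np.polyfit(t, msd, 1)[0]
                          # "We expect to see this result because of reason X": slope = 2*D.
                          assert abs(slope - 2.0 * d_true) < 0.05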






                  – CJ59
                  answered yesterday
                  • 1





                    I don't see how this answers the question. The question was, how can we confirm if the simulation results are correct without the code, and your answer is "I require the code".

                    – D.W.
                    yesterday






                  • 1





                    If you're a reviewer, "I require the code" is absolutely an answer that you can give. It's a reasonable request, even if you can't use that code to fully reproduce the simulation or analysis (which isn't your job; but making sure that the paper has a baseline level of plausibility is). It's no different from asking for better experimental controls from a biochemist.

                    – CJ59
                    yesterday











                  • @CJ59 how are you defining "poorly designed code"?

                    – Underminer
                    yesterday











                  • If you can write tests that cover most of the code base and that cover most inputs that are reasonable for your problem, your code is probably well-designed. If your tests are tortured and illogical, then your code is probably poorly designed. If you're living in the year 2019 and not bothering to write tests for the code that you're submitting along with your scientific paper, then you are a poorly designed person.

                    – CJ59
                    yesterday











                  • Yes, that's something you can demand as a reviewer -- but it doesn't make it an answer to this question.

                    – D.W.
                    yesterday



























                  4














                  In order for simulation work to be precisely reproducible, it would be necessary to have (a) the code for the simulation and (b) the seed for the pseudorandom number generator used to run the code. Unless the code is proprietary, there is no good reason for authors to withhold this information from reviewers, even if the code will not be published as part of the paper. However, publishable simulation studies may be so extensive that it is not feasible even for an energetic skeptical reviewer to repeat the simulations.



                  To a degree many simulation studies can be self-verifying. When this is possible reviewers should insist on feasible inherent verification. In non-technical language here are a couple of examples of what I mean.



                  (1) Often a simulation will produce several results of which some are novel and some are easily obtained or generally known without resorting to simulation. Then at the very least a reviewer can confirm that the latter results are valid. Somewhat similarly, simulations may refine results that can only be approximated by probabilistic or other mathematical computations. Then the reviewer can confirm that the results are at least approximately correct.



                  (2) Very frequently, an important part of a simulation study may be to obtain approximate bounds within which the simulated results are likely (perhaps 95% likely) to lie. If it seems feasible to have obtained such bounds and the paper under review lacks them, then the reviewer should ask for them or for an explanation of their absence.



                  Addendum: This is a trivial example illustrating some concepts in (1) and (2). Suppose five dice are loaded so that faces 1 through 6 have respective probabilities (1/12, 1/6, 1/6, 1/6, 1/6, 1/4) of occurring. If all five are rolled, what is the probability that the total is at least 25? A simulation in R statistical software of a million such 5-die
                  experiments shows that the fraction of outcomes with totals 25 or more was 0.092903. Is this result believable? The answer is Yes, to about three places.



                  The simulated 95% margin of simulation error is "within 0.0006." It is easy to see that the average total is 19.583 and the corresponding simulated result is 19.580. A reasonable 2-place normal approximation is 0.0922.
                  This particular example is rich in corroborative
                  possibilities, but those are a few.



                  Note: Another issue is that, using various kinds of mathematical software, this problem could be solved exactly by advanced combinatorial methods. One exact method is based on this page, except that our dice are biased and outcomes are not equally likely. It is questionable whether simulations should be published if there is a tractable exact solution. One job of a reviewer is to identify papers that should not be published for this reason.
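
                  To make the addendum concrete, here is a sketch of that check in Python with NumPy (the answer describes running it in R; the seed below is arbitrary). Rerunning a million 5-die experiments reproduces, within the quoted margin of simulation error, the tail probability near 0.093, the mean total of about 19.58, and the "within 0.0006" margin:

                      import numpy as np

                      # Loaded die: faces 1..6 with probabilities (1/12, 1/6, 1/6, 1/6, 1/6, 1/4).
                      faces = np.arange(1, 7)
                      probs = np.array([1/12, 1/6, 1/6, 1/6, 1/6, 1/4])

                      rng = np.random.default_rng(2019)        # arbitrary fixed seed
                      n_experiments = 1_000_000
                      totals = rng.choice(faces, size=(n_experiments, 5), p=probs).sum(axis=1)

                      p_hat = (totals >= 25).mean()
                      margin = 1.96 * np.sqrt(p_hat * (1 - p_hat) / n_experiments)
                      print("P(total >= 25) ~", p_hat)           # near 0.093
                      print("95% margin     ~", margin)          # close to the quoted 0.0006
                      print("mean total     ~", totals.mean())   # exact value is 19.583...
                      print("exact mean      ", 5 * (faces * probs).sum())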






                  share|improve this answer






























                    4














                    In order for simulation work to be precisely reproducible, it would be necessary to have (a) the code for the simulation and (b) the seed for the pseudoramdon generator used to run the code. Unless the code is proprietary, there is no good reason for authors to withhold this information from reviewers, even if the code will not be published as part of the paper. However, publishable simulation studies may be so extensive that it is not feasible even for an energetic skeptical reviewer to repeat the simulations.



                    To a degree many simulation studies can be self-verifying. When this is possible reviewers should insist on feasible inherent verification. In non-technical language here are a couple of examples of what I mean.



                    (1) Often a simulation will produce several results of which some are novel and some are easily obtained or generally known without resorting to simulation. Then at the very least a reviewer can can confirm that the latter results are valid. Somewhat similarly, simulations may refine results that can only be approximated by probabilistic or other mathematical computations. Then the reviewer can confirm that the results
                    are at least approximately correct.



                    (2) Very frequently, an important part of a simulation study may be to obtain approximate bounds within which the simulated results are likely (perhaps 95% likely) to lie. If it seems feasible to have obtained such bounds and the paper under review lacks them, then the reviewer should ask for them or for an explanation of their absence.



                    Addendum: This is a trivial example illustrating some concepts in (1) and (2). Suppose five dice are loaded so that faces 1 through 6 have respective probabilities (1/12, 1/6, 1/6, 1/6, 1/6, 1/4) of occurring. If all five are rolled, what is the probability that the total is at least 25? A simulation in R statistical software of a million such 5-die
                    experiments shows that the fraction of outcomes with totals 25 or more was 0.092903. Is this result believable? The answer is Yes, to about three places.



                    The simulated 95% margin of simulation error is "within 0.0006." It is easy to see that the average total is 19.583 and the corresponding simulated result is 19.580. A reasonable 2-place normal approximation is 0.0922.
                    This particular example is rich in corroborative
                    possibilities, but those are a few.



                    Note: Another issue is that, using various kinds of mathematical software, this problem could be solved exactly by advanced combinatorial methods. One exact method is based on this page, except that our dice are biased and outcomes are not equally likely. It is questionable whether simulations should be published if there is a tractable exact solution. One job of a reviewer is to identify papers that should not be published for this reason.






                    share|improve this answer




























                      4












                      4








                      4







                      In order for simulation work to be precisely reproducible, it would be necessary to have (a) the code for the simulation and (b) the seed for the pseudoramdon generator used to run the code. Unless the code is proprietary, there is no good reason for authors to withhold this information from reviewers, even if the code will not be published as part of the paper. However, publishable simulation studies may be so extensive that it is not feasible even for an energetic skeptical reviewer to repeat the simulations.



                      To a degree many simulation studies can be self-verifying. When this is possible reviewers should insist on feasible inherent verification. In non-technical language here are a couple of examples of what I mean.



                      (1) Often a simulation will produce several results of which some are novel and some are easily obtained or generally known without resorting to simulation. Then at the very least a reviewer can can confirm that the latter results are valid. Somewhat similarly, simulations may refine results that can only be approximated by probabilistic or other mathematical computations. Then the reviewer can confirm that the results
                      are at least approximately correct.



                      (2) Very frequently, an important part of a simulation study may be to obtain approximate bounds within which the simulated results are likely (perhaps 95% likely) to lie. If it seems feasible to have obtained such bounds and the paper under review lacks them, then the reviewer should ask for them or for an explanation of their absence.



                      Addendum: This is a trivial example illustrating some concepts in (1) and (2). Suppose five dice are loaded so that faces 1 through 6 have respective probabilities (1/12, 1/6, 1/6, 1/6, 1/6, 1/4) of occurring. If all five are rolled, what is the probability that the total is at least 25? A simulation in R statistical software of a million such 5-die
                      experiments shows that the fraction of outcomes with totals 25 or more was 0.092903. Is this result believable? The answer is Yes, to about three places.



                      The simulated 95% margin of simulation error is "within 0.0006." It is easy to see that the average total is 19.583 and the corresponding simulated result is 19.580. A reasonable 2-place normal approximation is 0.0922.
                      This particular example is rich in corroborative
                      possibilities, but those are a few.



                      Note: Another issue is that, using various kinds of mathematical software, this problem could be solved exactly by advanced combinatorial methods. One exact method is based on this page, except that our dice are biased and outcomes are not equally likely. It is questionable whether simulations should be published if there is a tractable exact solution. One job of a reviewer is to identify papers that should not be published for this reason.






                      share|improve this answer















                      In order for simulation work to be precisely reproducible, it would be necessary to have (a) the code for the simulation and (b) the seed for the pseudoramdon generator used to run the code. Unless the code is proprietary, there is no good reason for authors to withhold this information from reviewers, even if the code will not be published as part of the paper. However, publishable simulation studies may be so extensive that it is not feasible even for an energetic skeptical reviewer to repeat the simulations.



                      To a degree many simulation studies can be self-verifying. When this is possible reviewers should insist on feasible inherent verification. In non-technical language here are a couple of examples of what I mean.



(1) Often a simulation will produce several results, of which some are novel and some are easily obtained or generally known without resorting to simulation. Then at the very least a reviewer can confirm that the latter results are valid. Somewhat similarly, simulations may refine results that can only be approximated by probabilistic or other mathematical computations; then the reviewer can confirm that the results are at least approximately correct.



                      (2) Very frequently, an important part of a simulation study may be to obtain approximate bounds within which the simulated results are likely (perhaps 95% likely) to lie. If it seems feasible to have obtained such bounds and the paper under review lacks them, then the reviewer should ask for them or for an explanation of their absence.
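For instance, when a headline result is a simulated proportion, a reviewer can check whether a standard large-sample margin of roughly the following kind has been reported. This is only a sketch; the values of p_hat and m below are placeholders, not taken from any particular paper:

    # Approximate 95% margin of simulation error for a simulated proportion.
    p_hat  <- 0.093    # placeholder: proportion estimated by the simulation
    m      <- 10^6     # placeholder: number of independent simulation replications
    margin <- 1.96 * sqrt(p_hat * (1 - p_hat) / m)
    c(p_hat - margin, p_hat + margin)   # approximate 95% interval for the true value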



Addendum: This is a trivial example illustrating some concepts in (1) and (2). Suppose five dice are loaded so that faces 1 through 6 have respective probabilities (1/12, 1/6, 1/6, 1/6, 1/6, 1/4) of occurring. If all five are rolled, what is the probability that the total is at least 25? A simulation in R statistical software of a million such 5-die experiments shows that the fraction of outcomes with totals 25 or more was 0.092903. Is this result believable? The answer is Yes, to about three places.



The simulated 95% margin of simulation error is "within 0.0006." It is easy to see that the exact average total is 19.583, and the corresponding simulated value is 19.580. A reasonable two-place normal approximation to the probability is 0.0922. This particular example is rich in corroborative possibilities; those are just a few.
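For concreteness, here is a minimal sketch of how such a check could be run in R. This is not the original code; the seed is arbitrary, and the vectorized layout is just one convenient way to draw a million 5-die totals:

    # Hedged sketch of the loaded-dice simulation described above (not the original code).
    set.seed(2019)                                # arbitrary seed, recorded for reproducibility
    probs  <- c(1/12, 1/6, 1/6, 1/6, 1/6, 1/4)    # probabilities of faces 1 through 6
    m      <- 10^6                                # one million 5-die experiments
    rolls  <- matrix(sample(1:6, 5 * m, replace = TRUE, prob = probs), ncol = 5)
    totals <- rowSums(rolls)

    mean(totals >= 25)                  # estimated P(total >= 25); should land near 0.093
    mean(totals)                        # estimated mean total; the exact mean is 19.583
    1.96 * sd(totals >= 25) / sqrt(m)   # approximate 95% margin of simulation error, about 0.0006

    # Two-place normal approximation to P(total >= 25), with continuity correction.
    ex <- sum((1:6) * probs)            # exact mean of one die
    vx <- sum((1:6)^2 * probs) - ex^2   # exact variance of one die
    1 - pnorm(24.5, mean = 5 * ex, sd = sqrt(5 * vx))   # about 0.092

If numbers like these disagreed noticeably with the easily checked values (the exact mean, the normal approximation), that would be a strong hint that something in the simulation is wrong.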



                      Note: Another issue is that, using various kinds of mathematical software, this problem could be solved exactly by advanced combinatorial methods. One exact method is based on this page, except that our dice are biased and outcomes are not equally likely. It is questionable whether simulations should be published if there is a tractable exact solution. One job of a reviewer is to identify papers that should not be published for this reason.
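As a sketch of the kind of exact check alluded to here: this particular problem is small enough that brute-force enumeration of all 6^5 = 7776 outcomes works, without any advanced combinatorial machinery (this is an illustrative substitute for the method the answer refers to):

    # Exact P(total >= 25) by enumerating all 7776 outcomes of five loaded dice.
    probs <- c(1/12, 1/6, 1/6, 1/6, 1/6, 1/4)            # probabilities of faces 1 through 6
    grid  <- as.matrix(expand.grid(rep(list(1:6), 5)))   # every possible 5-die outcome
    pr    <- apply(grid, 1, function(faces) prod(probs[faces]))   # probability of each outcome
    sum(pr[rowSums(grid) >= 25])   # exact probability; the simulated 0.0929 should agree within its margin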







answered yesterday (edited yesterday) – BruceET























                          4














It is an unfortunate artifact of history and culture, but academia is still in the dark ages with regard to sharing source code. Serious computational researchers will often provide at least a detailed algorithm, and often the code as well. But dabblers who had to learn how to code "on the job", or people very set in their ways and traditions, will often neglect to do so, and publication standards rarely require it, even though you would think that anyone in their right mind would agree that all scientific code should be open source (or at least shared with reviewers) if it is to be worth discussing.



Unfortunately, the established culture does not expect sharing of code, even though in the analogous case of physical experiments there is an expectation of sharing the exact process, down to every detail of method and material, so that it may be exactly reproduced by other researchers. I suspect the reason is that, in the grand scheme of things, computers are a relatively recent tool of science, and the ability to easily share code is more recent still. That said, we've had ubiquitous internet and zero-effort code hosting like GitHub for over a decade now, so if you ask me, it's about damn time. But it looks like there is still quite a bit of inertia.




                          I have been reviewing the articles for various top-rank journals and conferences for the last few years. After all these experiences, I can tell you there is no way to confirm the correctness of simulation results. Therefore, I usually make comments on design, procedure, mathematical and analytical analysis.




That's about the best you can do. You can also try to judge, from the rough description (if any) of the computational approach, whether the results achieved are credible. But ultimately it is impossible to know for sure.



                          I try to add a little nag at the end of my reviews about releasing the source code, although I don't think it gets taken seriously very often.




                          In the results section, I can ask some questions about why this or that is so, but how can I judge if the simulation was really performed or these are just fabricated graphs?




                          Well, the way you phrase it, you can't really know if any graph or result is fabricated, unless maybe you were personally present while the research was done. There is inevitably an element of trust. But without source code, even if you do trust, you cannot offer meaningful critique about some computational parts of the paper. Obviously you can still comment on initial assumptions and the approach chosen. You can comment on how the results are interpreted. But the implementation itself is out of reach until you can see the code. Actually, even providing a detailed algorithm would not be sufficient: The authors' implementation may not necessarily be an exact match for the algorithm they intended.




                          This question came in my mind because I observed on few occasions, during review process, a reviewer ask for including new results, which in my opinion required a lot of coding and effort to implement, but author responded within 7-10 days with new results and improved article.




I don't think it's fair to be suspicious just because they did it a little too quickly. They may just be very good at coding. Personally, my development rate is very variable: sometimes things just click and I can write code really fast, sometimes simple things take forever. They may know of easier ways to implement the change than you are aware of. They may have already coded something similar in separate work and been able to repurpose it quickly.



                          If someone were to falsify results, I think they would either respond right away because they don't care, or wait "long enough" to avoid suspicion. If they bothered to wait at all, I don't think they would jeopardize the whole enterprise by waiting too little.






answered 23 hours ago – Trusly























                                  3














Usually, known examples are reproduced to gain confidence that the simulation is doing what it claims. Further simulations then show the new results, which might not be reproduced by another group if they did not use the method or approach described in the paper/talk.
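As a sketch of this practice (in R, matching the dice example above, and with a deliberately trivial known case): before trusting new output from simulation code, one can first make it reproduce a result whose exact value is known.

    # Reproduce a known result before trusting the novel ones.
    set.seed(1)                                       # arbitrary seed
    fair_rolls <- sample(1:6, 10^6, replace = TRUE)   # known case: a fair die has mean exactly 3.5
    stopifnot(abs(mean(fair_rolls) - 3.5) < 0.01)     # passes only if the code recovers the known value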



For brevity, this first part is often omitted in journal articles and conference talks; PhD theses contain it more often.



One of my first publications does not contain any breakthrough; it is cited by others only to show that their results match mine.






answered yesterday – usr1234567























                                          -4














                                          I refer everyone to
                                          https://www.cs.auckland.ac.nz/compsci742s2c/resources/p50-kurkowski.pdf



This is a well-known problem in computer science (CS). Consequently, most papers in CS cannot be trusted.






answered yesterday – Prof. Santa Claus
                                          • 10





                                            Can you please summarise the paper in question, explain why it is relevant to this question, and give robust reference (e.g., a DOI). As it stands, this answer becomes completely useless once the link rots.

                                            – Wrzlprmft
                                            yesterday






                                          • 3





                                            "Most papers in CS" use simulation?

                                            – David Richerby
                                            yesterday











                                          • Your claim is a bold exaggeration whose apparent purpose is to create controversy rather than answer the question.

                                            – Dmitry Grigoryev
                                            16 hours ago











                                          • Looks more like a troll to me, particularly given the account name and history.

                                            – Lightness Races in Orbit
                                            13 hours ago















