Abstract

Reproducibility in the computational sciences is capturing widespread attention. Movements to address the reliability of published computational results are arising in fields as disparate as geophysics, political science, fluid dynamics, computational harmonic analysis, fMRI research, and bioinformatics. Open data and code in climate modeling have taken on new priority since ClimateGate in 2009 (see www.nature.com/news/2010/101013/full/467753a.html), and Amstat News has recounted efforts to ensure reproducibility in genomics research in the wake of the termination of clinical trials at Duke University in December 2010 (see http://magazine.amstat.org/blog/2011/01/01/scipolicyjan11). These efforts are essential to addressing the “credibility crisis” in science. It is impossible to believe most of the computational results presented at conferences and in published papers today. Even mature branches of science, despite all their efforts, suffer severely from errors in final published conclusions. Traditional scientific publication is incapable of finding and rooting out errors in scientific computation, and standards of verifiability must be developed. A sampling of these efforts gives a sense of the scale at which the community is addressing the issue. The Institute of Medicine of the National Academies is undertaking a consensus study titled “Review of Omics-Based Tests for Predicting Patient Outcomes in Clinical Trials” (see www.iom.edu/Activities/Research/OmicsBasedTests.aspx), and sessions on reproducibility were held at SIAM Geosciences 2011, this year’s AAAS annual meeting, and SIAM Computing in Science and Engineering 2011. A three-day workshop will also be held at Applied Mathematics Perspectives this month. In 2009, stakeholders from biology, computational chemistry, geophysics, law, astronomy, and other fields collectively drafted a declaration on data and code sharing in the computational sciences (www.computer.org/portal/web/csdl/doi/10.1109/MCSE.2010.113 and www.stanford.edu/~vcs/Conferences/RoundtableNov212009). Since January of this year, the National Science Foundation has required a peer-reviewed data management plan with every grant application. Open access to data and software is relevant to advancing trustworthy science and is discussed in the 2010 reauthorization of the America COMPETES Act (see http://blog.stodden.net/2011/05/27/regulatory-steps-toward-open-science-and-reproducibility-we-need-a-science-cloud). As we embrace and tackle this issue across the computational sciences, the same concepts are inevitably labeled with different terms, and different concepts are emphasized. I will touch on the semantic and substantive differences among the various approaches to reliability in the computational and data-enabled sciences.
