A Realistic Guide to Making Data Available Alongside Code to Improve Reproducibility

Nicholas J. Tierney and Karthik Ram
Data makes science possible. Sharing data improves visibility, and makes the research process transparent. This increases trust in the work, and allows for independent reproduction of results. However, a large proportion of data from published research is often only available to the original authors. Despite the obvious benefits of sharing data, and scientists' advocating for the importance of sharing data, most advice on sharing data discusses its broader benefits, rather than the practical… 


DataverseNO: A National, Generic Repository and its Contribution to the Increased FAIRness of Data from the Long Tail of Research
The organization and operation of DataverseNO is presented, and how the repository contributes to the increased FAIRness of small and medium sized research data is investigated.
Best Coding Practices to Ensure Reproducibility
S and automates functionality in a way so that each piece does one thing only and does it well and leaves a trace of your thinking process if some decisions are not obvious from the code itself.
Comment on: ‘Moving Sport and Exercise Science Forward: A Call for the Adoption of More Transparent Research Practices’
Another significant component of open science—the release of analytical materials (i.e., data and code used for analysis) alongside a manuscript—is highlighted with the hope to further improve transparency and reproducibility in sport and exercise science research.
Accessibility of Tables in PDF Documents: Issues, Challenges, and Future Directions
This review paper reports on the current state of the accessibility of PDF documents, digital libraries, assistive technologies, tools, and frameworks that make PDF tables comprehensible and accessible to blind and visually impaired people.
Analytical and Computational Advances, Opportunities, and Challenges in Marine Organic Biogeochemistry in an Era of “Omics”
Advances in sampling tools, analytical methods, and data handling capabilities have been fundamental to the growth of marine organic biogeochemistry over the past four decades. There has always been


How to Share Data for Collaboration
The need to provide raw data to the statistician, the importance of consistent formatting, and the necessity of including all essential experimental information and pre-processing steps carried out are highlighted.
Git can facilitate greater reproducibility and increased transparency in science
Karthik Ram. Source Code for Biology and Medicine, 2013.
An overview of Git is provided along with use-cases that highlight how this tool can be leveraged to make science more reproducible and transparent, foster new collaborations, and support novel uses.
The FAIR Guiding Principles for scientific data management and stewardship
This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them, and some exemplar implementations in the community.
An empirical analysis of journal policy effectiveness for computational reproducibility
This work evaluates the effectiveness of journal policy that requires the data and code necessary for reproducibility be made available post-publication by the authors upon request, and finds it to be an improvement over no policy but currently insufficient for reproducibility.
Supporting data sharing
To serve paper publishing, for example, with its physical constraints of space and two-dimensional presentation, it has needed to produce data that can be analyzed in such a way that the results can be communicated in conventional tables, figures and graphs as well as natural language.
Sharing in Ecology and Evolution Nine simple ways to make it easier to (re)use your data
Nine simple ways to make the data that you share easier for others to reuse, and easier for you to understand and work with yourself.
Frictionless Data: Making Research Data Quality Visible
This paper will report on current progress toward Frictionless Data, a containerization format for data based on existing practices for publishing open source software, and a set of tools, specifications, and best practices for describing, publishing, and validating data.
Has open data arrived at the British Medical Journal (BMJ)? An observational study
Despite the BMJ's strong data sharing policy, sharing rates are low, and it might be time for a more effective data sharing policy and better incentives for health and medical researchers to share their data.
Statistical Analyses and Reproducible Research
This article describes a software framework for both authoring and distributing integrated, dynamic documents that contain text, code, data, and any auxiliary content needed to recreate the computations in data analyses, methodological descriptions, simulations, and so on.
How open science helps researchers succeed
There is evidence that open research practices bring significant benefits to researchers relative to more traditional closed practices, including increases in citations, media attention, potential collaborators, job opportunities and funding opportunities.