Resources about reproducibility with specific tools, languages, or methods.


💻 Computers & 🗄️ data

  • Test-Driven Data Analysis (TDDA), automatically generating reference tests for data workflows > https://github.com/tdda/tdda
    Talk recording:

🇷 R

🐍 Python

🧮 Matlab

🤯 Machine Learning (ML) / Artificial Intelligence (AI)

🦯 Blind peer review

🔬 Microscopes

GitHub

  • Perez-Riverol Y, Gatto L, Wang R, Sachsenberg T, Uszkoreit J, Leprevost FdV, et al. (2016) Ten Simple Rules for Taking Advantage of Git and GitHub. PLoS Comput Biol 12(7): e1004947. https://doi.org/10.1371/journal.pcbi.1004947
  • Crystal-Ornelas, Robert, Brandon Edwards, Katherine Hébert, Emma J. Hudgins, Luna L. Sánchez-Reyes, Eric R. Scott, Matthew Grainger, et al. 2022. Not Just for Programmers: How Github Can Accelerate Collaborative and Reproducible Research in Ecology and Evolution. MetaArXiv. July 13. https://doi.org/10.31222/osf.io/x3p2q

🧾 Software and data citation and licensing

At some point, the citation and licensing of software become important for researchers. One structural problem of current science is that research software and research data are very often not covered by the commonly used 'success metrics' for scientific careers. Ensuring proper citation of software and data should thus be of high importance for developers and researchers alike. This holds both for your own software and data (making them citable, citing them) and for the software and data you use. Data sharing can benefit your scientific career by leading to greater collaboration, increased confidence in findings, and goodwill between researchers (https://doi.org/10.1038/d41586-019-01506-x). Furthermore, several studies have shown that articles that make their data available receive a citation benefit and that the data are actually reused (https://doi.org/10.7717/peerj.175, https://doi.org/10.1371/journal.pone.0230416). The same can be argued for software.

Licensing is important to keep in mind when starting to share, collaboratively develop, or reuse code and data. It's worth getting a quick overview of what copyright is (https://en.wikipedia.org/wiki/Copyright) and acknowledging that (i) copyright law differs widely across legal jurisdictions, (ii) the laws and their application to "modern" things like data and software are partly still evolving, and (iii) we need copyright law to be able to allow people to use our work. Important disclaimer: the information provided here is not legal advice. If you are unsure about copyright and licensing, consult a lawyer.

"The Legal Framework for Reproducible Scientific Research - Licensing and Copyright" (https://doi.org/10.1109/MCSE.2009.19, public PDF at https://academiccommons.columbia.edu/doi/10.7916/D8GH9TD8/download) gives you a good overview and provides clear recommendations on practices and licenses, as does "A Quick Guide to Software Licensing for the Scientist-Programmer" (https://doi.org/10.1371/journal.pcbi.1002598). If you want to make sure that the licenses of the software you use or share support your intentions and do not conflict with each other, TL;DR Legal can help you out: https://tldrlegal.com/. The website http://forschungslizenzen.de provides information on rights and licenses for research data (German only), with a special focus on the humanities.

The Software Sustainability Institute's page "How to cite and describe software" is a great starting point for software citation (https://www.software.ac.uk/how-cite-software), although it is a bit outdated. A more current article is "Recognizing the value of software: a software citation guide" (https://doi.org/10.12688/f1000research.26932.2), which covers recent initiatives such as Software Heritage (https://www.softwareheritage.org/). If you manage your references with BibLaTeX, the biblatex-software style (https://www.softwareheritage.org/2020/05/26/citing-software-with-style/) might be useful. The GitHub-Zenodo integration makes getting a citable DOI for every release very easy (https://guides.github.com/activities/citable-code/), but manual publishing from GitLab (gitlab.com or ZIV-GitLab) is almost as simple. Pro-tip: look for .zenodo.json files on GitHub to see how to automate the metadata insertion on Zenodo, and consider publishing a software paper in JOSS (https://joss.theoj.org/) or JORS (https://openresearchsoftware.metajnl.com/).
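To make the .zenodo.json pro-tip more concrete, here is a minimal sketch in Python that builds such a metadata file. The field names (title, description, upload_type, creators, license, version, keywords) follow the Zenodo deposit metadata as we understand it; the helper names and all values (project title, person, license identifier) are hypothetical placeholders, and you should check the current Zenodo documentation before relying on them.

```python
import json
from pathlib import Path


def build_zenodo_metadata(title, description, creators, license_id, version, keywords=()):
    """Return a dict shaped like the content of a .zenodo.json file.

    Assumption: Zenodo reads these keys from .zenodo.json in the
    repository root when it archives a GitHub release.
    """
    return {
        "title": title,
        "description": description,
        "upload_type": "software",
        "creators": [dict(creator) for creator in creators],
        "license": license_id,
        "version": version,
        "keywords": list(keywords),
    }


def write_zenodo_json(metadata, directory="."):
    """Write the metadata to <directory>/.zenodo.json and return the path."""
    path = Path(directory) / ".zenodo.json"
    text = json.dumps(metadata, indent=2, ensure_ascii=False) + "\n"
    path.write_text(text, encoding="utf-8")
    return path


# Hypothetical example values; replace them with your project's details.
example = build_zenodo_metadata(
    title="example-tool",
    description="A small tool used in our analysis.",
    creators=[{"name": "Doe, Jane", "affiliation": "Example University"}],
    license_id="MIT",  # assumed identifier; check which IDs Zenodo accepts
    version="1.2.0",
    keywords=["reproducibility", "research software"],
)
print(json.dumps(example, indent=2))
```

Calling write_zenodo_json(example) in the repository root would create the file; committing it before the next release should let Zenodo pick up the metadata instead of guessing it from the repository.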

For data citation, university libraries and data repositories are the places to go. Data publication is part of the more established practices around research data management (RDM, Forschungsdatenmanagement - FDM) and is often required by funders. Therefore, most universities have services in this area (for example https://www.uni-muenster.de/Forschungsdaten/), and the more static nature of data, compared to software, makes some things easier as well. Generic information can be found at DataCite ("Cite your Data", https://datacite.org/cite-your-data.html) and Dataverse (https://dataverse.org/best-practices/data-citation). Open Data Commons (https://opendatacommons.org/) provides established licenses and an excellent FAQ (https://opendatacommons.org/faq/licenses/).

In a nutshell 🐿️:

  1. Make your own data and software citable and provide the desired citation in your README.
  2. Make your data and software usable by others by using open licenses (data licenses for data, software licenses for software).
  3. Put software on Zenodo and/or Software Heritage.
  4. Put data in a suitable research data repository, which you can find on https://www.re3data.org/.
  5. Cite all data and software that you use with their proper version and DOI. If data you reuse do not have a DOI, ask the author to make them citable.
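
As a rough illustration of points 1 and 5, the following Python sketch builds a citation line with version and DOI that you could paste into a README. The function names, the citation layout, and the example values (including the DOI 10.5281/zenodo.0000000) are hypothetical; your community or journal may prescribe a different format.

```python
def normalize_doi(doi):
    """Strip common prefixes so that only the bare DOI (10.xxxx/...) remains."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not doi.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    return doi


def format_citation(authors, year, title, version, doi):
    """Return a one-line citation that names the exact version and DOI."""
    if not authors:
        raise ValueError("at least one author is required")
    author_text = "; ".join(authors)
    return (
        f"{author_text} ({year}). {title} (version {version}). "
        f"https://doi.org/{normalize_doi(doi)}"
    )


# Hypothetical example values, not a real record.
print(
    format_citation(
        authors=["Doe, Jane", "Roe, Richard"],
        year=2024,
        title="example-tool",
        version="1.2.0",
        doi="doi:10.5281/zenodo.0000000",
    )
)
```

Rejecting inputs without a DOI is a deliberate choice in this sketch: it makes a missing identifier visible early, which is the moment to ask the author to make the resource citable.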

🖥️ Research Software Engineering & Software publishing



Do you think something is missing on this page? Get in touch with o2r.support@uni-muenster.de. Thanks!

