Code Ocean, MATLAB, and Sharing Reusable Code
Today we have two guest bloggers, Lisa Kempler and Pradeep Ramamoorthy, who work at MathWorks in Natick, Massachusetts, supporting and developing online tools for researchers. Their post talks about a relatively new code-sharing platform, Code Ocean.
What is Code Ocean?
Code Ocean is a cloud-based platform aimed at furthering computational reproducibility and open research. The site, accessible via a web browser, enables researchers to share code and data associated with their published research. Visitors to the site can view and run the code, thereby verifying that the code produces the results described in the original research paper. The platform supports a variety of programming languages, including MATLAB.
In an effort to provide easier access to code and data, Code Ocean recently announced the ability to export compute capsules:
https://medium.com/codeocean/new-compute-capsules-now-exportable-from-code-ocean-54b5bacb3e0e
Users can now download these compute capsules (containers that encapsulate code, data, and the computational environment) in order to reuse and build on the published research. MATLAB users who download compute capsules containing MATLAB code can run the code and view the associated results on their local computers.
From Open Science to Reusable Research
Publishers and funding organizations have long pushed for Reproducible Research (RR) because they want to ensure that research is sufficiently vetted. Good, verifiable research has two major benefits: 1) publications provide high-value information, and 2) researchers doing follow-on work can confidently build on the work of their peers.
Researchers’ desire to leverage historical research has led to a movement around open science, or, more broadly, open research. The primary goal of openness is the same as the underlying driver of RR: if you make sure the results can be reproduced, then it’s reasonable to build on those results. However, “open” takes it a step further. Openness pushes RR beyond proving the validity of the research to Reuse, a requirement to make the research approaches and resulting artifacts broadly accessible.
Using Code Ocean, published authors can reproduce and verify their research results. However, Code Ocean’s main value to researchers is this ability to reuse the work of their published peers.
Although the ability to download the code, associated data, and related graphical and numeric output gives researchers a huge head start, buy-in by researchers who submit for publication is still limited. In a recent article (An empirical analysis of journal policy effectiveness for computational reproducibility), Stodden et al. demonstrate the lack of engagement in RR by most researchers. The study deemed 56 of 204 published papers computationally reproducible, even after multiple attempts to get additional information from authors of the remaining 148. The study’s finding, roughly 27% compliance for published papers that are inherently computational, tells us that the norm is still 1) non-reproducibility and 2) not-so-transparent paths to Reuse for most published computational research.
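As a quick check, the proportion can be worked out directly from the two counts quoted above. This is only a small arithmetic sketch; the variable names are ours, not the study's.

```python
# Counts quoted in the paragraph above (Stodden et al.).
reproducible = 56
total = 204

# Papers that could not be reproduced, and the reproducible share.
not_reproducible = total - reproducible
share = reproducible / total

print(f"Not reproducible: {not_reproducible}")
print(f"Reproducible share: {share:.1%}")
```

Whichever way the figure is rounded, about three out of four papers in the sample could not be reproduced.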
Having MATLAB language support on Code Ocean makes it easier for researchers to share their work. Using these tested outputs, MATLAB users can create new research, and transfer their learnings to new innovations and products in science and industry. Code Ocean’s easy-upload and sharing platform holds the possibility of increasing RR compliance (and, in turn, Reuse-ability), as publishers, authors, and follow-on researchers see the value in sharing.
What is a Compute Capsule?
Compute capsules are the foundational units on Code Ocean. They encapsulate the elements required to reproduce and reuse research – code, data, documentation and a specification of the computational environment. Researchers create a compute capsule associated with their research, and visitors open these capsules to examine and run the code.
Exporting a Compute Capsule
Let’s say you’re a researcher working in the field of neuroscience. You hear about ongoing research and development of models for simulating brain fibers.
Once you log in to the Code Ocean website (setting up an account is quick and free), you can explore the curated gallery of published compute capsules, or search for relevant terms. If you search for ‘fiber’ or ‘brain’, you see a list of relevant capsules.
The first search result, Fiber Source Separation, looks promising and might be what you were looking for. Clicking on the link will take you to the Code Ocean IDE, which allows you to interact with the code, look at supporting documents and visualizations, and run the code on Code Ocean’s cloud platform.
To export this capsule, just select the ‘Export’ option from the ‘Capsule’ menu.
Selecting this option starts the download. Once the download finishes, you can extract the package. REPRODUCING.md is your read-me file, with the steps needed to reproduce the results of the capsule. The next step, unpacking the capsule, requires you to have Docker installed and some experience using it.
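If you prefer to script the extraction, here is a minimal Python sketch. It assumes the exported package is a ZIP archive and uses a hypothetical file name, capsule.zip; neither detail comes from Code Ocean's documentation, so adjust both to match what you actually downloaded. It also checks whether a docker executable is on your PATH, since REPRODUCING.md relies on Docker.

```python
import shutil
import zipfile
from pathlib import Path


def extract_capsule(archive: Path, dest: Path) -> list[str]:
    """Extract a downloaded capsule archive and return its top-level entries.

    Assumes the export is a ZIP file (an assumption, not a documented fact).
    """
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
    return sorted(p.name for p in dest.iterdir())


def docker_available() -> bool:
    """Report whether a `docker` executable can be found on the PATH."""
    return shutil.which("docker") is not None


if __name__ == "__main__":
    archive = Path("capsule.zip")  # hypothetical file name
    if archive.exists():
        # Look for REPRODUCING.md among the extracted entries.
        print(extract_capsule(archive, Path("capsule")))
    print("Docker found:", docker_available())
```

Once the package is extracted and Docker is available, follow the steps in REPRODUCING.md itself; the sketch deliberately stops short of running any Docker commands.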
Summary
The ability to view and reuse code associated with published research is a big plus. Having the bi-directional linkage between the code and the published article, from papers on publisher sites to the code and from Code Ocean capsules back to the papers, makes it easy to find and use the different related components. If you have a published paper with associated MATLAB code, consider uploading it to Code Ocean. Or visit Code Ocean to view and download research-related MATLAB code.
Have you used Code Ocean (or similar platforms) for your research and code-sharing needs? Let us know here.