Our software is a Python- and Django-based application that can be installed and run with minimal system administration knowledge. It is intended to be deployed to serve individual research groups.
Our software also offers project-based laboratory data management. Within the management interface, all content is grouped into projects that may have public or private visibility. Content stored in public projects is readable without restrictions, whereas private projects restrict access to their members. Within each project, content is divided into three main categories:
1. Data (the input files)
2. Recipes (the code that processes the data)
3. Results (the directory that contains the resulting files of applying the recipe to data)
Figure 1 shows a project view with Data, Recipes and Results displayed in separate tabs of the project. A typical workflow combines one or more Data entries with a Recipe to produce a Result: Data + Recipe → Results.
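The relationship between projects, data, recipes and results can be illustrated with a minimal Python sketch. All class and attribute names below (`Project`, `Data`, `Recipe`, `Result`, `can_read`, `run`) are hypothetical and chosen for illustration only; they are not taken from the application's actual models.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Data:
    name: str
    path: str  # a file, an archive, or a directory


@dataclass
class Recipe:
    name: str
    code: str  # the script that processes the data


@dataclass
class Result:
    recipe: Recipe
    inputs: List[Data]
    directory: str  # where the files produced by the run are stored


@dataclass
class Project:
    """Hypothetical container grouping the three content categories."""
    name: str
    public: bool = False
    members: Set[str] = field(default_factory=set)
    data: List[Data] = field(default_factory=list)
    recipes: List[Recipe] = field(default_factory=list)
    results: List[Result] = field(default_factory=list)

    def can_read(self, user: str) -> bool:
        # Public projects are readable by anyone; private ones by members only.
        return self.public or user in self.members

    def run(self, recipe: Recipe, inputs: List[Data]) -> Result:
        # Data + Recipe -> Result; every run records a new result entry.
        directory = f"results/{len(self.results) + 1}"
        result = Result(recipe=recipe, inputs=list(inputs), directory=directory)
        self.results.append(result)
        return result
```

In this sketch `run` only records the bookkeeping for a result; executing the recipe itself is left out.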
First, the data section must be populated. Data may be uploaded or may be linked directly from a hard drive or from a mounted filesystem, thus avoiding copying and transferring large datasets over the web. For recipes that connect to the internet to download data, for example from the Short Read Archive, the data does not need to be present on the local server beforehand.
Notably, the concept of “data” in our system is broader and more generic than that of a typical file system. In our software, “data” may be a single file, a compressed archive containing several files, or a path to a directory that contains any number of files as well as other subdirectories. The programming interfaces for recipes handle directories transparently and make it possible to run the same recipe that one would use for a single file on all files of an entire directory.
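One way to treat the three kinds of data entry uniformly is to resolve each of them to a list of file names. The sketch below uses only the Python standard library; the function name `list_data_files` is hypothetical, and the approach is an assumption about how such handling could work rather than a description of the application's own interface.

```python
import tarfile
import zipfile
from pathlib import Path
from typing import List


def list_data_files(path: str) -> List[str]:
    """Return the names of the files that a data entry refers to."""
    entry = Path(path)
    if entry.is_dir():
        # A directory: walk it recursively, including subdirectories.
        return sorted(
            p.relative_to(entry).as_posix()
            for p in entry.rglob("*")
            if p.is_file()
        )
    if zipfile.is_zipfile(str(entry)):
        with zipfile.ZipFile(str(entry)) as archive:
            return sorted(n for n in archive.namelist() if not n.endswith("/"))
    if tarfile.is_tarfile(str(entry)):
        with tarfile.open(str(entry)) as archive:
            return sorted(m.name for m in archive.getmembers() if m.isfile())
    # Anything else is treated as a single plain file.
    return [entry.name]
```

A recipe written against such a helper can loop over the returned names without caring whether the entry was a file, an archive or a directory.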
Each recipe may be assigned a graphical user interface specification written in TOML format. From the TOML code, the recipe website generates a user interface connected to the underlying data analysis script. For example, the TOML code (partially shown) below:

would generate the interface shown in Fig. 2. When a recipe is executed, the parameters selected in the graphical user interface replace the corresponding parameters inside the recipe. The interface generation “specification language” provides the building blocks for creating user interfaces.
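The substitution step can be sketched in Python as follows. Everything here is assumed for illustration: the field names (`reads`, `threads`), the keys (`label`, `display`, `value`), the script name `analyze.sh`, and the use of `string.Template` as the placeholder syntax are hypothetical stand-ins, not the application's actual specification language or template engine. The specification is shown as a plain dictionary, as it might look once the TOML has been parsed.

```python
from string import Template

# Hypothetical interface specification, shown as an already parsed dictionary.
SPEC = {
    "reads": {"label": "Sequencing reads", "display": "DROPDOWN", "value": "reads.fq"},
    "threads": {"label": "Threads", "display": "INTEGER", "value": 2},
}

# Hypothetical recipe script with placeholders named after the fields.
RECIPE = Template("analyze.sh --threads $threads $reads")


def render_recipe(recipe: Template, spec: dict, selected: dict) -> str:
    """Fill the recipe with the user's selections, falling back to defaults."""
    unknown = set(selected) - set(spec)
    if unknown:
        raise KeyError(f"parameters not in the specification: {sorted(unknown)}")
    # Start from the default values declared in the specification.
    values = {name: spec_field["value"] for name, spec_field in spec.items()}
    # Values chosen in the interface override the defaults.
    values.update(selected)
    return recipe.substitute(values)
```

Calling `render_recipe(RECIPE, SPEC, {"threads": 8})` would yield the script text with `8` in place of the default thread count, while fields the user leaves untouched keep their declared defaults.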
The code for each recipe may be inspected before the recipe is executed, as seen in Fig. 3. Notably, the recipe code consists of executable instructions that may also be run on other platforms.
Running a recipe on a data entry produces a “result” directory. A result directory consists of all files and all metadata created by the recipe as it is executed on the input data. Each run of a recipe generates a new result directory. Users may inspect, investigate and download any of the files generated during the recipe run. Additionally, users may copy a result file as a new data input for another recipe.
The result directory generated by executing a recipe on a dataset lists all files created during the recipe run (see Fig. 4). In addition, all messages printed on the standard output or standard error streams are captured as files and may be inspected later.
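A minimal sketch of this behaviour, assuming the recipe is run as an external command, is given below. The function name `run_recipe`, the `run_N` directory naming and the `stdout.txt` and `stderr.txt` file names are hypothetical; only the standard-library `subprocess.run` call is real.

```python
import subprocess
from pathlib import Path
from typing import List


def run_recipe(command: List[str], results_root: str) -> Path:
    """Run a command in a fresh result directory and keep its output streams."""
    root = Path(results_root)
    root.mkdir(parents=True, exist_ok=True)
    # Each run gets its own numbered directory, so earlier results are kept.
    run_id = sum(1 for p in root.iterdir() if p.is_dir()) + 1
    result_dir = root / f"run_{run_id}"
    result_dir.mkdir()
    with open(result_dir / "stdout.txt", "w") as out, open(
        result_dir / "stderr.txt", "w"
    ) as err:
        # Files written by the command land in result_dir because it is the
        # working directory; both streams are redirected to files beside them.
        subprocess.run(command, cwd=result_dir, stdout=out, stderr=err, check=False)
    return result_dir
```

Because the command's working directory is the new result directory, every file it writes with a relative path ends up next to the captured streams and can be listed or downloaded together with them.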
The web application that we have developed also provides laboratory data management services. Recipes, data and results can be copied across projects; users may create new projects and may allow others (or the public) to access the contents of a project. As constructed, the web application provides a transparent and consistent framework for conducting analyses that can be shared among collaborators or with the public, and that may be reproduced over time due to the preservation of the runtime-specific version of the code.