Fig. 2 | BMC Bioinformatics

Fig. 2

From: iRODS metadata management for a cancer genome analysis workflow

Solution architecture overview. NGS data is stored on the institute’s physical storage and the input metadata is prepared. An import script validates the metadata and passes it to a virtual machine (VM) providing iRODS server resources. Apart from this VM, all other machines have to use iCommands clients or APIs to download, upload or query the data. All iRODS iCommands clients have been set up to communicate through host-certificate-based SSL encryption. The iRODS server has a main vault directory, which holds the data archive and is owned exclusively by the iRODS user. It is physically located on the GridScaler Storage System of the HPC (CHEOPS) and made available to iRODS via an NFS mount, which can also be accessed only by the irods user. In particular, we have decided to let iRODS maintain its data within the vault but restrict it to a dedicated server with stringent access policies. With such a setup, any data located in the vault is shielded from all CHEOPS users through locally managed file ownership and permissions, but the security within the vault depends entirely on the strength and infallibility of the iRODS authorization mechanisms. To run the NGS pipeline on the HPC (CHEOPS), run metadata is prepared and a master script orchestrates the communication with iRODS to query and retrieve the input data and to import the results after successful processing.
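
The import step described above (validate the metadata, then hand data and metadata to iRODS) could be sketched roughly as follows. This is a minimal illustration, not the authors' import script: the required field names, the function names, and the example paths are all hypothetical, and the sketch only builds the iCommands invocations (`iput`, `imeta add`, `imeta qu`) as argument lists instead of running them.

```python
from typing import Dict, List

# Hypothetical required fields; the actual metadata schema is not given in the caption.
REQUIRED_FIELDS = ("sample_id", "project", "sequencing_type")


def validate_metadata(metadata: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field, "")
        if not value.strip():
            problems.append(f"missing or empty field: {field}")
    return problems


def build_import_commands(
    local_path: str, irods_path: str, metadata: Dict[str, str]
) -> List[List[str]]:
    """Build the iCommands calls: upload the file, then attach each attribute."""
    problems = validate_metadata(metadata)
    if problems:
        # Refuse to import anything whose metadata did not validate.
        raise ValueError("; ".join(problems))
    commands = [["iput", local_path, irods_path]]
    for attribute, value in sorted(metadata.items()):
        # imeta add -d <data object> <attribute> <value>
        commands.append(["imeta", "add", "-d", irods_path, attribute, value])
    return commands


def build_query_command(attribute: str, value: str) -> List[str]:
    """Build an imeta query for data objects carrying attribute = value."""
    return ["imeta", "qu", "-d", attribute, "=", value]
```

A master script on the HPC side could, under the same assumptions, pass each argument list to `subprocess.run` on a machine where the iCommands clients are configured for SSL, using `build_query_command` to locate input data before retrieving it with `iget`.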
