Fig. 1
From: BAMSI: a multi-cloud service for scalable distributed filtering of massive genome data

Overview of the architecture. The user defines a data filtering job in a graphical user interface or through a REST API. The routing engine distributes tasks to workers residing on one or more cloud platforms, each with a configured source of the data. The filtered results can be routed to a permanent or transient storage location (such as an HDFS cluster) for downstream analysis with other tools, or for download via the interface.
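
The distribution step in the caption can be illustrated with a minimal sketch. This is not BAMSI's implementation: the `Worker` class, the `distribute` function, the cloud labels, the data-source URLs, and the round-robin policy are all assumptions made for illustration, since the caption does not say how the routing engine chooses a worker.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Worker:
    """Hypothetical worker record; the field names are illustrative only."""
    name: str
    cloud: str          # assumed label for the cloud platform hosting the worker
    data_source: str    # assumed location of the data configured for that platform
    tasks: list = field(default_factory=list)


def distribute(file_names, workers):
    """Assign each file to a worker in round-robin order (an assumed policy)."""
    if not workers:
        raise ValueError("at least one worker is required")
    for file_name, worker in zip(file_names, cycle(workers)):
        worker.tasks.append(file_name)
    # Return a plain mapping of worker name to the files assigned to it.
    return {w.name: list(w.tasks) for w in workers}


workers = [
    Worker("w1", "cloud-a", "hdfs://cloud-a/data"),
    Worker("w2", "cloud-b", "nfs://cloud-b/data"),
]
assignment = distribute(["s1.bam", "s2.bam", "s3.bam"], workers)
```

Here each task is a single file name and workers on different platforms are treated alike. A real routing engine would presumably also need to check that a worker's configured data source actually holds the file it is assigned.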