Fig. 5

From: Scalable analysis of Big pathology image data cohorts using efficient methods and high-performance computing strategies

Strategy for high-throughput processing of images. (a) Execution on multiple nodes (left) is accomplished using a Manager-Worker model, in which stage tasks are assigned to Workers in a demand-driven fashion. A stage task is represented as a tuple of (stage name, data). The stage name may be “segmentation”, in which case the data will be an image tile, or “feature computation”, in which case the data will be a mask representing the segmented nuclei in an image tile along with the tile itself. The runtime system schedules stage tasks to available nodes while enforcing the dependencies in the analysis pipeline, and it handles movement of data between stages. A node may be assigned multiple stage tasks. (b) A stage task scheduled to a Worker (right) is represented as a dataflow of operations for the segmentation stage and as a set of operations for the feature computation stage. These operations are scheduled to CPU cores and GPUs by the Worker Resource Manager (WRM). The WRM uses a priority queue (shown as the “sorted by speedup” rectangle in the figure) to dynamically schedule a waiting operation to an available computing device.
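
As a rough illustration of the demand-driven assignment in panel (a), the sketch below models a stage task as a (stage name, data) pair and a Manager that hands tasks to idle Workers on request. The class names, the string-typed payload, and the single dependency rule (feature computation on a tile becomes ready only once that tile has been segmented) are assumptions for illustration, not the paper’s implementation.

```cpp
#include <deque>
#include <iostream>
#include <string>

// A stage task is a (stage name, data) pair, as in the caption. A plain
// string stands in for the real payload (an image tile, or a tile plus
// its nuclear mask).
struct StageTask {
    std::string stage;  // "segmentation" or "feature computation"
    std::string data;   // tile identifier standing in for the payload
};

// The Manager keeps a queue of ready tasks and hands them out only when
// an idle Worker asks for work (demand-driven assignment). The single
// dependency enforced here is the one the caption implies: feature
// computation on a tile becomes ready only after its segmentation.
class Manager {
public:
    void add_tile(const std::string& tile) {
        ready_.push_back({"segmentation", tile});
    }

    // Called by an idle Worker; returns false when no task is ready.
    bool request_task(StageTask& out) {
        if (ready_.empty()) return false;
        out = ready_.front();
        ready_.pop_front();
        return true;
    }

    // Called by a Worker that finished segmenting a tile: the dependent
    // feature-computation task for that tile is now ready.
    void segmentation_done(const std::string& tile) {
        ready_.push_back({"feature computation", tile});
    }

private:
    std::deque<StageTask> ready_;
};

int main() {
    Manager mgr;
    mgr.add_tile("tile_0");

    StageTask task;
    while (mgr.request_task(task)) {  // one idle Worker pulling work
        std::cout << task.stage << " on " << task.data << "\n";
        if (task.stage == "segmentation") mgr.segmentation_done(task.data);
    }
}
```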
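Panel (b)’s “sorted by speedup” queue can be sketched in the same spirit: waiting operations are kept ordered by their estimated GPU speedup, and an idle device pulls from that order. The specific dispatch policy below (a GPU takes the highest-speedup operation, a CPU core the lowest), as well as the operation names and speedup values, are assumptions for illustration; the caption states only that the queue is sorted by speedup.

```cpp
#include <cassert>
#include <iostream>
#include <iterator>
#include <set>
#include <string>

// One fine-grained operation of a stage task, with an estimate of how
// much faster it runs on a GPU than on a CPU core.
struct Operation {
    std::string name;
    double gpu_speedup;  // estimated GPU-over-CPU speedup
};

// Keep waiting operations ordered by ascending estimated speedup,
// matching the "sorted by speedup" queue in the figure.
struct BySpeedup {
    bool operator()(const Operation& a, const Operation& b) const {
        return a.gpu_speedup < b.gpu_speedup;
    }
};

enum class Device { CPU, GPU };

class WorkerResourceManager {
public:
    void submit(const Operation& op) { waiting_.insert(op); }
    bool empty() const { return waiting_.empty(); }

    // Called when a device becomes idle. Assumed policy: a GPU takes the
    // operation that benefits most from acceleration, a CPU core the one
    // that benefits least.
    Operation dispatch(Device dev) {
        assert(!waiting_.empty());
        auto it = (dev == Device::GPU) ? std::prev(waiting_.end())
                                       : waiting_.begin();
        Operation op = *it;
        waiting_.erase(it);
        return op;
    }

private:
    std::multiset<Operation, BySpeedup> waiting_;
};

int main() {
    WorkerResourceManager wrm;
    // Hypothetical operations and speedup estimates, for illustration only.
    wrm.submit({"op_a", 11.9});
    wrm.submit({"op_b", 3.1});
    wrm.submit({"op_c", 6.2});

    std::cout << "GPU gets: " << wrm.dispatch(Device::GPU).name << "\n";  // op_a
    std::cout << "CPU gets: " << wrm.dispatch(Device::CPU).name << "\n";  // op_b
}
```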
