
Table 2 RACS scaling and performance trends for the ORF part of the pipeline: we performed the standard strong-scaling analysis and repeated it as a function of dataset size

From: RACS: rapid analysis of ChIP-Seq data for contig based genomes

Initial data size   Processors   Workspace usage   Walltime (s)
≈3 GB (a)           1            ≈27 GB            7037
                    2            "                 5059
                    4            "                 3856
                    8            "                 3238
                    16           "                 2940
                    32           "                 2801
                    64 (b)       "                 2463
≈2.4 GB (c)         1            ≈20 GB            5477
                    2            "                 4005
                    4            "                 3128
                    8            "                 2678
                    16           "                 2456
                    32           "                 2344
                    64           "                 2161
≈6.8 GB (d)         1            ≈50.3 GB          6987
                    2            "                 5662
                    4            "                 4864
                    8            "                 4451
                    16           "                 4245
                    32           "                 4148
                    64           "                 4155
≈7.1 GB (e)         1            ≈53.4 GB          7728
                    2            "                 6191
                    4            "                 5255
                    8            "                 4740
                    16           "                 4529
                    32           "                 4413
                    64           "                 4249
≈1.4 GB (f)         1            ≈8.3 GB           2874
                    2            "                 1796
                    4            "                 1218
                    8            "                 920
                    16           "                 773
                    32           "                 702
                    64           "                 639
  1. (a) Ibd1-1 data set for T. thermophila [16].
  2. (b) Although there are 40 physical cores in the TDS/Niagara nodes, hyperthreading is enabled, so up to 80 logical cores can be used.
  3. (c) Ibd1-2 data set for T. thermophila [16].
  4. (d) MED31-1 data set for T. thermophila [48].
  5. (e) MED31-2 data set for T. thermophila [48].
  6. (f) Data set for O. trifallax.
  7. As can be seen, the working space (in this case, memory utilization) can reach up to a factor of 9-10× the size of the initial data to be processed. Further details about memory consumption can be found in the README document and the "doc" directory included within the RACS repository. These tests were run on the TDS system (i.e. one Lenovo SD530 node with 40 cores and 192 GB of RAM, running the CentOS 7.4 operating system) of the Niagara supercomputer [27], utilizing RAMDISK as working space.
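The strong-scaling behavior and the workspace-to-input ratio noted above can be recomputed directly from the table's values. The sketch below (a minimal illustration, not part of RACS itself) derives the speedup S(p) = T(1)/T(p) and parallel efficiency E(p) = S(p)/p for the Ibd1-1 dataset, and the workspace/input factor for each dataset; all numbers are taken from Table 2.

```python
# Strong-scaling metrics for the Ibd1-1 dataset (walltimes in seconds, from Table 2).
# Speedup S(p) = T(1)/T(p); parallel efficiency E(p) = S(p)/p.
walltimes = {1: 7037, 2: 5059, 4: 3856, 8: 3238, 16: 2940, 32: 2801, 64: 2463}
t1 = walltimes[1]
for p, t in walltimes.items():
    speedup = t1 / t
    efficiency = speedup / p
    print(f"p={p:>2}  T={t:>5} s  S={speedup:.2f}  E={efficiency:.1%}")

# Workspace-to-input ratio for each dataset (input GB, workspace GB), from Table 2.
datasets = {
    "Ibd1-1":      (3.0, 27.0),
    "Ibd1-2":      (2.4, 20.0),
    "MED31-1":     (6.8, 50.3),
    "MED31-2":     (7.1, 53.4),
    "O.trifallax": (1.4, 8.3),
}
for name, (inp, work) in datasets.items():
    print(f"{name}: workspace is {work / inp:.1f}x the input size")
```

The flattening speedup curve (e.g. S(64) ≈ 2.9 for Ibd1-1) reflects the serial fraction of this stage, while the workspace ratios show why the ≈9× memory factor quoted above is the upper bound across these datasets.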