
Table 1 Execution times for different node and data-set configurations. Execution times (in seconds) were collected for each step of the pipeline: 1) Blast on ESTs: functional annotation of raw EST sequences; 2) Pre-processing: cleaning of vector contamination and masking of low-complexity and interspersed repeat sequences; 3) Clustering; 4) Assembling; 5) Blast on Contigs: functional annotation of consensus sequences. TOT is the total execution time of the pipeline.

From: ParPEST: a pipeline for EST data analysis based on parallel computing

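| Nodes | #sequences | Blast on ESTs | Pre-processing | Clustering | Assembling | Blast on Contigs | TOT |
|---|---|---|---|---|---|---|---|
| 4 nodes | 250 | 3712 | 441 | 15 | 201 | 501 | 4870 |
| 4 nodes | 500 | 7072 | 613 | 15 | 201 | 441 | 8342 |
| 4 nodes | 1000 | 13643 | 857 | 30 | 202 | 1474 | 16206 |
| 4 nodes | 5000 | 70490 | 2979 | 150 | 257 | 6806 | 80682 |
| 4 nodes | 10000 | 145539 | 6029 | 346 | 328 | 16045 | 168287 |
| 6 nodes | 250 | 1992 | 441 | 15 | 201 | 350 | 2999 |
| 6 nodes | 500 | 3648 | 443 | 15 | 201 | 421 | 4728 |
| 6 nodes | 1000 | 6911 | 847 | 30 | 212 | 903 | 8903 |
| 6 nodes | 5000 | 35647 | 2834 | 136 | 268 | 4137 | 43022 |
| 6 nodes | 10000 | 72525 | 5483 | 240 | 357 | 7845 | 86450 |
| 8 nodes | 250 | 1600 | 441 | 15 | 201 | 280 | 2537 |
| 8 nodes | 500 | 2517 | 443 | 15 | 202 | 461 | 3910 |
| 8 nodes | 1000 | 4704 | 797 | 30 | 212 | 733 | 6476 |
| 8 nodes | 5000 | 23819 | 2784 | 121 | 267 | 2853 | 29844 |
| 8 nodes | 10000 | 48700 | 5377 | 240 | 357 | 7845 | 62519 |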
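
One way to read the TOT column is as a relative speedup: the total time on 4 nodes divided by the total time on 6 or 8 nodes for the same number of sequences. The sketch below is illustrative only and is not part of the original analysis. It copies the TOT values from Table 1 and assumes the 4-node run as the baseline, since the table reports no single-node timing. The names `TOT`, `BASELINE_NODES`, `relative_speedup` and `relative_efficiency` are hypothetical.

```python
# Total pipeline times in seconds, copied from the TOT column of Table 1.
# Outer key: number of nodes; inner key: number of EST sequences.
TOT = {
    4: {250: 4870, 500: 8342, 1000: 16206, 5000: 80682, 10000: 168287},
    6: {250: 2999, 500: 4728, 1000: 8903, 5000: 43022, 10000: 86450},
    8: {250: 2537, 500: 3910, 1000: 6476, 5000: 29844, 10000: 62519},
}

# Assumption: the smallest configuration in the table serves as the reference,
# because no single-node measurement is reported.
BASELINE_NODES = 4


def relative_speedup(nodes: int, n_sequences: int) -> float:
    """Baseline total time divided by the total time on `nodes` nodes."""
    return TOT[BASELINE_NODES][n_sequences] / TOT[nodes][n_sequences]


def relative_efficiency(nodes: int, n_sequences: int) -> float:
    """Relative speedup divided by the increase in node count over the baseline."""
    return relative_speedup(nodes, n_sequences) * BASELINE_NODES / nodes


if __name__ == "__main__":
    for nodes in (6, 8):
        for n_sequences in sorted(TOT[nodes]):
            s = relative_speedup(nodes, n_sequences)
            e = relative_efficiency(nodes, n_sequences)
            print(f"{nodes} nodes, {n_sequences:>5} sequences: "
                  f"speedup {s:.2f}, efficiency {e:.2f}")
```

Because the baseline is already a parallel run, these ratios describe scaling from 4 nodes upward rather than absolute speedup over a serial execution.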