
Table 1 Execution times for different node and data-set configurations. The execution times (in seconds) are collected for each step of the pipeline: 1) Blast on ESTs: functional annotation of raw EST sequences; 2) Pre-processing: cleaning of vector contamination and masking of low-complexity and interspersed repeat sequences; 3) Clustering; 4) Assembling; 5) Blast on Contigs: functional annotation of consensus sequences. TOT is the global execution time of the pipeline.

From: ParPEST: a pipeline for EST data analysis based on parallel computing

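| Nodes | #sequences | Blast on ESTs | Pre-processing | Clustering | Assembling | Blast on Contigs | TOT |
|---|---|---|---|---|---|---|---|
| 4 nodes | 250 | 3712 | 441 | 15 | 201 | 501 | 4870 |
| 4 nodes | 500 | 7072 | 613 | 15 | 201 | 441 | 8342 |
| 4 nodes | 1000 | 13643 | 857 | 30 | 202 | 1474 | 16206 |
| 4 nodes | 5000 | 70490 | 2979 | 150 | 257 | 6806 | 80682 |
| 4 nodes | 10000 | 14559 | 6029 | 346 | 328 | 16045 | 168287 |
| 6 nodes | 250 | 1992 | 441 | 15 | 201 | 350 | 2999 |
| 6 nodes | 500 | 3648 | 443 | 15 | 201 | 421 | 4728 |
| 6 nodes | 1000 | 6911 | 847 | 30 | 212 | 903 | 8903 |
| 6 nodes | 5000 | 35647 | 2834 | 136 | 268 | 4137 | 43022 |
| 6 nodes | 10000 | 72525 | 5483 | 240 | 357 | 7845 | 86450 |
| 8 nodes | 250 | 1600 | 441 | 15 | 201 | 280 | 2537 |
| 8 nodes | 500 | 2517 | 443 | 15 | 202 | 461 | 3910 |
| 8 nodes | 1000 | 4704 | 797 | 30 | 212 | 733 | 6476 |
| 8 nodes | 5000 | 23819 | 2784 | 121 | 267 | 2853 | 29844 |
| 8 nodes | 10000 | 48700 | 5377 | 240 | 357 | 7845 | 62519 |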
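
Since the caption defines TOT as the global execution time of the five steps, the figures can be cross-checked and compared across node counts. The following is a minimal Python sketch, not part of the original work: the names `TABLE`, `inconsistent_rows`, and `speedup` are hypothetical, and the values are copied as they appear in Table 1 above. With the values as transcribed here, the check flags the 4-node/10000-sequence and 8-node/500-sequence rows, whose step times do not add up to the reported TOT. The sketch does not attempt to correct them.

```python
# Values copied from Table 1 as reproduced above (seconds).
# Key: (nodes, #sequences). Value: (five step times, reported TOT).
# Step order: Blast on ESTs, Pre-processing, Clustering, Assembling, Blast on Contigs.
TABLE = {
    (4, 250): ((3712, 441, 15, 201, 501), 4870),
    (4, 500): ((7072, 613, 15, 201, 441), 8342),
    (4, 1000): ((13643, 857, 30, 202, 1474), 16206),
    (4, 5000): ((70490, 2979, 150, 257, 6806), 80682),
    (4, 10000): ((14559, 6029, 346, 328, 16045), 168287),
    (6, 250): ((1992, 441, 15, 201, 350), 2999),
    (6, 500): ((3648, 443, 15, 201, 421), 4728),
    (6, 1000): ((6911, 847, 30, 212, 903), 8903),
    (6, 5000): ((35647, 2834, 136, 268, 4137), 43022),
    (6, 10000): ((72525, 5483, 240, 357, 7845), 86450),
    (8, 250): ((1600, 441, 15, 201, 280), 2537),
    (8, 500): ((2517, 443, 15, 202, 461), 3910),
    (8, 1000): ((4704, 797, 30, 212, 733), 6476),
    (8, 5000): ((23819, 2784, 121, 267, 2853), 29844),
    (8, 10000): ((48700, 5377, 240, 357, 7845), 62519),
}


def inconsistent_rows(table):
    """Return the (nodes, #sequences) keys whose step times do not sum to TOT."""
    return sorted(key for key, (steps, tot) in table.items() if sum(steps) != tot)


def speedup(table, sequences, base_nodes, nodes):
    """Ratio of the reported TOT on base_nodes to the reported TOT on nodes."""
    return table[(base_nodes, sequences)][1] / table[(nodes, sequences)][1]


if __name__ == "__main__":
    print("Rows where the steps do not sum to TOT:", inconsistent_rows(TABLE))
    for n_seq in (250, 500, 1000, 5000, 10000):
        ratio = speedup(TABLE, n_seq, base_nodes=4, nodes=8)
        print(f"{n_seq} sequences: TOT(4 nodes) / TOT(8 nodes) = {ratio:.2f}")
```

The speedup here is computed from the reported TOT column only, so it is unaffected by which individual cell in a flagged row is off, but it inherits any error in TOT itself.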