
Table 2. Clustering results for the six subsets of the COG database. For each of the six randomly generated subsets of COG protein sequences (rows) and each clustering algorithm tested (columns), Nbr is the number of clusters obtained and Time is the execution time in seconds. The execution times reported for TRIBE-MCL [8] and gSPC [9] (columns MCL+ClustalW and SPC+ClustalW) include the execution time of ClustalW [38], which was used to compute the similarity matrix.

From: CLUSS: Clustering of protein sequences based on a new similarity measure

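| Protein subset | CLUSS Nbr | CLUSS Time (s) | BLAST Nbr | BLAST Time (s) | MCL+ClustalW Nbr | MCL+ClustalW Time (s) | SPC+ClustalW Nbr | SPC+ClustalW Time (s) |
|---|---|---|---|---|---|---|---|---|
| SS1 (469 proteins) | 30 | 106 | 114 | 14 | 1 | 495 | 9 | 499 |
| SS2 (743 proteins) | 15 | 234 | 102 | 58 | 1 | 1272 | 33 | 1275 |
| SS3 (455 proteins) | 30 | 114 | 132 | 18 | 8 | 586 | 27 | 588 |
| SS4 (409 proteins) | 19 | 82 | 125 | 11 | 1 | 452 | 4 | 454 |
| SS5 (564 proteins) | 35 | 103 | 172 | 15 | 6 | 538 | 30 | 540 |
| SS6 (6444 proteins) | 225 | 4272 | 732 | 583 | 1 | 95895 | 77 | 98880 |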