Table 4 Profile-profile comparison F-measures for clustered sequences

From: Evaluation and improvements of clustering algorithms for detecting remote homologous protein families

|  | Dataset | TransClust ( T ) | HiFix ( s,c ) | MCL ( I ) | SCPS ( c ) |
|---|---|---|---|---|---|
| Family | A-10 | 0.741 (15) | 0.652 (0.10,0.7) | 0.693 (26) | - |
| Family | A-20 | 0.749 (15) | 0.685 (0.15,0.7) | 0.703 (24) | - |
| Family | A-30 | 0.750 (15) | 0.695 (0.15,0.7) | 0.707 (24) | - |
| Family | A-50 | 0.751 (20) | 0.702 (0.20,0.7) | 0.709 (18) | - |
| Family | A-70 | 0.753 (20) | 0.713 (0.20,0.6) | 0.712 (17) | - |
| Family | A-90 | 0.767 (15) | 0.717 (0.20,0.6) | 0.715 (17) | - |
| Family | A-95 | 0.769 (15) | 0.725 (0.20,0.7) | 0.743 (17) | - |
| Family | GOLD | 0.959 (50) | 0.921 (0.30,0.6) | 0.925 (15) | - |
| Super-family | A-10 | 0.722 (1) | 0.699 (0.10,0.6) | 0.752 (59) | 0.750 (648) |
| Super-family | A-20 | 0.783 (5) | 0.701 (0.10,0.7) | 0.754 (59) | 0.759 (753) |
| Super-family | A-30 | 0.809 (5) | 0.705 (0.10,0.7) | 0.778 (59) | 0.777 (955) |
| Super-family | A-50 | 0.827 (5) | 0.710 (0.15,0.7) | 0.781 (58) | 0.789 (1188) |
| Super-family | A-70 | 0.833 (5) | 0.711 (0.15,0.7) | 0.788 (60) | 0.792 (1279) |
| Super-family | A-90 | 0.835 (5) | 0.715 (0.15,0.7) | 0.805 (59) | 0.805 (1345) |
| Super-family | A-95 | 0.837 (5) | 0.716 (0.15,0.7) | 0.807 (60) | 0.805 (1401) |
| Super-family | GOLD | 0.999 (1) | 0.974 (0.05,0.5) | 1.000 (60) | 1.000 (6) |

  1. The optimized parameter set determined for each clustering algorithm is shown in parentheses; see Section ‘Parameter optimization’. Best values are shown in bold.
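
Bold formatting does not survive in a plain-text rendering of the table, but the best value in each row can be recomputed from the numbers. The following is a minimal Python sketch, assuming the Super-family F-measures are transcribed exactly as listed in Table 4; the names `ALGORITHMS`, `SUPERFAMILY_F`, and `best_algorithms` are illustrative and do not come from the paper.

```python
# F-measures transcribed from the Super-family rows of Table 4.
# Column order follows the table: TransClust, HiFix, MCL, SCPS.
# All names below are illustrative, not taken from the paper.
ALGORITHMS = ("TransClust", "HiFix", "MCL", "SCPS")

SUPERFAMILY_F = {
    "A-10": (0.722, 0.699, 0.752, 0.750),
    "A-20": (0.783, 0.701, 0.754, 0.759),
    "A-30": (0.809, 0.705, 0.778, 0.777),
    "A-50": (0.827, 0.710, 0.781, 0.789),
    "A-70": (0.833, 0.711, 0.788, 0.792),
    "A-90": (0.835, 0.715, 0.805, 0.805),
    "A-95": (0.837, 0.716, 0.807, 0.805),
    "GOLD": (0.999, 0.974, 1.000, 1.000),
}


def best_algorithms(scores):
    """Return the names of all algorithms tied for the highest F-measure."""
    top = max(scores)
    return [name for name, value in zip(ALGORITHMS, scores) if value == top]


if __name__ == "__main__":
    # Print the top-scoring algorithm(s) for each dataset, keeping ties.
    for dataset, scores in SUPERFAMILY_F.items():
        print(dataset, ", ".join(best_algorithms(scores)))
```

Ties are reported together rather than broken arbitrarily, since the table gives identical F-measures for some algorithm pairs.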