Table 1 Sequence-sequence comparison F-measure for clustered sequences

From: Evaluation and improvements of clustering algorithms for detecting remote homologous protein families

         TransClust                               HiFix                                    MCL                                      SCPS
Dataset  F-measure  Clusters  Precision  Recall   F-measure  Clusters  Precision  Recall   F-measure  Clusters  Precision  Recall   F-measure  Clusters  Precision  Recall
Family
A-10     0.494      1757      0.834      0.409    0.467      2780      0.463      0.692    0.352      2310      0.923      0.389    -
A-20     0.573      2013      0.885      0.494    0.491      3270      0.556      0.732    0.398      4125      0.999      0.278    -
A-30     0.675      2561      0.912      0.628    0.583      3749      0.561      0.885    0.415      1827      0.351      0.773    -
A-50     0.721      3221      0.903      0.709    0.608      4861      0.562      0.945    0.457      1912      0.702      0.445    -
A-70     0.739      3486      0.904      0.733    0.630      4921      0.616      0.873    0.474      2323      0.752      0.482    -
A-90     0.758      3630      0.913      0.753    0.653      4973      0.625      0.895    0.511      2824      0.815      0.512    -
A-95     0.766      3715      0.916      0.765    0.654      4992      0.629      0.907    0.527      2873      0.527      0.813    -
GOLD     0.914      96        0.905      0.968    0.902      99        0.960      0.895    0.880      56        0.808      0.942    -
Super-family
A-10     0.377      1757      0.917      0.281    0.337      2780      0.993      0.274    0.270      3270      0.997      0.180    0.297      658       0.387      0.221
A-20     0.450      2013      0.954      0.347    0.362      3270      0.993      0.293    0.282      4024      0.999      0.191    0.352      701       0.400      0.323
A-30     0.551      2561      0.551      0.440    0.473      3749      0.994      0.414    0.333      3745      0.998      0.235    0.473      792       0.494      0.364
A-50     0.609      3221      0.995      0.499    0.507      4861      0.992      0.457    0.351      3048      0.847      0.310    0.557      753       0.618      0.546
A-70     0.631      3486      0.997      0.519    0.539      4921      0.990      0.495    0.377      2086      0.875      0.335    0.581      493       0.649      0.518
A-90     0.654      3630      0.996      0.544    0.560      4973      0.989      0.528    0.426      2549      0.922      0.364    0.607      633       0.680      0.531
A-95     0.659      3715      0.996      0.552    0.563      4986      0.990      0.542    0.435      2616      0.912      0.378    0.615      940       0.686      0.542
GOLD     0.865      23        1          0.765    0.915      13        0.998      0.852    0.827      24        1          0.712    0.904      4         0.864      0.983
The number of clusters found and the weighted mean precision and recall values are shown for each clustering algorithm. Best values are shown in bold.
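
The table does not say how the weighted means are computed, so the following is only a sketch of one common convention, not the paper's procedure. It assumes each cluster is scored against its most frequent reference family and weighted by cluster size. The function name `cluster_scores` and the input format are hypothetical. It also shows why a reported F-measure need not equal the harmonic mean of the reported mean precision and recall: here F is averaged per cluster.

```python
from collections import Counter


def cluster_scores(clusters, truth):
    """Return size-weighted mean (precision, recall, f_measure).

    clusters: list of lists of sequence ids (hypothetical input format).
    truth: dict mapping each sequence id to its reference family label.
    Assumption: each cluster is compared with its most common family,
    and each cluster's weight is its share of all clustered sequences.
    """
    family_sizes = Counter(truth.values())
    total = sum(len(members) for members in clusters)
    p_sum = r_sum = f_sum = 0.0
    for members in clusters:
        if not members:
            continue  # an empty cluster contributes nothing
        label, hits = Counter(truth[m] for m in members).most_common(1)[0]
        precision = hits / len(members)       # purity of the cluster
        recall = hits / family_sizes[label]   # share of the family recovered
        f_measure = 2 * precision * recall / (precision + recall)
        weight = len(members) / total
        p_sum += weight * precision
        r_sum += weight * recall
        f_sum += weight * f_measure
    return p_sum, r_sum, f_sum


# Toy data: two reference families split over three clusters.
truth = {"a": "F1", "b": "F1", "c": "F1", "d": "F2", "e": "F2"}
clusters = [["a", "b", "d"], ["c"], ["e"]]
precision, recall, f_measure = cluster_scores(clusters, truth)
```

Under this convention the per-cluster F-measures are averaged, so the result generally differs from `2 * precision * recall / (precision + recall)` computed on the two weighted means.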