Table 1 CK-36-PDB

From: Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment

CK-36-PDB UCD-UPGMA UCD-NJ NCD-UPGMA NCD-NJ CD-UPGMA CD-NJ
Gzip 0.7665 0.7454 0.8196 0.7603 0.7360 0.7000
Bzip2 0.7872 0.7069 0.7656 0.7130 0.7452 0.6685
PPMd16 0.9605 0.8072 0.9605 0.9024 0.7850 0.7403
PPMd8 0.9605 0.8072 0.9605 0.9024 0.9030 0.7820
PPMd4 0.9605 0.8146 0.9605 0.9024 0.9030 0.7820
PPMd2 0.9351 0.8072 0.9420 0.7603 0.8881 0.7450
Huffman 0.8004 0.7224 0.8004 0.7224 0.7541 0.7233
Ac fast 0.8274 0.7419 0.8274 0.7419 0.7541 0.7362
Rc fast 0.8216 0.7308 0.8004 0.7308 0.7708 0.7691
Ac med. 0.8004 0.7276 0.8274 0.7276 0.7611 0.8111
Rc med. 0.8274 0.7276 0.8447 0.7234 0.7708 0.7223
Ac slow 0.8447 0.7331 0.8274 0.7331 0.7708 0.7223
Rc slow 0.8274 0.7331 0.8447 0.7276 0.7708 0.7747
BwtRleHuff 0.8666 0.7789 0.8778 0.7789 0.7950 0.7609
BwtMtfRleHuff 0.7850 0.7577 0.7950 0.7773 0.7625 0.7424
BwtRleAc fast 0.7944 0.7677 0.7944 0.7677 0.7850 0.7218
BwtMtfRleAc fast 0.8045 0.8046 0.8320 0.7577 0.7452 0.7019
BwtRleRc fast 0.8778 0.7677 0.8778 0.7677 0.7804 0.7505
BwtMtfRleRc fast 0.8309 0.8046 0.8309 0.8046 0.7619 0.7172
BwtRleRc med. 0.8778 0.7789 0.8778 0.7789 0.7950 0.7655
BwtMtfRleRc med. 0.8347 0.8046 0.8135 0.7577 0.7619 0.6970
BwtRleRc slow 0.8666 0.7677 0.8666 0.7603 0.7850 0.7933
BwtMtfRleRc slow 0.8420 0.7503 0.8235 0.7503 0.7619 0.6970
BwtWavelet 0.8497 0.9186 0.8497 0.8281 0.7486 0.6970
Experimental results for the CK-36-PDB data set with the UCD (left), NCD (middle), and CD (right) distances. For each compression algorithm, we report the F-measure for both the UPGMA and NJ methods. The F-measure ranges from 0 (highest dissimilarity) to 1 (identical classifications).
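
As a rough illustration of how one of the distances in the table can be computed, the sketch below implements the standard normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length in bytes. It is a minimal sketch under stated assumptions, not the authors' experimental code: the function names `compressed_size` and `ncd` are hypothetical, and Python's `zlib` and `bz2` modules merely stand in for the Gzip and Bzip2 rows. UCD and CD are left out because this table does not state their definitions.

```python
import bz2
import zlib


def compressed_size(data: bytes, compressor: str = "zlib") -> int:
    """Return the compressed length of `data` in bytes.

    Assumption: zlib and bz2 at their highest compression level stand in
    for the Gzip and Bzip2 compressors named in the table.
    """
    if compressor == "zlib":
        return len(zlib.compress(data, 9))
    if compressor == "bz2":
        return len(bz2.compress(data, 9))
    raise ValueError(f"unknown compressor: {compressor!r}")


def ncd(x: bytes, y: bytes, compressor: str = "zlib") -> float:
    """Normalized compression distance between two byte strings."""
    cx = compressed_size(x, compressor)
    cy = compressed_size(y, compressor)
    cxy = compressed_size(x + y, compressor)  # compress the concatenation
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Applying `ncd` to every pair of sequences would give a distance matrix of the kind that clustering methods such as UPGMA or NJ take as input. Values near 0 indicate that one string adds little new information given the other, and values near 1 indicate that the strings share almost nothing a compressor can exploit.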