Table 1 CK-36-PDB

From: Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment

| CK-36-PDB | UCD UPGMA | UCD NJ | NCD UPGMA | NCD NJ | CD UPGMA | CD NJ |
|---|---|---|---|---|---|---|
| Gzip | 0.7665 | 0.7454 | 0.8196 | 0.7603 | 0.7360 | 0.7000 |
| Bzip2 | 0.7872 | 0.7069 | 0.7656 | 0.7130 | 0.7452 | 0.6685 |
| PPMd16 | 0.9605 | 0.8072 | 0.9605 | 0.9024 | 0.7850 | 0.7403 |
| PPMd8 | 0.9605 | 0.8072 | 0.9605 | 0.9024 | 0.9030 | 0.7820 |
| PPMd4 | 0.9605 | 0.8146 | 0.9605 | 0.9024 | 0.9030 | 0.7820 |
| PPMd2 | 0.9351 | 0.8072 | 0.9420 | 0.7603 | 0.8881 | 0.7450 |
| Huffman | 0.8004 | 0.7224 | 0.8004 | 0.7224 | 0.7541 | 0.7233 |
| Ac fast | 0.8274 | 0.7419 | 0.8274 | 0.7419 | 0.7541 | 0.7362 |
| Rc fast | 0.8216 | 0.7308 | 0.8004 | 0.7308 | 0.7708 | 0.7691 |
| Ac med. | 0.8004 | 0.7276 | 0.8274 | 0.7276 | 0.7611 | 0.8111 |
| Rc med. | 0.8274 | 0.7276 | 0.8447 | 0.7234 | 0.7708 | 0.7223 |
| Ac slow | 0.8447 | 0.7331 | 0.8274 | 0.7331 | 0.7708 | 0.7223 |
| Rc slow | 0.8274 | 0.7331 | 0.8447 | 0.7276 | 0.7708 | 0.7747 |
| BwtRleHuff | 0.8666 | 0.7789 | 0.8778 | 0.7789 | 0.7950 | 0.7609 |
| BwtMtfRleHuff | 0.7850 | 0.7577 | 0.7950 | 0.7773 | 0.7625 | 0.7424 |
| BwtRleAc fast | 0.7944 | 0.7677 | 0.7944 | 0.7677 | 0.7850 | 0.7218 |
| BwtMtfRleAc fast | 0.8045 | 0.8046 | 0.8320 | 0.7577 | 0.7452 | 0.7019 |
| BwtRleRc fast | 0.8778 | 0.7677 | 0.8778 | 0.7677 | 0.7804 | 0.7505 |
| BwtMtfRleRc fast | 0.8309 | 0.8046 | 0.8309 | 0.8046 | 0.7619 | 0.7172 |
| BwtRleRc med. | 0.8778 | 0.7789 | 0.8778 | 0.7789 | 0.7950 | 0.7655 |
| BwtMtfRleRc med. | 0.8347 | 0.8046 | 0.8135 | 0.7577 | 0.7619 | 0.6970 |
| BwtRleRc slow | 0.8666 | 0.7677 | 0.8666 | 0.7603 | 0.7850 | 0.7933 |
| BwtMtfRleRc slow | 0.8420 | 0.7503 | 0.8235 | 0.7503 | 0.7619 | 0.6970 |
| BwtWavelet | 0.8497 | 0.9186 | 0.8497 | 0.8281 | 0.7486 | 0.6970 |

Experimental results for the CK-36-PDB data set, with the UCD (left), NCD (middle), and CD (right) distances. For each compression algorithm, we report the F-measure for both the UPGMA and NJ methods. The F-measure ranges from 0 (highest dissimilarity) to 1 (identical classifications).
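
For readers who want to see how compression-based distances of this kind can be computed, here is a minimal Python sketch. It assumes the commonly cited compressor approximations of the three distances; the formulas in the comments are that assumption and are not quoted from this table. The helper names (`clen`, `ucd`, `ncd`, `cd`) and the sample inputs are hypothetical. Python's `zlib` and `bz2` modules stand in for the Gzip and Bzip2 rows, and they are not necessarily the implementations or settings used in the experiments.

```python
import bz2
import random
import zlib


def clen(data: bytes, compress=zlib.compress) -> int:
    """Length in bytes of the compressed form of `data`."""
    return len(compress(data))


def ucd(x: bytes, y: bytes, compress=zlib.compress) -> float:
    # Assumed form: max{C(xy) - C(x), C(yx) - C(y)} / max{C(x), C(y)}
    cx, cy = clen(x, compress), clen(y, compress)
    cxy, cyx = clen(x + y, compress), clen(y + x, compress)
    return max(cxy - cx, cyx - cy) / max(cx, cy)


def ncd(x: bytes, y: bytes, compress=zlib.compress) -> float:
    # Assumed form: (min{C(xy), C(yx)} - min{C(x), C(y)}) / max{C(x), C(y)}
    cx, cy = clen(x, compress), clen(y, compress)
    cxy = min(clen(x + y, compress), clen(y + x, compress))
    return (cxy - min(cx, cy)) / max(cx, cy)


def cd(x: bytes, y: bytes, compress=zlib.compress) -> float:
    # Assumed form: min{C(xy), C(yx)} / (C(x) + C(y))
    cx, cy = clen(x, compress), clen(y, compress)
    cxy = min(clen(x + y, compress), clen(y + x, compress))
    return cxy / (cx + cy)


# Hypothetical inputs: a highly repetitive sequence over the 20 amino-acid
# letters, and a block of seeded pseudo-random bytes unrelated to it.
similar = b"ACDEFGHIKLMNPQRSTVWY" * 50
rng = random.Random(0)
unrelated = bytes(rng.randrange(256) for _ in range(1000))

if __name__ == "__main__":
    for name, dist in (("UCD", ucd), ("NCD", ncd), ("CD", cd)):
        print(name,
              round(dist(similar, similar), 4),
              round(dist(similar, unrelated), 4),
              round(dist(similar, unrelated, bz2.compress), 4))
```

A pairwise matrix of such distances is what a tree-building method such as UPGMA or NJ would take as input; scoring the resulting tree with the F-measure against a reference classification is a separate step that this sketch does not cover.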