Table 1 Experimental results on the remote homology detection task for all competing methods using the triple(1,3) kernel.

From: Efficient use of unlabeled data for protein sequence classification: a comparative study

| dataset | ROC (neighborhood, no clustering) | ROC50 (neighborhood, no clustering) | p-value (neighborhood, no clustering) | ROC (clustered neighborhood) | ROC50 (clustered neighborhood) | p-value (clustered neighborhood) |
|---|---|---|---|---|---|---|
| PDB | | | | | | |
| full sequence | .9476 | .7582 | - | .9515 | .7633 | - |
| region | .9708 | .8265 | .0069 | .9716 | .8246 | .0045 |
| no tails (full seq.) | .9443 | .7522 | .5401 | .9472 | .7559 | .5324 |
| max length (full seq.) | .9471 | .7497 | .4407 | .9536 | .7584 | .5468 |
| Swiss-Prot | | | | | | |
| full sequence | .9245 | .6908 | - | .9464 | .7474 | - |
| region | .9752 | .8556 | 2.46e-04 | .9732 | .8605 | 1.5e-03 |
| no tails (full seq.) | .9361 | .6938 | .8621 | .9395 | .7160 | .6259 |
| max length (full seq.) | .9300 | .6514 | .2589 | .9348 | .6817 | .1369 |
| NR | | | | | | |
| full sequence | .9419 | .7328 | - | .9556 | .7566 | - |
| region | .9824 | .8861 | 1.08e-05 | .9861 | .8944 | 2.2e-05 |
| no tails (full seq.) | .9575 | .7438 | .6640 | .9602 | .7486 | .8507 |
| max length (full seq.) | .9513 | .7401 | .8656 | .9528 | .7595 | .8696 |

* p-value: signed-rank test on ROC50 scores against full sequence in the corresponding setting
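
For readers unfamiliar with the ROC50 columns, the following is a minimal sketch of how such a score is commonly computed. It assumes the usual ROC-n definition (area under the ROC curve up to the n-th false positive, normalized to [0, 1]). The function name `roc_n`, the toy scores, and the tie handling are illustrative assumptions and are not taken from the paper.

```python
def roc_n(scores, labels, n=50):
    """Normalized area under the ROC curve up to the n-th false positive.

    scores: classifier outputs, higher means "more likely positive".
    labels: truthy for positives, falsy for negatives.
    Assumption: tied scores keep their input order (no special tie handling).
    """
    total_pos = sum(1 for y in labels if y)
    total_neg = len(labels) - total_pos
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # If fewer than n negatives exist, fall back to the full ROC area.
    n = min(n, total_neg)

    # Rank examples by decreasing score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_pos = false_pos = area = 0
    for _, is_positive in ranked:
        if is_positive:
            true_pos += 1
        else:
            false_pos += 1
            # Each false positive adds the number of positives ranked above it.
            area += true_pos
            if false_pos == n:
                break

    return area / (n * total_pos)


# Toy ranking: positive, negative, positive, negative.
toy_scores = [0.9, 0.8, 0.7, 0.6]
toy_labels = [1, 0, 1, 0]
print(roc_n(toy_scores, toy_labels, n=1))   # 0.5
print(roc_n(toy_scores, toy_labels, n=50))  # 0.75 (n is capped at 2 negatives)
```

Under this definition, one ROC50 value is obtained per classification problem. The p-values in the table then compare the paired per-problem ROC50 scores of a given sequence variant against those of the full-sequence baseline.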