
Table 2 Experimental results for all competing methods on the remote homology detection task using the mismatch(5,1) kernel.

From: Efficient use of unlabeled data for protein sequence classification: a comparative study

| dataset | ROC (neighborhood, no clustering) | ROC50 (neighborhood, no clustering) | p-value (neighborhood, no clustering) | ROC (clustered neighborhood) | ROC50 (clustered neighborhood) | p-value (clustered neighborhood) |
|---|---|---|---|---|---|---|
| PDB | | | | | | |
| full sequence | .9389 | .7203 | - | .9414 | .7230 | - |
| region | .9698 | .8048 | .0075 | .9705 | .8038 | .0020 |
| no tails (full seq.) | .9379 | .7287 | .9390 | .9378 | .7301 | .7605 |
| max length (full seq.) | .9457 | .7359 | .4725 | .9526 | .7491 | .3817 |
| Swiss-Prot | | | | | | |
| full sequence | .9253 | .6685 | - | .9378 | .7258 | - |
| region | .9757 | .8280 | .0060 | .9773 | .8414 | .0108 |
| no tails (full seq.) | .9290 | .6750 | .9813 | .9344 | .6874 | .5600 |
| max length (full seq.) | .9185 | .6094 | .1436 | .9223 | .6201 | .0279 |
| NR | | | | | | |
| full sequence | .9475 | .7233 | - | .9544 | .7510 | - |
| region | .9837 | .8824 | 1.7e-04 | .9874 | .8885 | 1.2e-04 |
| no tails (full seq.) | .9554 | .7083 | .7930 | .9584 | .7211 | .7501 |
| max length (full seq.) | .9508 | .7421 | .7578 | .9518 | .7613 | .9387 |

* p-value: signed-rank test on ROC50 scores against full sequence in the corresponding setting.
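
As a rough illustration of how a p-value like those in the table could be obtained, the sketch below runs a paired signed-rank test on two lists of per-family ROC50 scores. It assumes the footnote refers to the Wilcoxon signed-rank test and uses a two-sided normal approximation without tie correction; the source does not state which variant was used. The function name `signed_rank_p_value` and the example scores are hypothetical and are not taken from the study.

```python
import math


def signed_rank_p_value(scores_a, scores_b):
    """Two-sided Wilcoxon signed-rank p-value for paired scores.

    Assumption: normal approximation, no tie or continuity correction.
    """
    # Paired differences; zero differences carry no sign and are dropped.
    diffs = [a - b for a, b in zip(scores_a, scores_b) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0

    # Rank absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # Test statistic: sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Hypothetical per-family ROC50 scores, made up for illustration only.
    full_sequence = [0.61, 0.72, 0.55, 0.80, 0.68, 0.74, 0.59, 0.77]
    region = [0.70, 0.81, 0.66, 0.84, 0.79, 0.80, 0.71, 0.83]
    print(signed_rank_p_value(region, full_sequence))
```

With only a handful of families an exact test would be more appropriate than the normal approximation; the approximation is used here only to keep the sketch dependency-free.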