
Table 2 Area Under the Curve (AUC) of ROC analyses for the FASTA, STRUCTAL, and SeqKernel distances for comparing protein sequences from CATH2833

From: String kernels for protein sequence comparisons: improved fold recognition

 

| CATH fold ID | N (a) | FASTA (b) | STRUCTAL (c) | SeqKernel(0.2,10) (d) | SeqKernel(0.0001,2) (e) |
|---|---|---|---|---|---|
| 1.10.10 | 381 | 0.58 | 0.85 | 0.31 | 0.62 |
| 2.60.40 | 555 | 0.58 | 0.97 | 0.38 | 0.65 |
| 3.20.20 | 251 | 0.53 | 0.99 | 0.88 | 0.85 |
| 3.30.70 | 368 | 0.53 | 0.92 | 0.33 | 0.62 |
| 3.40.50 | 1278 | 0.53 | 0.92 | 0.77 | 0.77 |
| All five folds | 2833 | 0.54 | 0.93 | 0.69 | 0.76 |

  1. (a) Number of proteins in the fold
  2. (b) AUC based on FASTA E-value
  3. (c) AUC based on STRUCTAL SAS score
  4. (d) and (e) AUC based on SeqKernel distance, with (β, k_max) = (0.2, 10) for (d) and (β, k_max) = (0.0001, 2) for (e)
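
The AUC values in this table measure how well a distance separates related from unrelated proteins. As a minimal sketch of that idea, and not the authors' actual pipeline, the Python function below computes an AUC from a list of pairwise distances using the rank-based (Mann-Whitney) definition. It assumes that each protein pair carries a label saying whether both proteins belong to the same CATH fold, and that a smaller distance means greater similarity. The function name `auc_from_distances` and the toy values are hypothetical.

```python
from typing import Sequence


def auc_from_distances(distances: Sequence[float], same_fold: Sequence[bool]) -> float:
    """Return the AUC for a distance used as a same-fold classifier.

    Assumption: a smaller distance means "more likely the same fold".
    The AUC is the fraction of (same-fold, different-fold) pair combinations
    in which the same-fold pair has the smaller distance; ties count as 0.5.
    """
    if len(distances) != len(same_fold):
        raise ValueError("distances and same_fold must have the same length")

    positives = [d for d, s in zip(distances, same_fold) if s]
    negatives = [d for d, s in zip(distances, same_fold) if not s]
    if not positives or not negatives:
        raise ValueError("need at least one same-fold and one different-fold pair")

    score = 0.0
    for p in positives:
        for n in negatives:
            if p < n:
                score += 1.0   # same-fold pair ranked as closer: correct ordering
            elif p == n:
                score += 0.5   # tie: counted as half
    return score / (len(positives) * len(negatives))


# Toy example with made-up distances (not data from the table).
toy_distances = [0.10, 0.40, 0.35, 0.80]
toy_same_fold = [True, True, False, False]
print(auc_from_distances(toy_distances, toy_same_fold))  # 0.75
```

Under this definition, an AUC of 0.5 corresponds to a distance that orders pairs no better than chance, and an AUC below 0.5 means same-fold pairs tend to receive larger distances than different-fold pairs. The double loop is quadratic in the number of pairs, so a sort-based ranking would be preferable for a data set of realistic size.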