
Table 1 Problem-specific Datasets.

From: svmPRAT: SVM-based Protein Residue Annotation Toolkit

| Problem | Source | Type | #C | #Seq | #Res | #CV | % |
|---|---|---|---|---|---|---|---|
| Disorder Prediction | DisPro [7] | Binary | 2 | 723 | 215612 | 10 | 30 |
| Protein-DNA Site | DISIS [6] | Binary | 2 | 693 | 127240 | 3 | 20 |
| Residue-wise Contact | SVM [15] | Regression | ∞ | 680 | 120421 | 15 | 40 |
| Local Structure | Profnet [35] | Multiclass | 16 | 1600 | 286238 | 3 | 40 |

  1. #C, #Seq, #Res, #CV, and % denote the number of classes, sequences, residues, and cross-validation folds, and the maximum pairwise sequence identity (in percent) between the sequences, respectively. ∞ in the #C column indicates a regression problem.