Table 1. Dynamic-KNN coverage in the partial model for different distance thresholds and voting weight schemes on the validation data set of the cross-validation. # of seqs: total number of proteins in the set. Distance: distance threshold used in Dynamic-KNN. # of preds: number of proteins that received a prediction. %: the corresponding proportion of the set

From: GODoc: high-throughput protein function prediction using novel k-nearest-neighbor and voting algorithms

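| Type | Dataset | # of seqs | Distance | Inverse: # of preds | Inverse: % | FunOverlap: # of preds | FunOverlap: % |
|------|-------------|-------|----|------|-------|------|-------|
| BPO | CAFA2-Swiss | 8146 | Q1 | 2046 | 25.12 | 1882 | 23.10 |
| BPO | CAFA2-Swiss | 8146 | Q2 | 4095 | 50.27 | 3654 | 44.86 |
| BPO | CAFA2-Swiss | 8146 | Q3 | 6112 | 75.03 | 5167 | 63.43 |
| BPO | CAFA3-Swiss | 10163 | Q1 | 2562 | 25.21 | 2309 | 22.72 |
| BPO | CAFA3-Swiss | 10163 | Q2 | 5095 | 50.13 | 4470 | 43.98 |
| BPO | CAFA3-Swiss | 10163 | Q3 | 7601 | 74.79 | 6333 | 62.32 |
| CCO | CAFA2-Swiss | 8114 | Q1 | 2039 | 25.13 | 1855 | 22.86 |
| CCO | CAFA2-Swiss | 8114 | Q2 | 4034 | 49.72 | 3540 | 43.63 |
| CCO | CAFA2-Swiss | 8114 | Q3 | 6042 | 74.46 | 4898 | 60.36 |
| CCO | CAFA3-Swiss | 9866 | Q1 | 2458 | 24.91 | 2204 | 22.34 |
| CCO | CAFA3-Swiss | 9866 | Q2 | 4922 | 49.89 | 4261 | 43.19 |
| CCO | CAFA3-Swiss | 9866 | Q3 | 7357 | 74.57 | 5912 | 59.92 |
| MFO | CAFA2-Swiss | 5211 | Q1 | 1291 | 24.77 | 1204 | 23.11 |
| MFO | CAFA2-Swiss | 5211 | Q2 | 2593 | 49.76 | 2366 | 45.40 |
| MFO | CAFA2-Swiss | 5211 | Q3 | 3902 | 74.88 | 3405 | 65.35 |
| MFO | CAFA3-Swiss | 7017 | Q1 | 1756 | 25.02 | 1630 | 23.23 |
| MFO | CAFA3-Swiss | 7017 | Q2 | 3518 | 50.14 | 3185 | 45.39 |
| MFO | CAFA3-Swiss | 7017 | Q3 | 5278 | 75.22 | 4573 | 65.17 |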
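
The % columns appear to be the number of predicted proteins divided by the total number of proteins in the set. A minimal Python sketch of that arithmetic follows, assuming this reading of the caption. The function name `coverage_percent` and the rounding to two decimals are illustrative assumptions and are not taken from the GODoc paper or its code. The counts are copied from the BPO, CAFA2-Swiss, Inverse cells of the table.

```python
def coverage_percent(n_predicted: int, n_sequences: int) -> float:
    """Share of proteins that received a prediction, in percent.

    Hypothetical helper for illustration; rounds to two decimals
    to match the precision shown in the table.
    """
    if n_sequences <= 0:
        raise ValueError("n_sequences must be positive")
    return round(100.0 * n_predicted / n_sequences, 2)


# BPO, CAFA2-Swiss, Inverse voting weight scheme (counts from the table).
total = 8146
inverse_preds = {"Q1": 2046, "Q2": 4095, "Q3": 6112}

coverage = {q: coverage_percent(n, total) for q, n in inverse_preds.items()}
print(coverage)
```

Recomputing the percentages this way is also a quick consistency check on a transcribed table: a cell whose count and percentage disagree points to a copying error in one of the two.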