Table 4 Average number of proteins in the training data during cross-validation (80% of the total) for the redundant and non-redundant datasets in each ontology

From: GODoc: high-throughput protein function prediction using novel k-nearest-neighbor and voting algorithms

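| Ontology | Dataset | # of redundant proteins | # of non-redundant proteins | Reduction ratio (%) |
|---|---|---|---|---|
| BPO | CAFA2-Swiss | 32,582 | 22,231 | 31.77 |
| BPO | CAFA3-Swiss | 40,650 | 27,158 | 33.19 |
| CCO | CAFA2-Swiss | 32,457 | 22,521 | 30.61 |
| CCO | CAFA3-Swiss | 39,462 | 26,631 | 32.51 |
| MFO | CAFA2-Swiss | 20,845 | 14,711 | 29.43 |
| MFO | CAFA3-Swiss | 28,267 | 19,254 | 31.89 |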
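
The table does not state how the reduction ratio is computed. A minimal sketch, assuming it is the share of proteins removed when moving from the redundant to the non-redundant set, (redundant - non-redundant) / redundant × 100, rounded to two decimals. The function name `reduction_ratio` and the `rows` dictionary are illustrative and do not come from the paper.

```python
def reduction_ratio(redundant: int, non_redundant: int) -> float:
    """Percentage of proteins removed by redundancy filtering (assumed formula)."""
    return round((redundant - non_redundant) / redundant * 100, 2)


# Counts copied from Table 4: (redundant, non-redundant) per ontology and dataset.
rows = {
    ("BPO", "CAFA2-Swiss"): (32582, 22231),
    ("BPO", "CAFA3-Swiss"): (40650, 27158),
    ("CCO", "CAFA2-Swiss"): (32457, 22521),
    ("CCO", "CAFA3-Swiss"): (39462, 26631),
    ("MFO", "CAFA2-Swiss"): (20845, 14711),
    ("MFO", "CAFA3-Swiss"): (28267, 19254),
}

# Recompute the ratio for every row under the assumed formula.
ratios = {key: reduction_ratio(*counts) for key, counts in rows.items()}

for (ontology, dataset), ratio in ratios.items():
    print(f"{ontology} {dataset}: {ratio:.2f}%")
```

Under this assumption the recomputed values match the reported column for all six rows, which suggests the formula is the one used, although the source does not confirm it.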