Table 4 Average number of proteins in the cross-validation training data (80% of the total) for the redundant and non-redundant datasets in each ontology

From: GODoc: high-throughput protein function prediction using novel k-nearest-neighbor and voting algorithms

Type  Dataset      # of redundant  # of non-redundant  Reduction ratio (%)
BPO   CAFA2-Swiss  32,582          22,231              31.77
BPO   CAFA3-Swiss  40,650          27,158              33.19
CCO   CAFA2-Swiss  32,457          22,521              30.61
CCO   CAFA3-Swiss  39,462          26,631              32.51
MFO   CAFA2-Swiss  20,845          14,711              29.43
MFO   CAFA3-Swiss  28,267          19,254              31.89
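
The table does not state how the reduction ratio is computed. A minimal sketch, assuming it is the fraction of proteins removed relative to the redundant set, (redundant - non-redundant) / redundant x 100, reproduces the listed values to two decimals. The names `TABLE4` and `reduction_ratio` are illustrative and do not come from the paper.

```python
# Counts copied from the table above:
# (ontology, dataset) -> (# of redundant, # of non-redundant)
TABLE4 = {
    ("BPO", "CAFA2-Swiss"): (32582, 22231),
    ("BPO", "CAFA3-Swiss"): (40650, 27158),
    ("CCO", "CAFA2-Swiss"): (32457, 22521),
    ("CCO", "CAFA3-Swiss"): (39462, 26631),
    ("MFO", "CAFA2-Swiss"): (20845, 14711),
    ("MFO", "CAFA3-Swiss"): (28267, 19254),
}


def reduction_ratio(redundant: int, non_redundant: int) -> float:
    """Percentage of proteins removed when redundancy is filtered out.

    Assumed formula (not stated in the source):
    100 * (redundant - non_redundant) / redundant
    """
    return 100.0 * (redundant - non_redundant) / redundant


# Round to two decimals to match the precision used in the table.
ratios = {
    key: round(reduction_ratio(*counts), 2) for key, counts in TABLE4.items()
}

if __name__ == "__main__":
    for (ontology, dataset), ratio in ratios.items():
        print(f"{ontology} {dataset}: {ratio:.2f}%")
```

Under this assumption, every row gives a reduction of roughly 29% to 33%, with the CAFA3-Swiss datasets reduced slightly more than the CAFA2-Swiss datasets in each ontology.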