
Table 1 Dataset statistics

From: Predicting protein functions using incomplete hierarchical labels

| Dataset | # Proteins | # FunCat labels | # GO labels | Avg ± Std (FunCat) | Avg ± Std (GO) |
|---|---|---|---|---|---|
| CollinsPPI | 1620 | 176 (13320) | 168 (22023) | 8.22 ± 5.60 | 13.59 ± 8.28 |
| KroganPPI | 2670 | 228 (20384) | 241 (32639) | 7.63 ± 5.81 | 12.22 ± 8.83 |
| ScPPI | 5700 | 305 (36909) | 372 (61048) | 6.48 ± 5.71 | 10.71 ± 8.83 |

  1. ‘# Proteins’ is the number of proteins in a dataset. ‘# FunCat labels’ is the number of distinct FunCat labels annotated to these proteins; the number in brackets is the total number of FunCat labels over all these proteins. ‘# GO labels’ is the number of distinct GO labels annotated to these proteins; the number in brackets is the total number of GO labels over all these proteins. ‘Avg ± Std (FunCat)’ and ‘Avg ± Std (GO)’ give the average number of FunCat and GO labels per protein in a dataset, together with the standard deviation.
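
As a consistency check on the definitions above, the average number of labels per protein should equal the bracketed total divided by the number of proteins. The following minimal sketch recomputes those averages from the counts in Table 1. The names `TABLE_1`, `average_labels_per_protein`, and `table_averages` are illustrative and do not come from the paper. The standard deviations cannot be recomputed this way, because they depend on per-protein label counts that the table does not list.

```python
# Counts copied from Table 1:
# dataset -> (number of proteins, total FunCat labels, total GO labels)
TABLE_1 = {
    "CollinsPPI": (1620, 13320, 22023),
    "KroganPPI": (2670, 20384, 32639),
    "ScPPI": (5700, 36909, 61048),
}


def average_labels_per_protein(total_labels: int, num_proteins: int) -> float:
    """Average label count per protein: total labels divided by proteins."""
    return total_labels / num_proteins


def table_averages(table=TABLE_1):
    """Return {dataset: (avg FunCat, avg GO)}, rounded to two decimals."""
    return {
        name: (
            round(average_labels_per_protein(funcat_total, n), 2),
            round(average_labels_per_protein(go_total, n), 2),
        )
        for name, (n, funcat_total, go_total) in table.items()
    }


if __name__ == "__main__":
    for name, (funcat_avg, go_avg) in table_averages().items():
        print(f"{name}: FunCat avg = {funcat_avg}, GO avg = {go_avg}")
```

Rounded to two decimals, the recomputed values agree with the ‘Avg’ entries reported in the table for all three datasets.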