
Table 6 Prediction performance on biological process classes, over the dataset of textless proteins.

From: Protein Function Prediction using Text-based Features extracted from the Biomedical Literature: The CAFA Challenge

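| Function | # Test Proteins | Text-KNN (Textless) P | Text-KNN (Textless) R | Text-KNN (Textless) F | Text-KNN (Cross-validation) P | Text-KNN (Cross-validation) R | Text-KNN (Cross-validation) F |
|---|---|---|---|---|---|---|---|
| GO:0065007 | 19 | 0.28 | 0.47 | 0.35 | 0.23 | 0.52 | 0.31 |
| GO:0032502 | 18 | 0.19 | 0.22 | 0.21 | 0.22 | 0.19 | 0.20 |
| GO:0009987 | 8 | 0.04 | 0.13 | 0.06 | 0.24 | 0.29 | 0.26 |
| GO:0050896 | 20 | 0.38 | 0.30 | 0.33 | 0.25 | 0.16 | 0.19 |
| GO:0008152 | 7 | 0.29 | 0.29 | 0.29 | 0.23 | 0.14 | 0.17 |
| GO:0051234 | 9 | 0.33 | 0.33 | 0.33 | 0.32 | 0.20 | 0.25 |
| GO:0016043 | 6 | 0.00 | 0.00 | 0.00 | 0.13 | 0.05 | 0.07 |
| GO:0023052 | 3 | 0.00 | 0.00 | 0.00 | 0.18 | 0.11 | 0.14 |
| GO:0032501 | 9 | 0.00 | 0.00 | 0.00 | 0.12 | 0.02 | 0.04 |
| GO:0022414 | 7 | 0.00 | 0.00 | 0.00 | 0.51 | 0.15 | 0.24 |
| GO:0051704 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| GO:0040011 | 3 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| GO:0002376 | 1 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |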

  1. Prediction performance of Text-KNN on proteins that have no associated text is shown in the Text-KNN (Textless) column. As a point of reference, the average cross-validation results obtained over the whole cross-validation dataset are shown in the Text-KNN (Cross-validation) column. The columns P, R, and F refer, respectively, to the Precision, Recall, and F-measure of the classifier over individual GO categories. Precision and recall values of 0 for a class indicate that all the proteins belonging to that class were misclassified into another class.
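
The relationship between the P, R, and F columns can be sketched in a few lines of Python. This is a minimal illustration, assuming that F is the balanced F-measure (the harmonic mean of precision and recall), which the table itself does not state. The function name `f_measure` is hypothetical, and defining F as 0 when both inputs are 0 is a convention chosen here to match the rows of zeros. Because the published P and R values are rounded to two decimals, recomputed F values may differ from the table in the last digit.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (assumed definition): harmonic mean of P and R.

    Returns 0.0 when both inputs are 0, a convention chosen for this
    sketch to avoid division by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# GO:0065007, Text-KNN (Textless): P = 0.28, R = 0.47
print(round(f_measure(0.28, 0.47), 2))

# A class where every protein was misclassified: P = R = 0
print(f_measure(0.0, 0.0))
```

For the first row, the recomputed value rounds to the 0.35 reported in the table.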