
Table 5 Prediction performance on molecular function classes, over the dataset of textless proteins.

From: Protein Function Prediction using Text-based Features extracted from the Biomedical Literature: The CAFA Challenge

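| Function | # Textless Proteins | Text-KNN (Textless) P | Text-KNN (Textless) R | Text-KNN (Textless) F | Text-KNN (Cross-validation) P | Text-KNN (Cross-validation) R | Text-KNN (Cross-validation) F |
|---|---|---|---|---|---|---|---|
| GO:0005488 | 58 | 0.82 | 0.47 | 0.59 | 0.65 | 0.88 | 0.75 |
| GO:0003824 | 9 | 0.29 | 0.56 | 0.38 | 0.52 | 0.23 | 0.32 |
| GO:0030528 | 1 | 0.04 | 1.00 | 0.08 | 0.44 | 0.24 | 0.31 |
| GO:0005215 | 5 | 0.50 | 0.20 | 0.29 | 0.59 | 0.38 | 0.46 |
| GO:0060089 | 7 | 0.44 | 0.57 | 0.50 | 0.39 | 0.16 | 0.22 |
| GO:0005198 | 2 | 0.00 | 0.00 | 0.00 | 0.04 | 0.01 | 0.01 |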

  1. Prediction performance of Text-KNN on proteins that have no associated text is shown in the Text-KNN (Textless) columns. As a point of reference, the average results obtained over the whole cross-validation dataset are shown in the Text-KNN (Cross-validation) columns. The columns P, R, and F refer, respectively, to the Precision, Recall, and F-measure of the classifier over individual GO categories. Precision and recall values of 0 on a class indicate that all the proteins belonging to that class are misclassified into another class.
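
The relationship between the P, R, and F columns can be illustrated with a minimal sketch. It assumes that the F-measure is the balanced F1 score, that is, the harmonic mean of precision and recall; the table does not state which variant was used. The function name `f_measure` is hypothetical and chosen for illustration only. Setting F to 0 when both precision and recall are 0 is a common convention, also assumed here.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the table reports F1. When both inputs are 0 the harmonic
    mean is undefined, so 0.0 is returned by convention.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs taken from the Text-KNN (Textless) columns.
for go_term, p, r in [
    ("GO:0003824", 0.29, 0.56),
    ("GO:0030528", 0.04, 1.00),
    ("GO:0005198", 0.00, 0.00),
]:
    print(go_term, round(f_measure(p, r), 2))
```

Because the published precision and recall values are themselves rounded to two decimals, an F value recomputed this way may differ from the tabulated one in the last digit.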