Table 5 Prediction performance on molecular function classes, over the dataset of textless proteins.

From: Protein Function Prediction using Text-based Features extracted from the Biomedical Literature: The CAFA Challenge

                            Text-KNN (Textless)     Text-KNN (Cross-validation)
Function     # Textless     P      R      F         P      R      F
             Proteins
GO:0005488   58             0.82   0.47   0.59      0.65   0.88   0.75
GO:0003824   9              0.29   0.56   0.38      0.52   0.23   0.32
GO:0030528   1              0.04   1.00   0.08      0.44   0.24   0.31
GO:0005215   5              0.50   0.20   0.29      0.59   0.38   0.46
GO:0060089   7              0.44   0.57   0.50      0.39   0.16   0.22
GO:0005198   2              0.00   0.00   0.00      0.04   0.01   0.01
  1. The Text-KNN (Textless) columns show the prediction performance of Text-KNN on proteins that have no associated text. As a point of reference, the Text-KNN (Cross-validation) columns show the average results obtained over the whole cross-validation dataset. The columns P, R, and F refer, respectively, to the Precision, Recall, and F-measure of the classifier over individual GO categories. Precision and recall values of 0 for a class indicate that all the proteins belonging to that class were misclassified into another class.
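
The relation between the P, R, and F columns can be sketched as follows. This is a minimal illustration that assumes F is the balanced F-measure (the harmonic mean of precision and recall); the table itself does not state which variant was used, and the function name f_measure is hypothetical. The (P, R) pairs are copied from the Text-KNN (Textless) columns.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall.

    Assumption: the table's F column uses this balanced (F1) form.
    Returns 0.0 when both inputs are 0 to avoid division by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (P, R) pairs copied from the Text-KNN (Textless) columns of Table 5.
textless = {
    "GO:0003824": (0.29, 0.56),
    "GO:0005215": (0.50, 0.20),
    "GO:0005198": (0.00, 0.00),
}

for go_id, (p, r) in textless.items():
    print(go_id, round(f_measure(p, r), 2))
```

Because the published P and R values are already rounded to two decimals, a recomputed F can differ from the tabulated one in the last digit. For GO:0005488, for example, P = 0.82 and R = 0.47 give about 0.60 against the tabulated 0.59.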