Figure 2 | BMC Bioinformatics

Figure 2

From: Building a protein name dictionary from full text: a machine learning term extraction approach

Relation between the precision of the disambiguation and score values. Ten score values were chosen at random. For each value, we took the 50 terms whose scores were immediately above it and evaluated whether each term referred to a gene or gene product in the article where the prediction was made. Precision was calculated as the number of correct predictions divided by the total number of predictions (50). The 500 names checked represent a random sample of the 21,501 names in the catalog. The figure shows that predictions made with higher scores are more likely to be correct. The label at each evaluation point gives the percentage of terms in the evaluation corpus with a score above the score of that point.
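
The evaluation described in the caption can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the function name `precision_at_thresholds`, the `(term, score)` input layout, and the `is_correct` lookup standing in for the manual check are all assumptions. The `percent_above` value is computed over the same list of scored terms, which only approximates the label described in the caption.

```python
def precision_at_thresholds(scored_terms, is_correct, thresholds, window=50):
    """Estimate precision in a window of terms scoring just above each threshold.

    scored_terms: list of (term, score) pairs (hypothetical input layout).
    is_correct:   dict mapping term -> bool, standing in for manual curation.
    thresholds:   score values at which to evaluate.
    window:       number of terms to check above each threshold (50 in the caption).
    """
    # Sort ascending so the first terms above a threshold are the closest to it.
    ranked = sorted(scored_terms, key=lambda pair: pair[1])
    results = []
    for threshold in thresholds:
        above = [(term, score) for term, score in ranked if score > threshold]
        sample = above[:window]  # terms with scores immediately above the threshold
        if not sample:
            continue  # no term scores above this threshold
        correct = sum(1 for term, _ in sample if is_correct[term])
        results.append({
            "threshold": threshold,
            # Correct predictions over total predictions checked.
            "precision": correct / len(sample),
            # Share of all scored terms lying above this threshold.
            "percent_above": 100.0 * len(above) / len(ranked),
        })
    return results
```

With ten randomly chosen thresholds and `window=50`, this yields the 500 checked names mentioned in the caption, provided the windows do not overlap.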
