Figure 2 | BMC Bioinformatics

Figure 2

From: A linear classifier based on entity recognition tools and a statistical approach to method extraction in the protein-protein interaction literature

Comparison of entity count features for ABNER protein mentions in abstracts in training set D (top), and CHEBI compound names in full-text documents in training data DPMC (bottom). The horizontal axis represents the number of mentions x, and the vertical axis the probability of documents with at least x mentions. The green lines denote the probabilities for documents labeled relevant, $p_P(n \geq x)$, while the red lines denote the probabilities for documents labeled irrelevant, $p_N(n \geq x)$; the blue lines denote the absolute difference between the green and red lines, $|p_P - p_N|$.
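The three curves described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the mention counts below are hypothetical, and reading off the x with the largest gap is only one possible use of the blue curve, which the caption itself does not state.

```python
from typing import List, Sequence, Tuple


def tail_probability(counts: Sequence[int], x: int) -> float:
    """Fraction of documents whose mention count n satisfies n >= x."""
    if not counts:
        raise ValueError("counts must be non-empty")
    return sum(1 for n in counts if n >= x) / len(counts)


def separation_curve(
    relevant: Sequence[int], irrelevant: Sequence[int], max_x: int
) -> List[Tuple[int, float, float, float]]:
    """Return rows (x, p_P, p_N, |p_P - p_N|) for x = 0..max_x."""
    rows = []
    for x in range(max_x + 1):
        p_pos = tail_probability(relevant, x)    # green line
        p_neg = tail_probability(irrelevant, x)  # red line
        rows.append((x, p_pos, p_neg, abs(p_pos - p_neg)))  # blue line
    return rows


# Hypothetical per-document mention counts (made up for illustration).
relevant_counts = [0, 2, 3, 5, 8]
irrelevant_counts = [0, 0, 1, 1, 4]

curve = separation_curve(relevant_counts, irrelevant_counts, max_x=8)

# Illustrative only: the x at which the two classes differ most.
best_x = max(curve, key=lambda row: row[3])[0]

for x, p_pos, p_neg, gap in curve:
    print(f"x={x}  p_P={p_pos:.2f}  p_N={p_neg:.2f}  |diff|={gap:.2f}")
```

With these made-up counts, both curves start at 1 for x = 0 (every document has at least zero mentions) and decrease as x grows, which matches the shape the caption describes.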
