Figure 3 | BMC Bioinformatics

Figure 3

From: A linear classifier based on entity recognition tools and a statistical approach to method extraction in the protein-protein interaction literature


Comparison of entity count features for NLProt protein and OSCAR compound mentions in abstracts in training set D (top), and ABNER protein mentions in figure captions and PSI-MI method mentions in full-text documents in training data DPMC (bottom). The horizontal axis represents the number of mentions $x$, and the vertical axis the probability that a document has at least $x$ mentions. The green line denotes this probability for documents labeled relevant, $p_P(n \geq x)$, while the red line denotes it for documents labeled irrelevant, $p_N(n \geq x)$; the blue line denotes the absolute difference between the green and red lines, $|p_P - p_N|$.
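
The three curves described in the caption can be computed from per-document mention counts. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names `tail_probability` and `tail_curves` are hypothetical, and the counts in the usage example are invented purely for illustration.

```python
from typing import List, Sequence, Tuple


def tail_probability(counts: Sequence[int], x: int) -> float:
    """Fraction of documents whose mention count n satisfies n >= x."""
    if not counts:
        raise ValueError("counts must be non-empty")
    return sum(1 for n in counts if n >= x) / len(counts)


def tail_curves(
    relevant: Sequence[int], irrelevant: Sequence[int], max_x: int
) -> List[Tuple[int, float, float, float]]:
    """Return one row (x, p_P, p_N, |p_P - p_N|) for each x in 0..max_x."""
    rows = []
    for x in range(max_x + 1):
        p_pos = tail_probability(relevant, x)    # green line in the figure
        p_neg = tail_probability(irrelevant, x)  # red line in the figure
        rows.append((x, p_pos, p_neg, abs(p_pos - p_neg)))  # blue line
    return rows


if __name__ == "__main__":
    # Invented mention counts per document (assumption, not data from the paper).
    relevant_counts = [0, 2, 3, 5, 8]
    irrelevant_counts = [0, 0, 1, 1, 4]
    for x, p_pos, p_neg, gap in tail_curves(relevant_counts, irrelevant_counts, 5):
        print(f"x={x}  p_P={p_pos:.2f}  p_N={p_neg:.2f}  |p_P-p_N|={gap:.2f}")
```

A large gap at some value of x suggests that thresholding the mention count near that value separates relevant from irrelevant documents comparatively well, which is one way to read the blue line in each panel.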
