Fig. 1 | BMC Bioinformatics

Fig. 1

From: Parallel sequence tagging for concept recognition

Occurrence counts (y-axis, log scale) of the most frequent bio entities in a large subset of PubMed, ordered by rank (x-axis). The documents were annotated automatically with a dictionary-based tagger (OGER), and high-frequency false positives were removed manually. The plot shows that a small number of frequent entities accounts for the majority of mentions, resembling a Zipfian distribution (see also [51, p. 569]).
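
The rank-frequency view described in the caption can be illustrated with a minimal sketch. The mention list below is toy data invented for illustration, not OGER output, and the helper names `rank_frequency` and `cumulative_share` are hypothetical; the sketch only shows how counts are ordered by rank and how the share of mentions covered by the top-ranked entities can be computed.

```python
from collections import Counter


def rank_frequency(mentions):
    """Return (rank, entity, count) tuples, most frequent entity first."""
    counts = Counter(mentions)
    # Sort by descending count; break ties alphabetically for determinism.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, entity, count)
            for rank, (entity, count) in enumerate(ordered, start=1)]


def cumulative_share(ranked, top_k):
    """Fraction of all mentions covered by the top_k ranked entities."""
    total = sum(count for _, _, count in ranked)
    top = sum(count for _, _, count in ranked[:top_k])
    return top / total if total else 0.0


# Toy mention list (assumption: made-up data, not taken from PubMed).
mentions = (["protein"] * 8 + ["cell"] * 4 + ["gene"] * 2
            + ["p53"] + ["BRCA1"])

ranked = rank_frequency(mentions)
share_top2 = cumulative_share(ranked, 2)

for rank, entity, count in ranked:
    print(rank, entity, count)
print("share of mentions covered by top 2 entities:", share_top2)
```

Plotting `count` against `rank` with a logarithmic y-axis would give a curve of the kind the figure shows; in this toy example, two of the five entities already cover three quarters of the mentions.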