Fig. 1
From: Parallel sequence tagging for concept recognition

Occurrence counts (y axis, log scale) of the most frequent bio entities in a large subset of PubMed, ordered by their rank (x axis). The documents were automatically annotated by a dictionary-based tagger (OGER). High-frequency false positives were manually removed. The plot shows that a small number of frequent entities account for a majority of the occurring mentions, resembling a Zipfian distribution (see also [51, p. 569]).
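
The rank-frequency view described in the caption can be illustrated with a minimal sketch. It assumes the tagger output has already been reduced to a flat list of entity identifiers, one per mention. The function names `rank_frequency` and `top_k_share` and the toy identifiers are hypothetical, chosen for illustration only; they are not part of OGER or of the article's code, and the toy counts are not the article's data.

```python
from collections import Counter


def rank_frequency(mentions):
    """Return (rank, entity, count) tuples, most frequent entity first.

    `mentions` is assumed to be an iterable of entity identifiers,
    one item per annotated mention (hypothetical input format).
    """
    counts = Counter(mentions)
    # Sort by descending count; break ties by identifier for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, entity, count)
            for rank, (entity, count) in enumerate(ranked, start=1)]


def top_k_share(mentions, k):
    """Fraction of all mentions covered by the k most frequent entities."""
    table = rank_frequency(mentions)
    total = sum(count for _, _, count in table)
    if total == 0:
        return 0.0
    return sum(count for _, _, count in table[:k]) / total


# Toy data invented for illustration; not taken from PubMed or OGER output.
toy_mentions = ["E1"] * 8 + ["E2"] * 4 + ["E3"] * 2 + ["E4", "E5"]

for rank, entity, count in rank_frequency(toy_mentions):
    print(rank, entity, count)

print(top_k_share(toy_mentions, 2))
```

Plotting the counts against rank with a logarithmic y axis would give the kind of curve the caption describes, and `top_k_share` quantifies how much of the total a few top-ranked entities cover.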