Table 1 Overview of the extent of overlapping gene sets in some commonly used gene set databases

From: Avoiding the pitfalls of gene set enrichment analysis with SetRank

| Database | # sets | Median size | % overlap | Min. | p25 | Median | p75 | Max. |
|---|---|---|---|---|---|---|---|---|
| BIOCYC | 596 | 3 | 3.4% | 1.45% | 11.61% | 20.00% | 38.46% | 100.00% |
| GOBP | 14524 | 6 | 18.2% | 0.01% | 0.31% | 0.76% | 1.75% | 100.00% |
| GOCC | 1751 | 7 | 15.8% | 0.01% | 0.16% | 0.49% | 1.45% | 100.00% |
| GOMF | 4388 | 3 | 6.1% | 0.01% | 0.17% | 0.50% | 1.53% | 100.00% |
| KEGG | 956 | 29 | 13.1% | 0.07% | 1.61% | 4.12% | 8.98% | 100.00% |
| Pathway Interaction Database | 186 | 32.5 | 50.5% | 0.52% | 1.75% | 3.45% | 6.38% | 63.64% |
| REACTOME | 1784 | 19 | 11.8% | 0.04% | 1.00% | 2.44% | 6.29% | 100.00% |
| WikiPathways | 239 | 32 | 26.5% | 0.24% | 1.31% | 2.61% | 5.34% | 100.00% |
  1. Only gene sets with three or more genes were considered. The % overlap column indicates the percentage of gene set pairs that share at least one gene. The columns Min., p25, Median, p75 and Max. list the minimum, 25th percentile, median, 75th percentile and maximum Jaccard values across all pairs of intersecting gene sets.
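
The statistics described in the footnote can be illustrated with a short Python sketch. This is not the authors' code: the function name `overlap_summary`, the `gene_sets` dictionary layout and the toy gene identifiers are all hypothetical, and the percentile method is an assumption, since the table does not say which one was used. The Jaccard value of two sets A and B is taken as |A ∩ B| / |A ∪ B|.

```python
from itertools import combinations
from statistics import median, quantiles


def overlap_summary(gene_sets, min_size=3):
    """Summarise pairwise overlap for a dict of set name -> iterable of gene IDs.

    Hypothetical helper for illustration; not part of SetRank.
    """
    # Keep only sets with at least `min_size` genes, as stated in the footnote.
    kept = [set(genes) for genes in gene_sets.values() if len(set(genes)) >= min_size]

    n_pairs = 0
    jaccards = []  # Jaccard values of intersecting pairs only
    for a, b in combinations(kept, 2):
        n_pairs += 1
        shared = len(a & b)
        if shared:
            jaccards.append(shared / len(a | b))

    summary = {
        "n_sets": len(kept),
        # Percentage of all pairs that share at least one gene.
        "pct_overlap": 100.0 * len(jaccards) / n_pairs if n_pairs else 0.0,
    }
    if jaccards:
        if len(jaccards) >= 2:
            # Assumption: inclusive quartiles; the source does not specify a method.
            p25, _, p75 = quantiles(jaccards, n=4, method="inclusive")
        else:
            p25 = p75 = jaccards[0]
        summary.update(
            min=min(jaccards), p25=p25, median=median(jaccards),
            p75=p75, max=max(jaccards),
        )
    return summary


# Toy data (made up): set D has only two genes and is therefore dropped.
toy_sets = {
    "A": ["g1", "g2", "g3"],
    "B": ["g3", "g4", "g5"],
    "C": ["g6", "g7", "g8"],
    "D": ["g1", "g2"],
}
result = overlap_summary(toy_sets)
print(result)
```

In the toy data only the pair (A, B) intersects, so one of the three possible pairs overlaps. Comparing all pairs is quadratic in the number of sets, which is acceptable for a sketch but would be slow for a database the size of GOBP without indexing genes to sets first.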