Table 1 Overview of the extent of overlapping gene sets in some commonly used gene set databases

From: Avoiding the pitfalls of gene set enrichment analysis with SetRank

| DB | # sets | median size | % overlap | Min. | p25 | Median | p75 | Max. |
|---|---|---|---|---|---|---|---|---|
| BIOCYC | 596 | 3 | 3.4% | 1.45% | 11.61% | 20.00% | 38.46% | 100.00% |
| GOBP | 14524 | 6 | 18.2% | 0.01% | 0.31% | 0.76% | 1.75% | 100.00% |
| GOCC | 1751 | 7 | 15.8% | 0.01% | 0.16% | 0.49% | 1.45% | 100.00% |
| GOMF | 4388 | 3 | 6.1% | 0.01% | 0.17% | 0.50% | 1.53% | 100.00% |
| KEGG | 956 | 29 | 13.1% | 0.07% | 1.61% | 4.12% | 8.98% | 100.00% |
| Pathway Interaction Database | 186 | 32.5 | 50.5% | 0.52% | 1.75% | 3.45% | 6.38% | 63.64% |
| REACTOME | 1784 | 19 | 11.8% | 0.04% | 1.00% | 2.44% | 6.29% | 100.00% |
| WikiPathways | 239 | 32 | 26.5% | 0.24% | 1.31% | 2.61% | 5.34% | 100.00% |

  1. Only gene sets with three or more genes were considered. The % overlap column indicates the percentage of gene set pairs sharing at least one gene. The columns Min., p25, Median, p75 and Max. list the minimum, 25th percentile, median, 75th percentile and maximum Jaccard values over all pairs of intersecting gene sets.
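
The quantities described in the note can be illustrated with a short Python sketch. This is not the authors' code: the function name `overlap_summary`, the toy gene sets, and the choice of the "inclusive" interpolation rule for the percentiles are assumptions made for illustration, since the source does not say how the percentiles were computed. The sketch drops sets with fewer than three genes, counts the share of set pairs with at least one common gene, and summarises the Jaccard index (size of the intersection divided by size of the union) over the intersecting pairs only.

```python
from itertools import combinations
from statistics import quantiles


def overlap_summary(gene_sets, min_size=3):
    """Summarise pairwise overlap in a gene set collection (hypothetical helper).

    gene_sets: dict mapping a set name to an iterable of gene identifiers.
    """
    # Keep only gene sets with at least `min_size` genes, as stated in the note.
    kept = [set(genes) for genes in gene_sets.values() if len(set(genes)) >= min_size]

    n_pairs = 0
    jaccards = []  # Jaccard values of intersecting pairs only
    for a, b in combinations(kept, 2):
        n_pairs += 1
        shared = len(a & b)
        if shared:
            jaccards.append(shared / len(a | b))

    summary = {
        "n_sets": len(kept),
        # Percentage of all pairs that share at least one gene.
        "pct_overlap": 100.0 * len(jaccards) / n_pairs if n_pairs else 0.0,
    }
    if len(jaccards) >= 2:
        # Assumption: "inclusive" quartiles; the paper may use another rule.
        p25, median, p75 = quantiles(jaccards, n=4, method="inclusive")
        summary.update(
            min=min(jaccards), p25=p25, median=median, p75=p75, max=max(jaccards)
        )
    return summary


# Toy data, invented for illustration only.
toy = {
    "A": {"g1", "g2", "g3"},
    "B": {"g3", "g4", "g5"},
    "C": {"g6", "g7", "g8"},
    "D": {"g1", "g2", "g3", "g9"},
    "E": {"g1", "g2"},  # fewer than three genes, so it is ignored
}
print(overlap_summary(toy))
```

In the toy data, three of the six pairs among the four retained sets share a gene, so the overlap is 50%. The nested pair A and D has the highest Jaccard value (3 shared genes out of 4 in the union). A maximum of 100% in the table therefore means that at least two distinct sets in that database contain exactly the same genes.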