Table 1 Drug2Gene data source statistics

From: Drug2Gene: an exhaustive resource to explore effectively the drug-target relation network

| Source database | Genes/proteins | Drugs/compounds | Relations/interactions | Unique relations in DB |
|---|---|---|---|---|
| CGDCP [17] | 6071 | 3115 | 169154 | 154532 (3.534%) |
| ChEMBL [1] | 5115 | 746582 | 2830526 | 2519174 (57.617%) |
| CTD [18] | 27314 | 11569 | 89094 | 70944 (1.623%) |
| DrugBank [5] | 3726 | 7825 | 17321 | 7338 (0.168%) |
| IUPHAR [19] | 114 | 1455 | 651 | 348 (0.008%) |
| MICAD [20] | 249 | 68 | 70 | 55 (0.001%) |
| PDSP_Ki [21] | 605 | 5256 | 22790 | 11505 (0.263%) |
| PharmGKB [22] | 22677 | 3630 | 78317 | 73064 (1.671%) |
| TTD [23] | 1518 | 2418 | 2599 | 1077 (0.025%) |
| Uniprot [24] | 86605 | 3693 | 351189 | 342495 (7.833%) |
| Ligand Expo [25] | - | 7516 | 32511 | 26249 (0.600%) |
| HGNC [26] | 24726 | - | - | - |
| PDBsum [27] | 23745 | - | - | - |
| ChEBI [28] | - | 5015 | - | - |
| NCBI PubChem Compound [29] | - | 746546 | - | - |
| NCBI PubChem Substance [29] | - | 767740 | - | - |
| PubChem Bioassay [30] | - | - | 1124637 | 831359 (19.014%) |
| Unified relations shared among two or more DBs | - | - | - | 334150 (7.642%) |
| Total counts from all source databases | 202465 | 2312423 | 4945372 | 4372290 (100.00%) |
| NCBI Gene/Entrez Gene [15] | Used for Unification/Data Integration | | | |
| Homology inferred relations from NCBI HomoloGene [15] | - | - | 226513 | - |
| Total relations in Drug2Gene including homology-inferred relations | - | - | 5171885 | 4598803 |

  1. Drug2Gene is built from three entity types: genes, compounds, and the relations between them. The entities are extracted from 19 public source databases. Entity counts are listed by source database and by entity type (genes, compounds, or relations). Most compound-target (relational) databases provide all three entity types in their flat files. Some source databases are purely gene-centered (HGNC, PDBsum, and NCBI Gene) or purely compound-centered (ChEBI, NCBI PubChem Compound, and NCBI PubChem Substance) and provide only gene or only compound entries. Their entities are linked by relational sources such as Ligand Expo (links to proteins in PDBsum) and PubChem Bioassay (relations between genes in NCBI Gene and compounds/substances in NCBI PubChem Compound/Substance). Only entries that participate in a relation are counted in this table. For counts regardless of participation in relations, see Additional file 1: Table S2. Dataset: November 2013. The last column shows, for each source database, the number of relations unique to it after data integration, with the percentage of the total number of integrated relations in parentheses.
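
The percentages in the last column can be recomputed from the counts in the table. The following is a minimal sketch, not part of the Drug2Gene pipeline: the dictionary and the helper `percent_of_total` are hypothetical names introduced here for illustration, and the numbers are copied from the last column of Table 1. It assumes each percentage is the unique count divided by the total number of integrated relations (4372290), rounded to three decimals.

```python
# Unique relations per source after integration, copied from the last
# column of Table 1. The names below are illustrative, not from Drug2Gene.
unique_relations = {
    "CGDCP": 154532,
    "ChEMBL": 2519174,
    "CTD": 70944,
    "DrugBank": 7338,
    "IUPHAR": 348,
    "MICAD": 55,
    "PDSP_Ki": 11505,
    "PharmGKB": 73064,
    "TTD": 1077,
    "Uniprot": 342495,
    "Ligand Expo": 26249,
    "PubChem Bioassay": 831359,
    "Shared among two or more DBs": 334150,
}

TOTAL_INTEGRATED = 4372290   # "Total counts from all source databases", last column
HOMOLOGY_INFERRED = 226513   # relations inferred via NCBI HomoloGene


def percent_of_total(count, total=TOTAL_INTEGRATED):
    """Return count as a percentage of total, rounded to three decimals
    (assumed rounding, chosen to match the precision shown in the table)."""
    return round(100.0 * count / total, 3)


# Share of the integrated relations contributed by each row.
shares = {name: percent_of_total(n) for name, n in unique_relations.items()}

# Grand total including homology-inferred relations (last table row).
total_with_homology = TOTAL_INTEGRATED + HOMOLOGY_INFERRED

if __name__ == "__main__":
    for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {unique_relations[name]} ({share}%)")
    print("Sum of rows:", sum(unique_relations.values()))
    print("Total incl. homology:", total_with_homology)
```

Summing the per-source unique counts together with the shared-relations row reproduces the stated total of integrated relations, and adding the homology-inferred relations gives the final figure in the last row of the table.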