Table 1 RTX-KG2 integrates 70 knowledge sources into a single graph. Each row represents a server site from which sources were downloaded.

From: RTX-KG2: a system for building a semantically standardized knowledge graph for translational biomedicine

| Name | # | Description | Format | Method |
|---|---|---|---|---|
| Biolink [49, 50] | 1 | Biolink model (semantic layer) | TTL | RBM |
| ChEMBL [14, 69] | 1 | EMBL chemogenomic database | SQL | D2J |
| DGIdb [70] | 1 | Drug gene interaction database | TSV | D2J |
| DisGeNET [71] | 1 | Disease-gene associations | TSV | D2J |
| DrugBank [13] | 1 | Pharmaceutical knowledge base | XML | D2J |
| DrugCentral [72] | 1 | Online drug compendium | SQL | D2J |
| Ensembl Gene [73] | 1 | Ensembl human gene annotations | JSON | D2J |
| EFO [74] | 1 | Experimental Factor ontology | OWL | RBM |
| GO [75, 76] | 1 | Gene ontology annotations | TSV | D2J |
| HMDB [77,78,79,80] | 1 | Human metabolite database | XML | D2J |
| IntAct [81, 82] | 1 | IntAct molecular interaction database | TSV | D2J |
| Jensen Lab Diseases [83] | 1 | Gene to diseases dataset | TSV | D2J |
| KEGG [11, 84, 85] | 1 | Kyoto encyclopedia of genes and genomes | API | D2J |
| miRBase [86,87,88,89,90] | 1 | MicroRNAs dataset | DAT | D2J |
| NCBI Gene [91] | 1 | NCBI human gene annotations | TSV | D2J |
| OBO Foundry | 21 | OBO foundry ontologies (Additional file 1: Table S1) | OWL | RBM |
| Orphanet [92] | 1 | Orphanet rare disease ontology | OWL | RBM |
| PathBank [93,94,95] | 1 | Wishart lab pathway databases | XML | D2J |
| Reactome [96] | 1 | Pathway database | SQL | D2J |
| SemMedDB [26] | 1 | Semantic MEDLINE database | SQL | D2J |
| SMPDB [16, 17] | 1 | Small molecule pathway database | CSV | D2J |
| UMLS [97] | 26 | Unified medical language system (Table 7) | TTL | RBM |
| UniChem [98] | 1 | EBI small molecule cross-refs | TSV | D2J |
| UniProtKB [15] | 1 | UniProt knowledge base | DAT | D2J |
| Total | 70 | | | |

   
Columns are as follows: Name, the short name(s) of the knowledge sources obtained or the distribution name in the cases of UMLS and OBO Foundry; #, the number of individual sources or ontologies obtained from that server; Format, the file format used for ingestion (see below); Method, the ingestion method used for the source, either D2J for direct-to-JSON or RBM for the RDF-based method. File format codes: CSV, comma-separated value; DAT, SWISS-PROT-like DAT format; JSON, JavaScript object notation; OWL, OWL in RDF/XML [67] syntax; RRF, UMLS Rich Release Format [68]; SQL, structured query language (SQL) dump; TSV, tab-separated value; XML, extensible markup language. Other abbreviations: NCBI, National Center for Biotechnology Information; EMBL, European Molecular Biology Laboratory.