Table 1 Data used to construct the base dataset, transfer learning dataset, and few-shot learning dataset

From: OptNCMiner: a deep learning approach for the discovery of natural compounds modulating disease-specific multi-targets

| Dataset | Target gene | Target protein | Data source | Active compounds | Inactive compounds |
| --- | --- | --- | --- | --- | --- |
| Base dataset (actives > 5000) | ADORA2A | Adenosine receptor A2a | ExCAPE-DB | 5077 | 591 |
| | BRCA1 | Breast cancer type 1 susceptibility protein | ExCAPE-DB | 8619 | 43,095 |
| | CNR1 | Cannabinoid receptor 1 | ExCAPE-DB | 5125 | 397 |
| | DRD2 | D(2) dopamine receptor | ExCAPE-DB | 8037 | 40,185 |
| | HTR1A | 5-hydroxytryptamine receptor 1A | ExCAPE-DB | 6339 | 31,695 |
| | KCNH2 | Potassium voltage-gated channel subfamily H member 2 | ExCAPE-DB | 5327 | 26,635 |
| | LMNA | Prelamin-A/C | ExCAPE-DB | 14,533 | 72,665 |
| | OPRM1 | Mu-type opioid receptor | ExCAPE-DB | 5665 | 2872 |
| | SLC6A4 | Sodium-dependent serotonin transporter | ExCAPE-DB | 6912 | 370 |
| | TARDBP | TAR DNA-binding protein 43 | ExCAPE-DB | 12,193 | 60,965 |
| | TDP1 | Tyrosyl-DNA phosphodiesterase 1 | ExCAPE-DB | 23,129 | 115,645 |
| Transfer learning dataset (1000 > actives > 500) | ADRA2A | Alpha-2A adrenergic receptor | ExCAPE-DB | 816 | 39 |
| | GRIN1 | Glutamate receptor ionotropic | ExCAPE-DB | 553 | 92 |
| | HTR3A | 5-hydroxytryptamine receptor 3A | ExCAPE-DB | 565 | 65 |
| | MINK1 | Misshapen-like kinase 1 | ExCAPE-DB | 929 | 8 |
| | PKM2 | Pyruvate kinase PKM | ExCAPE-DB | 546 | 2730 |
| | POLK | DNA polymerase kappa | LIT-PCBA | 772 | 3860 |
| | VDR | Vitamin D3 receptor | LIT-PCBA | 884 | 4420 |
| Few-shot learning dataset (100 > actives) | ADRB2 | Beta 2 adrenergic receptor | LIT-PCBA | 17 | 170 |
| | ESR | Estrogen receptor alpha | LIT-PCBA | 13 | 130 |
| | IDH1 | Isocitrate dehydrogenase | LIT-PCBA | 39 | 390 |
| | MTOR | Mammalian target of rapamycin complex 1 | LIT-PCBA | 97 | 970 |
| | OPRK1 | Kappa opioid receptor | LIT-PCBA | 24 | 5460 |
| | PPARG | Peroxisome proliferator-activated receptor gamma | LIT-PCBA | 27 | 270 |
| | TP53 | Cellular tumor antigen p53 | LIT-PCBA | 79 | 790 |
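
The three dataset groups in Table 1 are defined by thresholds on the number of active compounds per target. The following is a minimal sketch of that grouping, not the authors' code: the function name `dataset_tier` and the `rows` list are hypothetical, the thresholds are the ones stated in the table, and the sample rows are copied from it. Counts that fall outside the stated ranges are returned as `None`, since the table does not say how such targets were handled.

```python
from typing import Optional


def dataset_tier(n_actives: int) -> Optional[str]:
    """Map an active-compound count to the dataset group named in Table 1.

    Hypothetical helper: the thresholds come from the table, but counts
    outside the stated ranges (e.g. 100 to 500) are not covered there,
    so None is returned for them.
    """
    if n_actives > 5000:
        return "base"
    if 500 < n_actives < 1000:
        return "transfer"
    if n_actives < 100:
        return "few-shot"
    return None


# A few (gene, actives, inactives) rows copied from Table 1.
rows = [
    ("ADORA2A", 5077, 591),
    ("MINK1", 929, 8),
    ("POLK", 772, 3860),
    ("ADRB2", 17, 170),
]

for gene, actives, inactives in rows:
    # The inactive/active ratio shows how class balance varies across targets.
    ratio = inactives / actives
    print(f"{gene}: tier={dataset_tier(actives)}, inactive/active={ratio:.2f}")
```

Printing the inactive-to-active ratio alongside the group makes the variation in class balance across targets easy to see.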