Table 1 Data used to construct the base dataset, transfer learning dataset, and few-shot learning dataset

From: OptNCMiner: a deep learning approach for the discovery of natural compounds modulating disease-specific multi-targets

| Dataset | Target gene | Target protein | Data source | Active compounds | Inactive compounds |
| --- | --- | --- | --- | --- | --- |
| Base dataset (actives > 5000) | ADORA2A | Adenosine receptor A2a | ExCAPE-DB | 5077 | 591 |
| | BRCA1 | Breast cancer type 1 susceptibility protein | ExCAPE-DB | 8619 | 43,095 |
| | CNR1 | Cannabinoid receptor 1 | ExCAPE-DB | 5125 | 397 |
| | DRD2 | D(2) dopamine receptor | ExCAPE-DB | 8037 | 40,185 |
| | HTR1A | 5-hydroxytryptamine receptor 1A | ExCAPE-DB | 6339 | 31,695 |
| | KCNH2 | Potassium voltage-gated channel subfamily H member 2 | ExCAPE-DB | 5327 | 26,635 |
| | LMNA | Prelamin-A/C | ExCAPE-DB | 14,533 | 72,665 |
| | OPRM1 | Mu-type opioid receptor | ExCAPE-DB | 5665 | 2872 |
| | SLC6A4 | Sodium-dependent serotonin transporter | ExCAPE-DB | 6912 | 370 |
| | TARDBP | TAR DNA-binding protein 43 | ExCAPE-DB | 12,193 | 60,965 |
| | TDP1 | Tyrosyl-DNA phosphodiesterase 1 | ExCAPE-DB | 23,129 | 115,645 |
| Transfer learning dataset (1000 > actives > 500) | ADRA2A | Alpha-2A adrenergic receptor | ExCAPE-DB | 816 | 39 |
| | GRIN1 | Glutamate receptor ionotropic | ExCAPE-DB | 553 | 92 |
| | HTR3A | 5-hydroxytryptamine receptor 3A | ExCAPE-DB | 565 | 65 |
| | MINK1 | Misshapen-like kinase 1 | ExCAPE-DB | 929 | 8 |
| | PKM2 | Pyruvate kinase PKM | ExCAPE-DB | 546 | 2730 |
| | POLK | DNA polymerase kappa | LIT-PCBA | 772 | 3860 |
| | VDR | Vitamin D3 receptor | LIT-PCBA | 884 | 4420 |
| Few-shot learning dataset (100 > actives) | ADRB2 | Beta 2 adrenergic receptor | LIT-PCBA | 17 | 170 |
| | ESR | Estrogen receptor alpha | LIT-PCBA | 13 | 130 |
| | IDH1 | Isocitrate dehydrogenase | LIT-PCBA | 39 | 390 |
| | MTOR | mammalian target of rapamycin complex 1 | LIT-PCBA | 97 | 970 |
| | OPRK1 | Kappa opioid receptor | LIT-PCBA | 24 | 5460 |
| | PPARG | Peroxisome proliferator-activated receptor gamma | LIT-PCBA | 27 | 270 |
| | TP53 | Cellular tumor antigen p53 | LIT-PCBA | 79 | 790 |
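
The dataset labels in Table 1 group targets by their number of active compounds. As a minimal sketch of that grouping rule, the Python snippet below assigns a target to a tier from its active-compound count. The function name `assign_tier` and the tier strings are hypothetical and are not taken from the OptNCMiner code. The thresholds are read directly from the labels, so counts that fall between the stated ranges (for example 100 to 500) are returned as `None` rather than guessed.

```python
from typing import Optional


def assign_tier(n_actives: int) -> Optional[str]:
    """Map an active-compound count to a dataset tier using the Table 1 labels.

    Hypothetical helper for illustration only; thresholds are copied from the
    table labels and nothing is assumed about counts outside those ranges.
    """
    if n_actives > 5000:
        return "base"          # Base dataset (actives > 5000)
    if 500 < n_actives < 1000:
        return "transfer"      # Transfer learning dataset (1000 > actives > 500)
    if n_actives < 100:
        return "few-shot"      # Few-shot learning dataset (100 > actives)
    return None                # Not covered by any label in Table 1


# Active-compound counts taken from three rows of Table 1.
examples = {"ADORA2A": 5077, "MINK1": 929, "ADRB2": 17}
for gene, n_actives in examples.items():
    print(gene, n_actives, assign_tier(n_actives))
```

Returning `None` for uncovered counts keeps the sketch faithful to the table, which does not say how such targets would be handled.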