
Table 4. Statistics of the datasets used in the experiments. The first section describes the dataset used for pre-training; the latter two sections describe the datasets used for fine-tuning.

From: Enhancing drug property prediction with dual-channel transfer learning based on molecular fragment

 

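| Section | Dataset | #Molecules | Avg. #atoms | Avg. #bonds | #Tasks | Avg. degree |
|---|---|---|---|---|---|---|
| Pre-training | GEOM-Drug | 304,466 | 44.40 | 46.40 | – | 2.09 |
| Classification | BBBP | 2,039 | 24.06 | 25.95 | 1 | 2.16 |
| | Tox21 | 7,831 | 18.57 | 19.29 | 12 | 2.08 |
| | ToxCast | 8,576 | 18.78 | 19.26 | 617 | 2.05 |
| | SIDER | 1,427 | 33.64 | 35.36 | 27 | 2.10 |
| | MUV | 93,087 | 24.23 | 26.28 | 17 | 2.17 |
| | HIV | 41,127 | 25.51 | 27.47 | 1 | 2.15 |
| | BACE | 1,513 | 34.09 | 36.86 | 1 | 2.16 |
| Regression | ESOL | 1,128 | 13.30 | 13.69 | 1 | 2.06 |
| | Lipophilicity | 4,200 | 27.04 | 29.50 | 1 | 2.18 |
| | Malaria | 9,999 | 30.36 | 33.20 | 1 | 2.19 |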
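
The Avg. degree values can be cross-checked against the atom and bond averages. The sketch below assumes each molecule is treated as an undirected graph with atoms as nodes and bonds as edges, so every bond adds one to the degree of two atoms and the mean degree is about 2 × bonds / atoms. The source does not state how the column was computed, and a ratio of dataset-level averages only approximates an average of per-molecule degrees, so the check uses a small tolerance. The names `TABLE4`, `mean_degree`, and `check_table` are hypothetical and introduced only for this illustration.

```python
# (avg. #atoms, avg. #bonds, reported avg. degree), copied from Table 4.
TABLE4 = {
    "GEOM-Drug": (44.40, 46.40, 2.09),
    "BBBP": (24.06, 25.95, 2.16),
    "Tox21": (18.57, 19.29, 2.08),
    "ToxCast": (18.78, 19.26, 2.05),
    "SIDER": (33.64, 35.36, 2.10),
    "MUV": (24.23, 26.28, 2.17),
    "HIV": (25.51, 27.47, 2.15),
    "BACE": (34.09, 36.86, 2.16),
    "ESOL": (13.30, 13.69, 2.06),
    "Lipophilicity": (27.04, 29.50, 2.18),
    "Malaria": (30.36, 33.20, 2.19),
}


def mean_degree(avg_atoms: float, avg_bonds: float) -> float:
    """Estimate the mean node degree of an undirected graph as 2 * edges / nodes."""
    return 2.0 * avg_bonds / avg_atoms


def check_table(table: dict, tol: float = 0.006) -> dict:
    """Return the datasets whose reported degree differs from the estimate by more than tol."""
    mismatches = {}
    for name, (atoms, bonds, reported) in table.items():
        estimate = mean_degree(atoms, bonds)
        if abs(estimate - reported) > tol:
            mismatches[name] = estimate
    return mismatches


if __name__ == "__main__":
    for name, (atoms, bonds, reported) in TABLE4.items():
        print(f"{name:14s} estimated={mean_degree(atoms, bonds):.2f} reported={reported:.2f}")
    print("mismatches:", check_table(TABLE4))
```

Under this assumption, every row agrees with the reported value to two decimal places, which suggests the degree column counts each bond once per endpoint atom.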