
Table 1 Statistics of biochemical datasets.

From: A subgraph isomorphism algorithm and its application to biochemical data

 

| Dataset | Type | Min Vertices | Min Edges | Max Vertices | Max Edges | Avg (SD) Vertices | Avg (SD) Edges | Avg (SD) Degree | Total Labels | Avg (SD) Labels |
|---|---|---|---|---|---|---|---|---|---|---|
| AIDS | Small Sparse | 4 | 8 | 245 | 500 | 44.98 (21.68) | 93.91 (45.05) | 4.17 (2.28) | 62 | 4.36 (0.86) |
| PDBSv1 | Large Sparse | 240 | 480 | 33067 | 61546 | 5663.6 (6954.82) | 86661.27 (12365.7) | 3.21 (2.52) | 14 | 5.9 (1.04) |
| PDBSv2 | Medium Sparse | 1683 | 3414 | 7979 | 16302 | 3614.1 (1772.06) | 7386.2 (3814.08) | 4.08 (17.47) | 13 | 4.63 (0.76) |
| PDBSv3 | Small Dense | 7 | 16 | 883 | 18832 | 376.86 (186.66) | 8679.48 (3814.08) | 44.78 (17.47) | 21 | 18.86 (3.48) |
| Graemlin | Medium Dense | 1081 | 12961 | 6726 | 230468 | 3167.6 (1568.66) | 87759.6 (75939.2) | 48.14 (63.61) | 31676 | 3167.6 (1568.66) |
| PPI | Large Dense | 5720 | 51464 | 12575 | 332458 | 7827.1 (2120.15) | 107135 (82730.9) | 28.66 (47.44) | 78271 | 7827.1 (2120.15) |

  1. The vertex and edge columns give the minimum, maximum and average number of vertices and edges of the graphs in each dataset. Total Labels is the total number of labels in the dataset. Avg Labels is the average number of labels per graph. Standard deviations (SD) are reported in parentheses.
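
As an illustration of how statistics of this kind can be computed, the following is a minimal Python sketch, not the authors' procedure. The function name `dataset_stats` and the graph representation (a list of vertex labels plus a list of undirected edges, each listed once) are hypothetical. It also assumes that degree is pooled over all vertices of all graphs and that SD is the population standard deviation; the source does not state either choice.

```python
from statistics import mean, pstdev


def dataset_stats(graphs):
    """Summarise a collection of labelled graphs.

    graphs: list of (labels, edges) pairs, where labels[i] is the label of
    vertex i and edges is a list of (u, v) vertex-index pairs.
    Assumption: each undirected edge appears once in the edge list.
    """
    n_vertices = [len(labels) for labels, _ in graphs]
    n_edges = [len(edges) for _, edges in graphs]

    # Pool the degree of every vertex across all graphs (an assumption).
    degrees = []
    for labels, edges in graphs:
        deg = [0] * len(labels)
        for u, v in edges:
            deg[u] += 1
            deg[v] += 1
        degrees.extend(deg)

    # Distinct labels per graph, and distinct labels over the whole dataset.
    labels_per_graph = [len(set(labels)) for labels, _ in graphs]
    all_labels = set().union(*(labels for labels, _ in graphs))

    return {
        "min_vertices": min(n_vertices),
        "max_vertices": max(n_vertices),
        "min_edges": min(n_edges),
        "max_edges": max(n_edges),
        "avg_vertices": (mean(n_vertices), pstdev(n_vertices)),
        "avg_edges": (mean(n_edges), pstdev(n_edges)),
        "avg_degree": (mean(degrees), pstdev(degrees)),
        "total_labels": len(all_labels),
        "avg_labels": (mean(labels_per_graph), pstdev(labels_per_graph)),
    }
```

Calling `dataset_stats` on a list of such `(labels, edges)` pairs returns one dictionary per dataset, with each `avg_*` entry holding a `(mean, SD)` pair in the same layout as the table columns.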