
Table 1 Results of the simulation study

From: A new unsupervised gene clustering algorithm based on the integration of biological knowledge into expression data

   

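Column groups: Coexpression indicator (CI), Biological homogeneity indicator (BHI), Both. All indicator values are percentages (%).

| I | K | r | CI: Heatmap | CI: WGCNA | CI: Integration | BHI: Heatmap | BHI: WGCNA | BHI: Integration | Both: Heatmap | Both: WGCNA | Both: Integration |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 300 | 1 | 92.15 | 94.90 | 98.65 | 65.50 | 81.5 | 89.5 | 64.60 | 78.95 | 88.80 |
| 10 | 300 | 2 | 92.31 | 94.80 | 96.55 | 50.40 | 60.15 | 67.25 | 49.75 | 58.30 | 66.25 |
| 10 | 300 | 3 | 92.00 | 95.32 | 94.52 | 36.77 | 45.81 | 54.03 | 36.61 | 45.00 | 53.39 |
| 25 | 1000 | 1 | 88.70 | 99.12 | 91.33 | 7.67 | 28.00 | 45.44 | 7.35 | 27.09 | 44.72 |
| 25 | 1000 | 2 | 90.25 | 99.12 | 90.55 | 3.79 | 11.89 | 29.62 | 3.54 | 11.17 | 28.95 |
| 25 | 1000 | 3 | 89.00 | 98.99 | 85.67 | 1.94 | 3.55 | 18.66 | 1.80 | 3.34 | 18.06 |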

  1. Results of the simulation study for the three clustering algorithms: Heatmap classification (Heatmap), clustering based on a coexpression network (WGCNA) and our clustering algorithm (Integration). The simulated data sets vary in the number of samples (I), the number of genes (K) and the intensity of randomness (r). For each setting we report the average proportion of clusters (%) within a partition that are significantly coexpressed (coexpression indicator, CI), biologically homogeneous (biological homogeneity indicator, BHI), or both coexpressed and biologically homogeneous (Both). For example, consider the simulated expression data sets with 10 samples (I = 10) and 300 genes (K = 300), associated with simulated GO annotations with an intensity of randomness of r = 1. On average, the Heatmap partitions of these data sets contain 92.15% significantly coexpressed clusters.
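
The way the three indicators in the caption above are aggregated can be illustrated with a minimal Python sketch. It assumes that each cluster of a partition has already been labelled with two boolean flags (significantly coexpressed, biologically homogeneous); how those flags are decided is not reproduced here. The function names and the toy data are hypothetical and only serve the illustration; they are not taken from the article.

```python
from statistics import mean


def partition_proportions(clusters):
    """Return (CI, BHI, Both) percentages for one partition.

    `clusters` is a list of (coexpressed, homogeneous) boolean pairs,
    one pair per cluster. The flags are assumed to be given; the
    underlying significance tests are outside this sketch.
    """
    n = len(clusters)
    if n == 0:
        raise ValueError("a partition must contain at least one cluster")
    ci = sum(1 for coexp, _ in clusters if coexp)
    bhi = sum(1 for _, homog in clusters if homog)
    both = sum(1 for coexp, homog in clusters if coexp and homog)
    return (100.0 * ci / n, 100.0 * bhi / n, 100.0 * both / n)


def average_proportions(partitions):
    """Average the per-partition percentages over several simulated data sets."""
    per_partition = [partition_proportions(p) for p in partitions]
    # zip(*...) regroups the tuples into one column per indicator.
    return tuple(mean(column) for column in zip(*per_partition))


# Toy input (made up): two partitions, one per simulated data set.
example = [
    [(True, True), (True, False), (False, False), (True, True)],
    [(True, True), (True, True)],
]
ci, bhi, both = average_proportions(example)
print(f"CI={ci:.2f}%  BHI={bhi:.2f}%  Both={both:.2f}%")
```

Because a cluster counted under Both must satisfy both criteria, the Both percentage can never exceed the CI or the BHI percentage for the same partitions, which is consistent with the values reported in Table 1.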