Table 1 Classification error and computation time for various clustering methods applied to simulated data.

From: Model-based clustering of DNA methylation array data: a recursive-partitioning algorithm for high-dimensional data arising as a mixture of beta distributions

Classification Error (%)
Case    J     HC     DynTree  HOPACH (best)  HOPACH (greedy)  MM(1–6)  RPMM (ICL-BIC)  RPMM (BIC)
Case 1  25    33.2   44.7     9.9            16.4             12.6     15.5            15.4
Case 1  50    32.5   43.8     5.0            10.0             6.2      5.5             5.5
Case 1  500   33.9   38.4     3.5            11.3             1.5      0.1             0.1
Case 1  1000  34.0   38.5     9.2            14.4             1.1      0.1             0.1
Case 2  5     59.4   60.5     65.1           65.8             59.4     59.4            59.4
Case 2  10    58.9   60.0     66.9           67.5             59.2     59.2            59.2
Case 2  25    30.0   39.6     4.1            8.1              0.0      0.0             0.0
Case 2  50    29.9   39.6     3.6            6.4              0.3      0.3             0.3

Computation Time (seconds)
Case    J     HC     DynTree  HOPACH (best)  HOPACH (greedy)  MM(1–6)  RPMM (ICL-BIC)  RPMM (BIC)
Case 1  25    0.00   0.04     4.15           1.18             36.39    13.80           13.83
Case 1  50    0.01   0.05     3.29           1.09             51.14    14.23           14.23
Case 1  500   0.03   0.08     2.98           1.04             436.82   90.99           91.05
Case 1  1000  0.06   0.11     3.05           1.10             848.10   176.99          176.81
Case 2  5     0.00   0.04     2.80           1.21             29.73    5.14            6.09
Case 2  10    0.00   0.04     2.01           1.13             46.48    9.69            10.05
Case 2  25    0.00   0.01     3.33           1.23             34.56    8.85            8.86
Case 2  50    0.01   0.01     2.63           1.16             47.52    10.90           10.86

  1. HC = Hierarchical clustering
  2. DynTree = Hierarchical clustering with classes determined by dynamic tree cutting
  3. HOPACH (best) = HOPACH with 'best' number of classes
  4. HOPACH (greedy) = HOPACH with 'greedy' number of classes
  5. MM(1–6) = Beta mixture model fitting 1–6 classes sequentially
  6. RPMM (ICL-BIC) = Recursively partitioned mixture model employing ICL-BIC
  7. RPMM (BIC) = Recursively partitioned mixture model employing BIC
  8. J = Number of loci considered in analysis