
Table 1 Comparison of clustering methods using performance metrics

From: Bayesian hierarchical clustering for microarray time series data with replicates and outlier measurements

| Clustering method | S. cerevisiae 1: # clusts | S. cerevisiae 1: $\overline{\mathrm{PCC}}$ ± stdev | S. cerevisiae 2: # clusts | S. cerevisiae 2: $\overline{\mathrm{PCC}}$ ± stdev | H. sapiens: # clusts | H. sapiens: $\overline{\mathrm{PCC}}$ ± stdev | E. coli: # clusts | E. coli: $\overline{\mathrm{PCC}}$ ± stdev |
|---|---|---|---|---|---|---|---|---|
| BHC-SE | 13 | 0.68 ± 0.005 | 58 | 0.883 ± 0.003 | 6 | 0.75 ± 0.009 | 24 | 0.84 ± 0.003 |
| BHC-C | 9 | 0.66 ± 0.004 | 40 | 0.877 ± 0.002 | 2 | 0.55 ± 0.009 | 15 | 0.80 ± 0.003 |
| SC-linear | 7 | 0.60 ± 0.006 | 40 | 0.881 ± 0.002 | 4 | 0.69 ± 0.009 | 17 | 0.78 ± 0.004 |
| SC-cubic | 4 | 0.49 ± 0.005 | 22 | 0.852 ± 0.002 | 2 | 0.44 ± 0.010 | 8 | 0.67 ± 0.004 |
| HCL | 13* | 0.53 ± 0.009 | 58* | 0.881 ± 0.002 | 6* | 0.66 ± 0.016 | 24* | 0.68 ± 0.006 |
| SSClust | 13* | 0.60 ± 0.008 | 58* | 0.846 ± 0.003 | 6* | 0.69 ± 0.015 | 24* | 0.72 ± 0.010 |
| CAGED | 2 | 0.42 ± 0.042 | 6 | 0.606 ± 0.003 | 3 | 0.55 ± 0.020 | 2 | 0.47 ± 0.005 |
| MCLUST | 8 | 0.60 ± 0.004 | 30 | 0.858 ± 0.002 | 6 | 0.75 ± 0.011 | 11 | 0.73 ± 0.004 |
| Zhou | 13* | 0.60 ± 0.008 | 58* | 0.853 ± 0.004 | 6* | 0.75 ± 0.011 | 24* | 0.74 ± 0.006 |

| Clustering method | S. cerevisiae 1: # clusts | S. cerevisiae 1: BHI ± stdev | S. cerevisiae 2: # clusts | S. cerevisiae 2: BHI ± stdev | H. sapiens: # clusts | H. sapiens: BHI ± stdev | E. coli: # clusts | E. coli: BHI ± stdev |
|---|---|---|---|---|---|---|---|---|
| BHC-SE | 13 | 0.70 ± 0.07 | 58 | 0.57 ± 0.03 | 6 | 0.62 ± 0.06 | 24 | 0.46 ± 0.06 |
| BHC-C | 9 | 0.73 ± 0.11 | 40 | 0.55 ± 0.03 | 2 | 0.78 ± 0.05 | 15 | 0.47 ± 0.04 |
| SC-linear | 7 | 0.69 ± 0.10 | 40 | 0.55 ± 0.02 | 4 | 0.66 ± 0.07 | 17 | 0.35 ± 0.03 |
| SC-cubic | 4 | 0.64 ± 0.02 | 22 | 0.53 ± 0.01 | 2 | 0.70 ± 0.03 | 8 | 0.32 ± 0.02 |
| HCL | 13* | 0.50 ± 0.04 | 58* | 0.56 ± 0.04 | 6* | 0.52 ± 0.07 | 24* | 0.44 ± 0.07 |
| SSClust | 13* | 0.65 ± 0.03 | 58* | 0.56 ± 0.02 | 6* | 0.64 ± 0.05 | 24* | 0.36 ± 0.03 |
| CAGED | 2 | 0.64 ± 0.02 | 6 | 0.52 ± 0.02 | 3 | 0.68 ± 0.04 | 2 | 0.21 ± 0.01 |
| MCLUST | 8 | 0.69 ± 0.02 | 30 | 0.55 ± 0.02 | 6 | 0.61 ± 0.06 | 11 | 0.47 ± 0.04 |
| Zhou | 13* | 0.66 ± 0.03 | 58* | 0.54 ± 0.02 | 6* | 0.61 ± 0.06 | 24* | 0.43 ± 0.07 |
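
| Clustering method | S. cerevisiae 1: # clusts | S. cerevisiae 1: log marginal likelihood | S. cerevisiae 2: # clusts | S. cerevisiae 2: log marginal likelihood | H. sapiens: # clusts | H. sapiens: log marginal likelihood | E. coli: # clusts | E. coli: log marginal likelihood |
|---|---|---|---|---|---|---|---|---|
| BHC-SE | 13 | -3293 | 58 | -3956 | 6 | -633 | 24 | -2497 |
| BHC-C | 9 | -3356 | 40 | -4294 | 2 | -734 | 15 | -2622 |
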
  1. Table 1 shows the average Pearson correlation coefficient ($\overline{\mathrm{PCC}}$) and the BHI score for each clustering algorithm on the four data sets. Confidence intervals represent ± one standard deviation, calculated by a nonparametric bootstrap. In the number-of-clusters columns (# clusts), * denotes that the number was not optimized by the algorithm but fixed at the number obtained for BHC with squared exponential covariance (BHC-SE). The clustering methods are explained in the Methods section. The table also shows the log marginal likelihoods, log P(y|T), for BHC-SE and BHC-C. The best values for each data set are in bold.
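
To make the "mean ± bootstrap standard deviation" entries concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes, purely for illustration, that the mean PCC is the average Pearson correlation over all pairs of profiles that share a cluster, and that the bootstrap resamples those pairwise correlations with replacement; the caption does not state either detail. The function names and the toy data are hypothetical.

```python
import numpy as np


def within_cluster_correlations(data, labels):
    """Pearson correlations for every pair of profiles in the same cluster.

    data:   array of shape (n_genes, n_timepoints)
    labels: array of shape (n_genes,) with one cluster label per gene
    Assumption: the score is built from within-cluster pairs only.
    """
    corrs = []
    for k in np.unique(labels):
        members = data[labels == k]
        if members.shape[0] < 2:
            continue  # a singleton cluster contributes no pairs
        r = np.corrcoef(members)  # rows are treated as variables
        upper = np.triu_indices(members.shape[0], k=1)
        corrs.append(r[upper])  # keep each unordered pair once
    return np.concatenate(corrs)


def bootstrap_mean_and_stdev(values, n_boot=1000, seed=0):
    """Mean of `values` and the bootstrap standard deviation of that mean.

    Assumption: nonparametric bootstrap over the individual values.
    """
    rng = np.random.default_rng(seed)
    n = values.size
    idx = rng.integers(0, n, size=(n_boot, n))  # resample with replacement
    boot_means = values[idx].mean(axis=1)
    return float(values.mean()), float(boot_means.std(ddof=1))


# Toy data (hypothetical): two clusters of five noisy time courses each.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 10)
profiles = np.vstack([
    np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal((5, t.size)),
    np.cos(2 * np.pi * t) + 0.1 * rng.standard_normal((5, t.size)),
])
labels = np.repeat([0, 1], 5)

corrs = within_cluster_correlations(profiles, labels)
mean_pcc, stdev = bootstrap_mean_and_stdev(corrs)
print(f"mean PCC = {mean_pcc:.2f} ± {stdev:.3f}")
```

The same `bootstrap_mean_and_stdev` helper could be applied to any per-item score, so a BHI-style column would only need a different function to produce `values`. Which unit is resampled (pairs, genes, or clusters) changes the resulting standard deviation, so this choice should be checked against the Methods section before comparing numbers with the table.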