Gene selection methods using class information. The two supervised gene selection methods, T2 (t-statistic) and Mdiff (difference in log2-ratios), aim to identify genes that are differentially expressed between the two known classes. The boxplots in this figure show the mean adjusted Rand (aRand) values between the clusterings computed after gene selection using T2 or Mdiff. Each boxplot is based on 80 cluster analyses for gene selections of 100 genes or fewer, and on 64 cluster analyses for selections of 1000 genes or more. The cluster analyses are the combinations of the following sub-processes: normalization by norm.pt, norm.pt.bkg, norm.glob or norm.glob.bkg; standardization or no standardization; missing value imputation by ROW or SVD; and clustering by hclust.corr.ward, hclust.eucl.ward, hclust.manh.ward, kmeans or Mclust (Mclust for 100 genes or fewer only). This gives 4 × 2 × 2 × 5 = 80 analyses with Mclust and 4 × 2 × 2 × 4 = 64 without it. The horizontal lines show the median (dashed line) and the 95th percentile (dotted line) of the distribution of aRand values for random classifications.
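As a rough illustration of the two supervised selections and of how the 80 and 64 analyses arise, the sketch below ranks genes by class separation and enumerates the sub-process combinations. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `select_genes` and `pipelines` are hypothetical, T2 is assumed here to be the absolute Welch t-statistic, Mdiff is assumed to be the absolute difference of the class means of the log2-ratios, and the input matrix is assumed to be already imputed (no missing values).

```python
from itertools import product

import numpy as np


def select_genes(log_ratios, labels, n_genes, method="T2"):
    """Return indices of the n_genes genes that best separate two classes.

    log_ratios: (genes x samples) array of log2-ratios, no missing values.
    labels:     length-samples sequence of 0/1 class labels.
    method:     "T2" (assumed: absolute Welch t-statistic) or
                "Mdiff" (assumed: absolute difference of class means).
    """
    log_ratios = np.asarray(log_ratios, dtype=float)
    labels = np.asarray(labels)
    a = log_ratios[:, labels == 0]
    b = log_ratios[:, labels == 1]
    mean_diff = a.mean(axis=1) - b.mean(axis=1)

    if method == "Mdiff":
        score = np.abs(mean_diff)
    elif method == "T2":
        # Standard error with unequal variances; assumes no gene is
        # constant within both classes (otherwise this divides by zero).
        se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                     + b.var(axis=1, ddof=1) / b.shape[1])
        score = np.abs(mean_diff) / se
    else:
        raise ValueError(f"unknown method: {method}")

    # Highest scores first.
    return np.argsort(score)[::-1][:n_genes]


# Sub-process names as listed in the caption.
NORMALIZATIONS = ["norm.pt", "norm.pt.bkg", "norm.glob", "norm.glob.bkg"]
STANDARDIZATIONS = [True, False]
IMPUTATIONS = ["ROW", "SVD"]
CLUSTERINGS = ["hclust.corr.ward", "hclust.eucl.ward",
               "hclust.manh.ward", "kmeans"]


def pipelines(include_mclust):
    """Enumerate every combination of sub-processes.

    Mclust is included only for the small gene selections
    (100 genes or fewer), as stated in the caption.
    """
    methods = CLUSTERINGS + (["Mclust"] if include_mclust else [])
    return list(product(NORMALIZATIONS, STANDARDIZATIONS,
                        IMPUTATIONS, methods))
```

Calling `len(pipelines(True))` and `len(pipelines(False))` reproduces the 80 and 64 analyses quoted for the small and large gene selections, respectively.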