Table 1 The real microarray datasets divided into training and test sets

From: Combining multiple hypothesis testing and affinity propagation clustering leads to accurate, robust and sample size independent classification on gene expression data

Dataset | Attributes (number of genes) | Train set samples (class1:class2) | Test set samples (class1:class2)
Amyotrophic lateral sclerosis (als) | 22,283 | 6:6 | 12:3
Duchenne muscular dystrophy (dmd) | 22,283 | 7:7 | 11:3
Juvenile dermatomyositis (jdm) | 22,283 | 10:10 | 8:11
Limb-girdle muscular dystrophy type 2A (lgmd2a) | 22,283 | 7:7 | 11:3
Limb-girdle muscular dystrophy type 2B (lgmd2b) | 22,283 | 7:7 | 11:3
Nemaline myopathy (nm) | 12,600 | 8:8 | 13:5
Breast cancer | (4348) 24,481 | 44:34 | 7:12
Colon cancer | 7,129 | 15:15 | 7:25
ALL/AML leukemia | 7,129 | 27:11 | 20:14
Prostate cancer | 12,600 | 52:50 | 25:9