| Data-set | genes | samples | DP | kNN | WV | LDA | SVM | ML-s | ML-d |
|---|---|---|---|---|---|---|---|---|---|
| BRCA1 | 3226 | 7 BRCA1-positive, 15 BRCA1-negative | 21/22 | 18/22 (1) | 18/22 | 18/22 | 18/22 | 19/22 | 16/22 |
| BRCA2 | 3226 | 8 BRCA2-positive, 14 BRCA2-negative | 21/22 | 21/22 (1) | 17/22 | 19/22 | 18/22 | 17/22 | 17/22 |
| PROS | 12600 | 52 tumor tissue, 50 normal tissue | 93/102 | 90/102 (5) | 61/102 | 92/102 | 93/102 | 64/102 | 50/102 |
| PROS-OUT | 12625 | 8 non-recurrence, 13 recurrence | 15/21 | 12/21 (1) | 12/21 | 13/21 | 14/21 | 13/21 | 13/21 |
| DLBCL-FL | 6817 | 52 DLBCL, 25 FL | 74/77 | 71/77 (7) | 63/77 | 74/77 | 74/77 | 65/77 | 58/77 |
| ALL-AML | 6817 | 27 AML, 11 ALL | 38/38 | 37/38 (3) | 38/38 | 38/38 | 38/38 | 30/38 | 27/38 |
| I-2000 | 2000 | 40 tumor colon tissue, 22 normal colon tissue | 61/62 | 59/62 (3) | 58/62 | 61/62 | 61/62 | 59/62 | 58/62 |
- Columns indicate the algorithm used, rows the data-set. In each cell, the numerator is the number of left-out samples that were correctly classified by the corresponding algorithm, and the denominator is the total number of samples n. The kNN algorithm has a free parameter that must be determined: the number of neighbors k. To allow for a fair comparison, we optimized this value for each data-set using cross-validation [12]. The resulting optimal value is given in parentheses. For the ML classifier, we consider two cases: one where the two classes are assumed to have the same variance, and one where the variances are assumed to differ. These are referred to as ML-s (same) and ML-d (different).
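
To make the cell format concrete, the following is a minimal sketch of how a kNN entry such as "90/102 (5)" could be produced: count the left-out samples that are classified correctly, and report the k that gives the highest count. The toy data, the Euclidean distance, the candidate values of k, and the function names `knn_predict`, `loo_correct`, and `best_k` are all assumptions made for illustration. The note does not specify the exact cross-validation scheme of [12], so this sketch simply reuses leave-one-out both to choose k and to count correct classifications.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training samples closest to `query`.

    Euclidean distance is an assumption; the source does not name the metric.
    """
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_correct(xs, ys, k):
    """Number of left-out samples classified correctly (a cell's numerator)."""
    correct = 0
    for i in range(len(xs)):
        # Hold out sample i and classify it from all remaining samples.
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        if knn_predict(rest_x, rest_y, xs[i], k) == ys[i]:
            correct += 1
    return correct


def best_k(xs, ys, candidates=(1, 3, 5, 7)):
    """Pick the candidate k with the most correct left-out samples.

    Ties go to the smallest candidate. Odd values avoid tied votes
    in a two-class problem.
    """
    scores = {k: loo_correct(xs, ys, k) for k in candidates}
    k = max(candidates, key=lambda c: scores[c])
    return k, scores[k]


# Hypothetical two-class toy data (not taken from any data-set in the table).
xs = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
      (2.0, 2.1), (2.2, 1.9), (1.9, 2.3), (2.1, 2.0)]
ys = ["normal"] * 4 + ["tumor"] * 4

k, correct = best_k(xs, ys)
print(f"kNN: {correct}/{len(xs)} ({k})")  # prints "kNN: 8/8 (1)"
```

On this toy data every k up to 5 classifies all eight left-out samples correctly, while k = 7 fails on every sample because the four samples of the opposite class outvote the three remaining samples of the held-out sample's own class. This shows why k has to be tuned separately for each data-set.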