Table 3 Gene expression breast cancer data sets

From: Improved shrunken centroid classifiers for high-dimensional class-imbalanced data

| Data set | # genes | Classification task | n_1 | n_2 | n_3 | k_min^a |
|---|---|---|---|---|---|---|
| Ivshina | 22,283 | ER- or ER+ | 34 | 211 | | 0.14 |
| | | Grade 1, 2 or 3 | 68 | 166 | 55 | 0.19 |
| | | Grade 1, 2 or 3 | 40 | 10 to 80 | 40 | 0.25 to 0.50 |
| Wang | 22,283 | Relapse or not | 179 | 107 | | 0.37 |
| Korkola | 9,524 | Good or bad prognosis | 34 | 21 | | 0.38 |
| Sotiriou | 7,650 | ER+ or ER- | 10 to 50 | 10 | | 0.50 to 0.17 |
| | | Grade 1-2 or 3 | 10 to 40 | 10 | | 0.50 to 0.20 |

a: k_min is the proportion of minority class samples in the training set.
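
The footnote's definition can be checked against the fixed-size rows of the table with a few lines of Python. This is a minimal sketch, assuming k_min is the size of the smallest class divided by the total number of training samples; the helper name `k_min` and its signature are hypothetical and do not come from the article.

```python
def k_min(class_sizes):
    """Proportion of training samples in the smallest class.

    Assumption: "minority class" means the class with the fewest samples.
    `class_sizes` is a list such as [n_1, n_2] or [n_1, n_2, n_3].
    """
    total = sum(class_sizes)
    if total <= 0:
        raise ValueError("class sizes must sum to a positive number")
    return min(class_sizes) / total


# Class sizes copied from the table rows that list fixed sample counts.
rows = {
    "Ivshina, ER- or ER+": [34, 211],
    "Ivshina, Grade 1, 2 or 3": [68, 166, 55],
    "Wang, Relapse or not": [179, 107],
    "Korkola, Good or bad prognosis": [34, 21],
}

for name, sizes in rows.items():
    print(f"{name}: k_min = {k_min(sizes):.2f}")
```

Under the same assumption, the Sotiriou ranges follow from the end points of n_1: `k_min([10, 10])` gives 0.50 and `k_min([50, 10])` rounds to 0.17.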