
Table 1 Descriptions of benchmark datasets.

From: Differential prioritization between relevance and redundancy in correlation-based feature selection techniques for multiclass gene expression data

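| Dataset | Type | N | K | Training:Test set size | P_max |
|---|---|---|---|---|---|
| BRN | cDNA | 7452 | 15 | 176:84 | 150 |
| BRN14 | cDNA | 7452 | 14 | 174:83 | 150 |
| GCM | Affymetrix | 10820 | 14 | 144:54 | 150 |
| NCI60 | cDNA | 7386 | 8 | 40:20 | 150 |
| PDL | Affymetrix | 12011 | 6 | 166:82 | 120 |
| Lung | Affymetrix | 1741 | 5 | 135:68 | 100 |
| SRBC | cDNA | 2308 | 4 | 55:28 | 80 |
| MLL | Affymetrix | 8681 | 3 | 48:24 | 60 |
| AML/ALL | Affymetrix | 3571 | 3 | 48:24 | 60 |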

  1. N is the number of features after preprocessing. K is the number of classes in the dataset.