Table 1 Classification accuracy of different sampling methods on unbalanced datasets

From: ILRC: a hybrid biomarker discovery algorithm based on improved L1 regularization and clustering in microarray data

 

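| Sampling method | Dataset | SVM | GB | DT | NN | KNN |
|---|---|---|---|---|---|---|
| Origin datasets | ALL1 | 1.0000 | 1.0000 | 0.9843 | 1.0000 | 1.0000 |
| Origin datasets | ALL3 | 0.8080 | 0.8640 | 0.7680 | 0.8240 | 0.8080 |
| Origin datasets | ALL4 | 0.9567 | 0.9053 | 0.8053 | 0.9129 | 0.8930 |
| Origin datasets | DLBCL | 0.9875 | 0.9750 | 0.8558 | 0.9875 | 0.9750 |
| Origin datasets | Myeloma | 0.9018 | 0.8965 | 0.7114 | 0.8842 | 0.8382 |
| Over sampling | ALL1 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Over sampling | ALL3 | 0.8261 | 0.8271 | 0.8371 | 0.9307 | 0.7326 |
| Over sampling | ALL4 | 0.9704 | 0.9254 | 0.8644 | 1.0000 | 0.8729 |
| Over sampling | DLBCL | 0.9482 | 0.9312 | 0.9141 | 0.9739 | 0.8790 |
| Over sampling | Myeloma | 0.8793 | 0.8613 | 0.8359 | 0.8687 | 0.8100 |
| Combined sampling | ALL1 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Combined sampling | ALL3 | 0.8766 | 0.7926 | 0.7918 | 0.9456 | 0.7616 |
| Combined sampling | ALL4 | 0.9704 | 0.9259 | 0.7610 | 0.9630 | 0.8667 |
| Combined sampling | DLBCL | 0.9913 | 0.9913 | 0.8877 | 1.0000 | 0.9830 |
| Combined sampling | Myeloma | 0.8943 | 0.8650 | 0.8250 | 0.8725 | 0.8100 |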