Table 1. Summary of datasets used in the experiments (sorted by number of subjects in ascending order)

From: A robust data scaling algorithm to improve classification accuracies in biomedical data

| Dataset | No. of subjects (pos/neg) | Variable type | No. of variables | Task |
|---|---|---|---|---|
| GSE 27899IL [15] | 10/10 | DNA methylation | 27578 | diagnose ulcerative colitis |
| Prostate cancer [16] | 14/9 | microarray gene expression | 15009 | diagnose prostate cancer |
| Colon cancer [16] | 15/11 | microarray gene expression | 15009 | diagnose colon cancer |
| Lung cancer [16] | 20/7 | microarray gene expression | 15009 | diagnose lung cancer |
| Breast cancer [16] | 17/15 | microarray gene expression | 15009 | diagnose breast cancer |
| Leukemia [17] | 11/27 | microarray gene expression | 7129 | diagnose leukemia |
| GSE 29490 [18] | 20/7 | DNA methylation | 26916 | diagnose colorectal carcinoma |
| GSE 25869 [19] | 14/9 | DNA methylation | 27570 | diagnose gastric cancer |
| Breast tissue [20] | 21/85 | impedance measurements | 9 | diagnose breast tumor |
| LSVT [21] | 42/84 | wavelet- and frequency-based measurements | 310 | assess treatments in Parkinson's disease |
| DLBCL [22] | 88/72 | microarray gene expression | 715 | diagnose DLBCL |
| Myeloma [23] | 137/36 | microarray gene expression | 12625 | diagnose bone lesions |
| Parkinsons [24] | 147/48 | voice-based measurements | 22 | diagnose Parkinson's disease |
| Wdbc [25] | 212/357 | nuclear features from images | 30 | diagnose breast tumor |
| Indian liver [26] | 414/165 | biochemistry-based measurements | 9 | diagnose liver disease |
| Pima Indians diabetes [27] | 268/500 | clinical measurements | 8 | diagnose diabetes |