Table 1 Overview of the data sets and the methods used in this study

From: Classification of microarrays; synergistic effects between normalization, gene selection and machine learning

Data set (D) | Classes* | No. of genes**
Alizadeh | DLBCL (68), other samples (65) | 7806 (7430)
Finak | Epithelial (34), stromal tissue (32) | 33491
Galland | Invasive NFPAs (22), non-invasive NFPAs (18) | 40475 (40291)
Herschkowitz | High ER expression (58), low ER expression (46) | 19718
Jones | Cancerous samples (72), non-cancerous samples (19) | 40233 (39746)
Sørlie | High ER expression (55), low ER expression (18) | 8033 (7734)
Ye | Metastatic (65), non-metastatic (22) | 8911
Normalization (No) | Description
No 0 | Raw data
No 1 | Print-tip MA-loess, no background correction
No 2 | Print-tip MA-loess, background correction
No 3 | Global MA-loess, no background correction
No 4 | Global MA-loess, background correction
Gene selection (G) | Fixed parameters
T-test | Two-sided
Relief | Threshold = 0, nosample = # obs. in data set
Paired distance | Euclidean distance
Number of genes (N) | 2, 12, 22, 32, 42, 52, 62, 72, 82, 92, 100, 150, 200, 300, 400, 500, 600, 700, 800, 900, 1000
Machine learning (M) | Description, Fixed parameters | Optimized parameters
DT Gini | Decision tree, splitting index = Gini | (none)
DT Information | Decision tree, splitting index = Information | (none)
NN One layer | Neural network, one hidden layer, decay = 0.001, rang = 0.1, maxit = 100 | size = [2-5]
NN No layer | Neural network, no hidden layer, decay = 0.001, rang = 0.1, maxit = 100, skip = TRUE, size = 0 | (none)
SVM Linear | Support vector machine, linear kernel, type = nu-svc, cross = 10, nu = 0.2, scaled = FALSE | (none)
SVM Poly2 | Support vector machine, polynomial kernel, degree 2, type = nu-svc, cross = 10, nu = 0.2, scaled = FALSE | (none)
SVM Poly3 | Support vector machine, polynomial kernel, degree 3, type = nu-svc, cross = 10, nu = 0.2, scaled = FALSE | (none)
SVM Rb | Support vector machine, radial basis kernel, type = nu-svc, cross = 10, nu = 0.2, scaled = FALSE | sigma = [2^-14, 2^14]
Acronyms defined here are used throughout the paper. "Fixed parameters" were held at the stated values, while "Optimized parameters" were tuned by grid search in the inner cross-validation. *The number of samples belonging to each class is given in parentheses. **Dimensions after background-corrected normalization (No 2 and No 4) are given in parentheses.
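
To make the size of the search space in Table 1 concrete, the following Python sketch writes out the table's factors and the two optimized-parameter grids. It is an illustration only, not the authors' code. The labels are copied from the table; the variable names and the `inner_grid` helper are hypothetical. Two assumptions are made: the sigma range is read as 2^-14 to 2^14 and stepped over integer exponents, and the hidden-layer size is taken over the integers 2 to 5. The table states only the ranges, not the step.

```python
from itertools import product

# Factor levels copied from Table 1.
DATA_SETS = ["Alizadeh", "Finak", "Galland", "Herschkowitz",
             "Jones", "Sørlie", "Ye"]
NORMALIZATIONS = ["No 0", "No 1", "No 2", "No 3", "No 4"]
GENE_SELECTION = ["T-test", "Relief", "Paired distance"]
# 2, 12, ..., 92, then 100 and 150, then 200, 300, ..., 1000.
N_GENES = list(range(2, 93, 10)) + [100, 150] + list(range(200, 1001, 100))
METHODS = ["DT Gini", "DT Information", "NN One layer", "NN No layer",
           "SVM Linear", "SVM Poly2", "SVM Poly3", "SVM Rb"]

# Optimized parameters. ASSUMPTION: integer steps; the table gives ranges only.
NN_SIZES = [2, 3, 4, 5]
SVM_RB_SIGMAS = [2.0 ** e for e in range(-14, 15)]


def inner_grid(method):
    """Candidate settings an inner grid search would try (hypothetical helper)."""
    if method == "NN One layer":
        return [{"size": s} for s in NN_SIZES]
    if method == "SVM Rb":
        return [{"sigma": s} for s in SVM_RB_SIGMAS]
    return [{}]  # no optimized parameters listed for this method


# Full cross product of the five factors, if every combination is evaluated
# (the table itself does not say whether that is the case).
ALL_COMBINATIONS = list(product(DATA_SETS, NORMALIZATIONS, GENE_SELECTION,
                                N_GENES, METHODS))
```

Under these assumptions `N_GENES` has 21 entries, the sigma grid has 29 candidates, and the full cross product contains 7 × 5 × 3 × 21 × 8 = 17640 combinations, each of which would still run its own inner grid search for the two methods that have optimized parameters.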