From: Identifying genes that contribute most to good classification in microarrays
Authors | Training sample | Test sample | Random aspect | Results |
---|---|---|---|---|
Michiels et al, 2005 [2] | (1) Selected genes most correlated with prognosis, (2) Created nearest centroid classification rule. | Used | Test and training sample splits in entire data set. | (1) Misclassification rate for test samples (2) Frequencies of genes selected in training sample |
Ma et al, 2006 [7] | (1) Split into training-training sample and training-test sample, (2) Using cross-validation, maximized the binormal area under ROC curve as a linear function of genes; then selected genes with non-zero coefficients. | Used | Training-training and training-test samples (i.e. the cross-validation and evaluation are repeated). | (1) Area under ROC curve for test samples, (2) Frequencies of genes selected in training sample. |
Li et al, 2004 [8] | (1) Split into training-training sample and training-test sample, (2) Cross-validated classification tree to maximize fit. | Not used | Resampling for training-training samples and training-test samples. | (1) Relevancy intensity, which equals frequencies of genes selected in training sample when weights equal 1. |
Proposed method | (1) Selected genes with highest individual classification performance, (2) Created classification rule using nearest centroid and score function. | Used | Test and training sample splits in entire data set. | (1) ROC curve and area under ROC curve for test samples with emphasis on comparing many versus few genes, (2) Frequencies of genes selected in training sample. |
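
The workflow in the "Proposed method" row can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It makes several assumptions: individual classification performance is measured by per-gene area under the ROC curve, the score function is the difference of squared distances to the two class centroids, and the data are synthetic. All function names (`auc`, `select_genes`, `centroid_score`, `repeated_splits`) and parameter values are hypothetical.

```python
import numpy as np

def auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return ((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg))

def select_genes(X, y, k):
    """Keep the k genes whose individual AUC is farthest from 0.5 (assumed criterion)."""
    per_gene = np.array([auc(X[:, j], y) for j in range(X.shape[1])])
    strength = np.abs(per_gene - 0.5)
    return np.argsort(strength)[::-1][:k]

def centroid_score(X_train, y_train, X_test):
    """Assumed score: larger values mean the sample is closer to the class-1 centroid."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = ((X_test - c0) ** 2).sum(axis=1)
    d1 = ((X_test - c1) ** 2).sum(axis=1)
    return d0 - d1

def repeated_splits(X, y, k, n_splits=50, test_frac=0.3, seed=0):
    """Repeat random training/test splits of the entire data set.

    Returns the test-sample AUC of each split and, for every gene, the
    fraction of splits in which it was selected in the training sample.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    n_test = int(round(test_frac * n))
    counts = np.zeros(p, dtype=int)
    aucs = []
    for _ in range(n_splits):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        # Skip splits where either part lacks one of the two classes.
        if len(np.unique(y[test])) < 2 or len(np.unique(y[train])) < 2:
            continue
        genes = select_genes(X[train], y[train], k)  # selection uses training data only
        counts[genes] += 1
        s = centroid_score(X[train][:, genes], y[train], X[test][:, genes])
        aucs.append(auc(s, y[test]))
    return np.array(aucs), counts / max(len(aucs), 1)

# Synthetic data (illustration only): 60 samples, 100 genes, 3 informative genes.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 100))
X[y == 1, :3] += 2.5

# Compare few versus many genes, as emphasized in the table.
results = {k: repeated_splits(X, y, k) for k in (3, 50)}
for k, (aucs, freq) in results.items():
    print(k, "genes: mean test AUC =", aucs.mean())
```

Selecting genes inside each split, using only the training sample, keeps the test-sample AUC free of selection bias; the returned frequencies correspond to the "frequencies of genes selected in training sample" column.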