### Selecting the gene set, distance measure, and score formula in the training sample

For each score formula *S* and each distance measure *D*, the classification rule selects a gene set *G*. To simplify notation, let *F*_{TR{a,b}} = (*C*_{TR}, *G* = {*a,b*}, *D*, *S*), where *a* and *b* refer to different genes. To make computations tractable, the classification rule starts with a preliminary filter that selects the 50 genes in the training sample with the highest values of *AUCS*(*Z*_{TR}, *F*_{TR{j}}), where *j* indexes genes. Subsequent calculations involve these 50 genes.
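The preliminary filter can be illustrated with a minimal Python sketch. It rests on stated assumptions rather than the actual rule: the score formula *S* and distance measure *D* are not reproduced here, so the raw expression level of gene *j* stands in for the score, the AUC is taken as direction-free (the larger of AUC and 1 - AUC), and the names `auc`, `single_gene_auc`, and `prefilter` are hypothetical.

```python
def auc(neg, pos):
    """Mann-Whitney estimate of P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def single_gene_auc(class0, class1, j):
    # ASSUMPTION: raw expression of gene j is used as the score, standing in
    # for AUCS(Z_TR, F_TR{j}); the direction is chosen so that AUC >= 0.5.
    a = auc([row[j] for row in class0], [row[j] for row in class1])
    return max(a, 1.0 - a)


def prefilter(class0, class1, keep=50):
    """Return the indices of the `keep` genes with the highest single-gene AUC."""
    n_genes = len(class0[0])
    ranked = sorted(range(n_genes),
                    key=lambda j: single_gene_auc(class0, class1, j),
                    reverse=True)
    return ranked[:keep]


# Toy training sample: rows are specimens, columns are genes.
toy_class0 = [[0, 5, 1], [1, 4, 2]]
toy_class1 = [[2, 5, 1], [3, 4, 2]]
top_gene = prefilter(toy_class0, toy_class1, keep=1)
```

In the toy data only the first gene separates the two classes, so it is the one retained when `keep=1`.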

The greedy algorithm selects the gene *a* with the highest *AUCS*(*Z*_{TR}, *F*_{TR{a}}) and identifies the gene *b* with the highest *AUCS*(*Z*_{TR}, *F*_{TR{a,b}}). If the increase in AUC is less than 0.02, *G* = {*a*}; otherwise *G* includes {*a,b*} and this procedure continues for additional genes.
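The greedy procedure amounts to forward selection with a stopping threshold. The sketch below is illustrative only: `auc_of` is a hypothetical callback standing in for *AUCS*(*Z*_{TR}, *F*_{TR{G}}) evaluated for a candidate gene set, and the AUC values in the toy table are invented to exercise the stopping rule.

```python
def greedy_select(candidates, auc_of, min_gain=0.02):
    """Forward selection: add the best gene until the AUC gain is below min_gain."""
    selected, best_auc = [], None
    remaining = list(candidates)
    while remaining:
        # Gene whose addition gives the highest AUC for the enlarged set.
        trial = max(remaining, key=lambda g: auc_of(selected + [g]))
        trial_auc = auc_of(selected + [trial])
        # The first gene is always accepted; later genes must add min_gain.
        if best_auc is not None and trial_auc - best_auc < min_gain:
            break
        selected.append(trial)
        remaining.remove(trial)
        best_auc = trial_auc
    return selected, best_auc


# HYPOTHETICAL AUC values, for illustration only.
toy_table = {
    ("a",): 0.80, ("b",): 0.70, ("c",): 0.60,
    ("a", "b"): 0.85, ("a", "c"): 0.81,
    ("a", "b", "c"): 0.86,
}
toy_genes, toy_auc = greedy_select(
    ["a", "b", "c"], lambda genes: toy_table[tuple(sorted(genes))]
)
```

Here gene `a` is chosen first, gene `b` adds 0.05 and is kept, and gene `c` adds only 0.01 and is rejected, so the procedure stops with two genes.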

The wrapper algorithm involves five random splits, each with 50% of the training sample constituting a training-training sample (*TR*:*TR*) for formulating the classification rule and 50% constituting a training-test sample (*TR*:*TE*) for computing AUC. The algorithm selects the gene *a* with the highest *AUCS*(*Z*_{TR:TR}, *F*_{TR:TE{a}}) and identifies the gene *b* with the highest *AUCS*(*Z*_{TR:TR}, *F*_{TR:TE{a,b}}). If the increase in AUC is less than 0.01, *G* = {*a*}; otherwise *G* includes {*a,b*} and this procedure continues for additional genes. The wrapper selects the best classification rule, in terms of AUC, among the random splits.
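The outer loop of the wrapper can be sketched as follows. This is a sketch under assumptions, not the exact procedure: the splits are drawn without stratifying by class (the text does not say whether they are stratified), and `fit_and_score` is a hypothetical callback that would formulate the rule in *TR*:*TR* (for example by forward selection with a 0.01 threshold) and return the rule together with its AUC in *TR*:*TE*.

```python
import random


def random_half_split(specimens, rng):
    """Shuffle a copy of the specimens and cut it into two halves."""
    shuffled = specimens[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def wrapper_select(specimens, fit_and_score, n_splits=5, seed=0):
    """Keep the rule with the highest training-test AUC over random splits."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_splits):
        tr_tr, tr_te = random_half_split(specimens, rng)
        # HYPOTHETICAL callback: returns (rule, AUC computed in tr_te).
        rule, auc_value = fit_and_score(tr_tr, tr_te)
        if best is None or auc_value > best[1]:
            best = (rule, auc_value)
    return best
```

Passing the random generator explicitly keeps the five splits reproducible for a given seed.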

For each *S* with *G* already selected, the classification rule selects *D* = 1 if the increase in AUC for *D* = 2 is less than 0.02 and *D* = 2 otherwise. With *D* and *G* selected for each *S*, the classification rule then selects the *S* with the highest AUC.
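These two final choices reduce to a threshold comparison followed by a maximum. The sketch below assumes that the increase for *D* = 2 is measured relative to *D* = 1 with the same *S* and *G*. The formula labels `S1` and `S2` and all AUC values are hypothetical.

```python
def choose_distance(auc_d1, auc_d2, min_gain=0.02):
    """Prefer D = 1 unless D = 2 raises the AUC by at least min_gain."""
    return 2 if auc_d2 - auc_d1 >= min_gain else 1


def choose_rule(auc_by_formula):
    """auc_by_formula maps each score formula to {1: AUC for D=1, 2: AUC for D=2}."""
    choices = {}
    for s, aucs in auc_by_formula.items():
        d = choose_distance(aucs[1], aucs[2])
        choices[s] = (d, aucs[d])
    # Among the formulas, keep the one whose selected D gives the highest AUC.
    best_s = max(choices, key=lambda s: choices[s][1])
    return best_s, choices[best_s][0], choices[best_s][1]


# HYPOTHETICAL values: S1 gains 0.01 from D = 2, S2 gains 0.05.
toy_choice = choose_rule({"S1": {1: 0.80, 2: 0.81},
                          "S2": {1: 0.78, 2: 0.83}})
```

With these values `S1` keeps *D* = 1 (AUC 0.80), `S2` switches to *D* = 2 (AUC 0.83), and `S2` is selected.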

### Computing ROC and RU curves in the test sample

Let *F*_{TR} denote the components of the final classification rule derived from the training sample. Let *z*_{TE(ijk)} denote the gene expression level for gene *j* in specimen *i* of class *k* of the test sample, and let *Z*_{TE(ik)} = {*z*_{TE(ijk)}}. Let *n*_{TE(k)} denote the number of specimens in class *k* of the test sample. At each cutpoint *u*, which corresponds to a decile of the combined distribution of gene expression levels over the two classes, the true positive rate (*TPR*) is the fraction of specimens from class 0 classified as 0, and the false positive rate (*FPR*) is the fraction of specimens from class 1 classified as 0.
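A minimal sketch of the *TPR* and *FPR* computation at decile cutpoints follows. Several details are assumptions: each specimen is taken to have already been reduced to a single value (called a score here), a specimen is classified as class 0 when its score is at least *u* (the direction is not stated above), and the deciles come from the default "exclusive" method of Python's `statistics.quantiles`. The function name `roc_points` is hypothetical.

```python
from statistics import quantiles


def roc_points(scores0, scores1):
    """Return sorted (FPR, TPR) pairs at the nine decile cutpoints."""
    # Deciles of the pooled distribution over the two classes.
    cutpoints = quantiles(scores0 + scores1, n=10)
    points = []
    for u in cutpoints:
        # ASSUMPTION: classify as class 0 when the score is at least u.
        tpr = sum(s >= u for s in scores0) / len(scores0)  # class 0 called 0
        fpr = sum(s >= u for s in scores1) / len(scores1)  # class 1 called 0
        points.append((fpr, tpr))
    return sorted(points)


# Toy scores in which class 0 lies entirely above class 1.
toy_points = roc_points([6, 7, 8, 9, 10], [1, 2, 3, 4, 5])
```

Because the toy classes are perfectly separated, one of the nine cutpoints falls between them and yields the point (0, 1).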

For Goal 1, confidence intervals are computed by bootstrapping the data in the test sample 20 times. For each bootstrap sample, *TPR* at *FPR* = 0.1, 0.2, ..., 0.9 is computed via linear interpolation. The ROC curve plot for the bootstrap iterations consists of the mean ROC curve and upper and lower bounds based on the standard deviation of the ROC curves. An RU curve is computed from the concave envelope of the mean ROC curve, where the risk thresholds are derived from the slopes of the ROC curve. If the concave ROC curve has only one point between (0,0) and (1,1), there are insufficient data to compute an RU curve.
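The bootstrap summary and the concave envelope can be sketched as below. This is an illustration under assumptions: resampling is done within each class, `roc_fn` is a hypothetical callback returning (FPR, TPR) pairs for a resampled test set, and the conversion of envelope slopes into risk thresholds is left out because its formula is not given in this section.

```python
import random
from statistics import mean, stdev

ENDPOINTS = {(0.0, 0.0), (1.0, 1.0)}


def interp_tpr(points, fpr):
    """Linearly interpolate TPR at a given FPR from (FPR, TPR) pairs."""
    pts = sorted(set(points) | ENDPOINTS)
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x1 > x0 and x0 <= fpr <= x1:  # skip vertical segments
            return y0 + (y1 - y0) * (fpr - x0) / (x1 - x0)
    raise ValueError("fpr must lie in [0, 1]")


def bootstrap_band(scores0, scores1, roc_fn, n_boot=20, seed=0):
    """Mean and standard deviation of TPR at FPR = 0.1, ..., 0.9."""
    rng = random.Random(seed)
    grid = [k / 10 for k in range(1, 10)]
    curves = []
    for _ in range(n_boot):
        # ASSUMPTION: resample with replacement within each class.
        b0 = rng.choices(scores0, k=len(scores0))
        b1 = rng.choices(scores1, k=len(scores1))
        pts = roc_fn(b0, b1)
        curves.append([interp_tpr(pts, f) for f in grid])
    columns = list(zip(*curves))
    return grid, [mean(c) for c in columns], [stdev(c) for c in columns]


def concave_envelope(points):
    """Upper hull of the ROC points, including (0,0) and (1,1)."""
    hull = []
    for p in sorted(set(points) | ENDPOINTS):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # Drop the middle point if it lies on or below the chord to p.
            if (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1) >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def envelope_slopes(hull):
    """Slopes of the non-vertical envelope segments, from left to right."""
    return [(y1 - y0) / (x1 - x0)
            for (x0, y0), (x1, y1) in zip(hull, hull[1:]) if x1 > x0]
```

An envelope with three or fewer points has at most one point between (0,0) and (1,1), which corresponds to the insufficient-data case described above.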