Fig. 4

Mean misclassification rates for the somatic CNA data set using the six methods, for different numbers of selected genes p_red: 50, 100, 500 and 750. GLM and R-PLS denote the misclassification rates obtained by applying the GLM to the clinical data alone and PLS to the CNA data alone, respectively. LS-PCR denotes the PCR-derived approach, in which the CNA data are first reduced by PCA and IRLS is then applied to the merged data set of PCA scores and clinical data. LS-PLS-IRLS, R-LS-PLS, and IR-LS-PLS denote the misclassification rates obtained from the newly proposed LS-PLS approaches combining CNA and clinical data. For each method, a line connects the symbols to improve readability.