Fig. 1 | BMC Bioinformatics

Fig. 1

From: Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data

Boxplots of the misclassification rates (left) and AUCs (right) from the 100 simulated data sets. The results were obtained using the six methods for different values of κ_max: (A1, A2): κ_max = 1; (B1, B2): κ_max = 4; (C1, C2): κ_max = 8. GLM and R-PLS denote the misclassification rates and AUCs obtained by applying the GLM to the clinical data alone and PLS to the gene expression data alone, respectively. LS-PCR denotes the approach derived from PCR, in which the gene expression data are analyzed using PCA so that IRLS can be applied to the merged data set of PCA scores and clinical data. LS-PLS-IRLS, R-LS-PLS, and IR-LS-PLS denote the misclassification rates and AUCs obtained from the newly proposed LS-PLS approaches combining expression and clinical data. For clarity, a color code indicates the source of the predictions: pink for clinical data alone, purple for gene expression data alone, and blue for methods combining both types of variables. The number of gene expression variables to pre-select, p_red, is set to 500 in the SIS procedure.
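As a reading aid for the two metrics shown in the boxplots, the following is a minimal sketch of how a misclassification rate and an AUC can be computed for one data set. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 classification threshold are all assumptions made for illustration. The AUC is computed as the proportion of (positive, negative) pairs ranked correctly, with ties counted as one half.

```python
import numpy as np

def misclassification_rate(y_true, y_pred):
    """Fraction of samples whose predicted class differs from the true class."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true != y_pred))

def auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Tied pairs contribute 1/2 (Mann-Whitney formulation).
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("AUC needs at least one sample of each class")
    diff = pos[:, None] - neg[None, :]  # all pairwise score differences
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))

# Hypothetical toy data: true classes and predicted probabilities.
y_true = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])
y_pred = (scores >= 0.5).astype(int)  # assumed threshold of 0.5

err = misclassification_rate(y_true, y_pred)
area = auc(y_true, scores)
```

Repeating such a computation over each simulated data set and each method would give the kind of values a boxplot like this one summarizes.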
