Effect of performing variable selection and increasing the number of variables. The figure shows the predictive accuracy for Class 1 (PA1) as the proportion of Class 1 samples in the training set varies, for three classifiers. Either 40, 1000 or 10000 variables (p) were generated, and 40 variables were selected and used to develop the classifiers. In the left panels the means of 20 variables differed between the two classes, while in the right panels all the variables had different means (see Methods for details on data generation). The training set contained 80 samples; the test set contained 20 samples and was balanced. Additional file 4 shows the results for all the classifiers.
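
The design summarized in this caption can be illustrated with a minimal sketch. It is not the procedure used in the paper: the Gaussian data with unit variance, the mean shift of 1.0, the selection of variables by absolute Welch t-statistic, and the nearest-centroid classifier are all assumptions made here for illustration, and every function name is hypothetical. The Methods section defines the actual data generation and classifiers. PA1 is computed as the fraction of Class 1 test samples that are predicted as Class 1.

```python
import random
import statistics


def simulate(n1, n2, p, p_diff, shift, rng):
    """Draw n1 Class 1 and n2 Class 2 samples with p Gaussian variables.

    Assumption: the first p_diff variables have mean `shift` in Class 2
    and mean 0 in Class 1; all other variables have mean 0 in both classes.
    """
    X, y = [], []
    for label, n in ((1, n1), (2, n2)):
        for _ in range(n):
            X.append([
                rng.gauss(shift if (label == 2 and j < p_diff) else 0.0, 1.0)
                for j in range(p)
            ])
            y.append(label)
    return X, y


def select_variables(X, y, k):
    """Return indices of the k variables with the largest |Welch t| (assumed rule).

    Requires at least two training samples per class.
    """
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 1]
        b = [row[j] for row, lab in zip(X, y) if lab == 2]
        se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
        scores.append(abs(statistics.mean(a) - statistics.mean(b)) / se)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def fit_centroids(X, y, cols):
    """Class means on the selected variables (nearest-centroid stand-in classifier)."""
    centroids = {}
    for lab in (1, 2):
        rows = [row for row, l in zip(X, y) if l == lab]
        centroids[lab] = [statistics.mean(r[j] for r in rows) for j in cols]
    return centroids


def predict(row, centroids, cols):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist2(lab):
        return sum((row[j] - c) ** 2 for j, c in zip(cols, centroids[lab]))
    return 1 if dist2(1) <= dist2(2) else 2


def pa1(y_true, y_pred):
    """Predictive accuracy for Class 1: share of true Class 1 samples predicted as 1."""
    hits = [pred == 1 for true, pred in zip(y_true, y_pred) if true == 1]
    return sum(hits) / len(hits)


def run(prop1, p=1000, p_diff=20, shift=1.0, n_train=80, n_test=20, k=40, seed=0):
    """One simulated replicate: imbalanced training set, balanced test set."""
    rng = random.Random(seed)
    n1 = round(prop1 * n_train)
    X_train, y_train = simulate(n1, n_train - n1, p, p_diff, shift, rng)
    X_test, y_test = simulate(n_test // 2, n_test // 2, p, p_diff, shift, rng)
    cols = select_variables(X_train, y_train, k)
    centroids = fit_centroids(X_train, y_train, cols)
    preds = [predict(row, centroids, cols) for row in X_test]
    return pa1(y_test, preds)


if __name__ == "__main__":
    # Smaller p than in the caption, only to keep the demonstration fast.
    for prop1 in (0.1, 0.5, 0.9):
        print(prop1, run(prop1, p=200))
```

A single replicate is noisy; averaging `run` over many seeds for each proportion would give a curve comparable in spirit to one panel of the figure, under the assumptions stated above.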