Fig. 3 | BMC Bioinformatics

Fig. 3

From: Predicting the pathogenicity of bacterial genomes using widely spread protein families

Prediction performance before and after removing highly correlated features from the training set (excluding the validation set). A The percentage of feature pairs whose correlation falls within each range. The labels on the x-axis give the midpoint of each range; each range has a width of 0.1. B Validation set results of the RF classifier trained on the 450 features selected in the first step, and of the RF classifier trained on the 244 features that remain after removing highly correlated features in the second feature selection step
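
The two ideas in the caption, binning pairwise feature correlations into ranges of width 0.1 and discarding highly correlated features, can be illustrated with a minimal sketch. The caption does not state the correlation measure, the cutoff, or the removal order, so the sketch assumes Pearson correlation, a cutoff of 0.9, and a greedy first-come order. The function names `pearson`, `correlation_histogram` and `drop_correlated` and the toy data are hypothetical, not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def correlation_histogram(columns, width=0.1):
    """Percentage of feature pairs per correlation bin over [-1, 1] (as in panel A)."""
    n_bins = round(2 / width)
    counts = [0] * n_bins
    pairs = 0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            r = pearson(columns[i], columns[j])
            # Clamp so that r == 1.0 (or rounding noise) lands in a valid bin.
            idx = max(0, min(int((r + 1) / width), n_bins - 1))
            counts[idx] += 1
            pairs += 1
    return [100 * c / pairs for c in counts]


def drop_correlated(columns, threshold=0.9):
    """Greedy filter: keep a feature only if |r| <= threshold (assumed cutoff)
    against every feature kept so far. Returns the indices of kept features."""
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) <= threshold for k in kept):
            kept.append(j)
    return kept


# Toy data: each inner list is one feature measured on four training genomes.
columns = [
    [1, 2, 3, 4],
    [2, 4, 6, 8],  # perfectly correlated with the first feature
    [4, 1, 3, 2],
]
hist = correlation_histogram(columns)
kept = drop_correlated(columns)
print(kept)
```

In this toy run the second feature is dropped because it duplicates the first. In practice both functions would be applied to the training set only, as the caption specifies, so that the validation set plays no part in feature selection.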
