Fig. 3 | BMC Bioinformatics

Fig. 3

From: The parameter sensitivity of random forests

Prediction accuracy is a strong function of parameterization in high p/n studies. Summary of the predicted votes on the combined validation data for each fitted random forest model (n = 1000). A barplot of AUC scores at the top indicates the relative performance of each model; each column represents one model. Each model was fitted from a unique combination of n_tree (n = 10), m_try (n = 10) and sampsize (n = 10) parameter values, and its outcomes (votes) are shown for each sample, or row (n = 186). Votes take values from 0 to 1, with 0 representing a “no death” event and 1 representing a “death” event. Columns are ordered by descending AUC score, and rows are ordered by descending fraction of correct votes for a given sample (total votes for the true sample class/all votes). Samples are grouped by their true class labels, “death” and “no death”, though the votes may not reflect these labels. To the right of each main heatmap is a barplot of the vote fractions, and a heatmap of parameter values is shown at the bottom of the figure. The n_tree parameter is illustrated in blue, m_try in magenta and sampsize in orange. Lighter hues represent lower values and darker hues represent higher values. To the right of the parameter heatmap is a scatterplot of the Spearman's correlation of each parameter with the AUC scores; positive correlations were observed for n_tree, m_try and sampsize (ρ = 0.222, p < 10⁻¹⁰; ρ = 0.238, p < 10⁻¹²; ρ = 0.207, p < 10⁻⁹, respectively).
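
The quantities named in the caption (a 10 × 10 × 10 parameter grid, per-model AUC, per-sample fraction of correct votes, and Spearman's ρ between a parameter and AUC) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' code: the grid uses placeholder level indices 0 to 9 because the caption does not give the actual parameter values, the function names are hypothetical, and the fraction of correct votes is interpreted here as the mean vote for a sample's true class across models.

```python
from itertools import product

# Placeholder levels (0..9) for the three parameters; the real values are not in the caption.
grid = list(product(range(10), range(10), range(10)))  # one tuple per fitted model

def correct_vote_fraction(votes, is_death):
    """Mean vote for the true class; votes are 'death' vote fractions in [0, 1]."""
    mean_death = sum(votes) / len(votes)
    return mean_death if is_death else 1.0 - mean_death

def auc(scores, labels):
    """AUC as the chance a 'death' sample outscores a 'no death' sample (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed values."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def descending_order(values):
    """Indices that sort values from highest to lowest, as used for columns and rows."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)
```

Given one AUC per model, `descending_order(aucs)` would give the column order, and `spearman([g[0] for g in grid], aucs)` would give the correlation of the first parameter's level with AUC.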
