Fig. 4 | BMC Bioinformatics

Fig. 4

From: The parameter sensitivity of random forests

Importance ranks can be sensitive to parameter changes in low p/n studies. Summary of the variable importance ranks for each sequencing metric (n = 15). An AUC plot at the top indicates the relative performance of each model, with one model per column. Each model was fitted from a unique combination of the ntree (n = 10), mtry (n = 15) and sampsize (n = 10) parameters, yielding an importance value for each metric. Each column of the main heatmap corresponds to one model's importance values, which were ranked from 1–15, where 1 represents the most important feature and 15 the least. The models' importance values were ordered according to previously calculated AUC scores, obtained from the predicted votes and true class labels. Each row represents a metric, and rows are ordered according to the mean rank of their importance values. The ranks were simplified in the main heatmap into four groups only: blue indicates a rank of 1, green a rank of 2, gold a rank of 3, and beige a rank of 4 or greater. A summary of the overall rank groups for each metric is shown in a barplot to the right of the main heatmap, and a covariate heatmap with all parameter combinations is shown at the bottom of the plot. The ntree parameter is shown in blue, mtry in pink and sampsize in orange. Some variables, such as “Uncollapsed coverage” and “% bases ≥ 50 quality”, were robust to parameter changes: they were ranked between 11–15 inclusive in 96 % and 95 % of all samples, respectively. These variables had VIMs suggesting that they were less influential on classification accuracy. “Average reads/starts” was likewise insensitive to parameter changes, but was considered the most important variable. Another variable, “Clusters”, was parameter sensitive, illustrating that variables vary in their sensitivity to parameter changes, which can ultimately influence classification accuracy.
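The ranking and grouping steps described in the caption can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' code: the function names, the synthetic importance matrix and the tie-breaking rule (ties are broken by row order) are all assumptions made for the example. The matrix shape of 15 metrics by 1,500 models assumes that every combination of the 10 × 15 × 10 parameter values was fitted.

```python
import numpy as np


def rank_importance(importance):
    """Rank metrics within each model (column); rank 1 = largest importance.

    `importance` has shape (n_metrics, n_models). Ties are broken by row
    order, which is an assumption of this sketch.
    """
    # The first argsort orders metrics from most to least important per column;
    # the second argsort converts that ordering into 0-based ranks.
    order = np.argsort(-importance, axis=0, kind="stable")
    return np.argsort(order, axis=0, kind="stable") + 1


def group_ranks(ranks):
    """Collapse ranks into four groups: 1, 2, 3, and 4 for 'rank 4 or greater'."""
    return np.minimum(ranks, 4)


def order_rows_by_mean_rank(ranks):
    """Return row indices sorted by mean rank across models (best first)."""
    return np.argsort(ranks.mean(axis=1), kind="stable")


if __name__ == "__main__":
    # Synthetic importance values only; these are not data from the study.
    rng = np.random.default_rng(0)
    importance = rng.random((15, 1500))

    ranks = rank_importance(importance)
    groups = group_ranks(ranks)
    row_order = order_rows_by_mean_rank(ranks)

    # Heatmap-ready matrix: rows sorted by mean rank, values in {1, 2, 3, 4}.
    heatmap = groups[row_order, :]
    print(heatmap.shape)
```

Ordering the columns by AUC would be an additional `np.argsort` over a per-model AUC vector; the sort direction used in the figure is not stated in the caption, so it is left out here.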