Examples illustrating the difficulty of relying on correlation coefficients as performance measures. Predicted model quality scores are plotted against the observed combined model quality scores on a target-by-target basis for models submitted by the automated fold recognition servers to the CASP7 tertiary structure category (AL and TS models are included). a) The scaled MODCHECK scores are compared with the ModSSEA scores for the target T0304 models. Spearman's rank correlation coefficient (ρ) between the MODCHECK scores and the observed model quality scores is 0.66, and the observed model quality of the top-ranked model (m) is 0.27 (the data point is circled in blue). The correlation coefficient for the ModSSEA method is lower (ρ = 0.50); however, the quality of its top-ranked model is higher (m = 0.34) (the data point is circled in red). b) The ProQ scores are compared with the ModSSEA scores for the target T0283 models. For ProQ, ρ = 0.50 and m = 0.01, whereas for ModSSEA, ρ = 0.40 and m = 0.48. c) The scaled MODCHECK scores are compared with the ModFOLD scores for the target T0289 models. For MODCHECK, ρ = 0.61 and m = 0.13, whereas for ModFOLD, ρ = 0.53 and m = 0.47. d) The ProQ scores are compared with the ModFOLD scores for the target T0321 models. For ProQ, ρ = 0.48 and m = 0.11, whereas for ModFOLD, ρ = 0.17 and m = 0.24.
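
The two quantities reported for each panel, ρ and m, can be computed per target as in the following minimal sketch. It is an illustration only, not the authors' evaluation code: the function names (`ranks`, `spearman_rho`, `top_ranked_quality`) and the six-model score lists are hypothetical and were chosen so that the method with the higher ρ selects the poorer top-ranked model.

```python
def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)


def top_ranked_quality(predicted, observed):
    """Observed quality (m) of the model with the highest predicted score."""
    best = max(range(len(predicted)), key=lambda i: predicted[i])
    return observed[best]


# Hypothetical data for one target: six models, not taken from CASP7.
observed = [0.10, 0.15, 0.20, 0.25, 0.30, 0.50]
method_a = [0.10, 0.20, 0.90, 0.30, 0.40, 0.50]  # good overall ordering, poor top pick
method_b = [0.40, 0.30, 0.20, 0.10, 0.50, 0.60]  # weaker ordering, best model on top

for name, predicted in (("A", method_a), ("B", method_b)):
    rho = spearman_rho(predicted, observed)
    m = top_ranked_quality(predicted, observed)
    print(f"method {name}: rho = {rho:.2f}, m = {m:.2f}")
```

In this toy case method A has the higher ρ while method B picks the better model, which mirrors the pattern shown in each panel: ρ summarises the ordering of all models, whereas m depends only on the single model ranked first.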