Simulation results. Results for the simulation studies detailed in Section “Evaluation using simulated gene sets and simulated data”. For all plots, error bars represent ±1 SE of the mean value over all 1000 simulated datasets. a)-c) Results for the type I error simulation study based on MVN data generated with an identity population covariance matrix. This model is consistent with H0. d)-f) Results for the power simulation study based on MVN data generated according to a single-factor population covariance matrix. Under this model, an association exists between the first gene set and PC 1. g)-i) Results for the power simulation study based on MVN data generated according to a two-factor population covariance matrix. Under this model, an association exists between the first gene set and PCs 1 and 2. a), d) and g) Mean p-values computed using the PCGSE method for the first simulated gene set relative to the first 5 PCs. b), e) and h) Mean weights used by the SGSE method to combine the PCGSE-computed p-values for each gene set relative to the first 5 PCs. PC variance weights are shown as round points connected by a solid line. Weights defined by the PC variance scaled by the lower-tailed p-value computed using the Tracy-Widom distribution for the PC variance are shown as square points connected by a dashed line. c), f) and i) Quantile-quantile plots of the p-values computed using the SGSE method, with either PC variance weights (Var.) or weights defined by the PC variance scaled by the lower-tailed Tracy-Widom p-value of the PC variance (TW*Var.), and of the p-values computed using the benchmark method, which applies a chi-squared test between cluster membership and gene set membership (Chisq).
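As an illustration of how the PC variance weights in panels b), e) and h) respond to the population covariance, the following minimal sketch simulates MVN data under an identity covariance and under a hypothetical single-factor covariance, then normalizes the variances of the first 5 PCs. The function name, sample size, number of genes, gene set size, factor loading and the normalization to unit sum are all assumptions made for illustration and are not taken from the study; the Tracy-Widom scaling is omitted.

```python
import numpy as np

def pc_variance_weights(n_samples=50, n_genes=200, n_pcs=5,
                        loading=0.0, set_size=20, seed=0):
    """Simulate MVN data and return normalized variances of the top PCs.

    All defaults are illustrative assumptions, not the study's settings.
    loading=0 gives an identity population covariance (the null model);
    loading>0 adds one factor shared by the first `set_size` genes.
    """
    rng = np.random.default_rng(seed)
    factor = np.zeros(n_genes)
    factor[:set_size] = 1.0
    # Identity covariance plus an optional rank-one (single-factor) term.
    cov = np.eye(n_genes) + loading * np.outer(factor, factor)
    x = rng.multivariate_normal(np.zeros(n_genes), cov, size=n_samples)
    x -= x.mean(axis=0)  # center each gene before PCA
    # Singular values of the centered data give the PC variances.
    singular_values = np.linalg.svd(x, compute_uv=False)
    variances = singular_values ** 2 / (n_samples - 1)
    top = variances[:n_pcs]
    return top / top.sum()  # assumed normalization: weights sum to 1

w_null = pc_variance_weights(loading=0.0)
w_factor = pc_variance_weights(loading=1.0)
print("identity covariance:     ", np.round(w_null, 3))
print("single-factor covariance:", np.round(w_factor, 3))
```

Under the identity covariance the five weights should be close to one another, whereas under the single-factor covariance the weight for PC 1 should dominate, which is the qualitative pattern the variance-weight curves are meant to convey.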