Figure 1 | BMC Bioinformatics

Figure 1

From: Gene set bagging for estimating the probability a statistically significant result will replicate

Replicability assessed from the simulations. Simulation 1. Observed gene set p-values based on the (A) hypergeometric and (B) Wilcoxon rank tests, and the subsequent replication probabilities, were calculated. The x-axis is the proportion of observed p-values that are less than 0.05 for each gene set and the y-axis is the average replication probability for that gene set. Spearman correlations were calculated to avoid issues with non-linearity. Simulation 2. The gene set p-values $p_l$ and replication probabilities $\hat{R}_l$ were calculated for each dataset, where 100 pairs of datasets with common differentially expressed genes were simulated. The Spearman correlation of the gene set p-values $p_l$, $l = 1, \ldots, L$, was calculated for each pair of datasets, and analogously for the replication probabilities $\hat{R}_l$. The 100 resulting correlations of gene set p-values or replication probabilities are shown for (C) all gene sets and (D) those significant in either paired dataset at p<0.05. For the hypergeometric test, the replication probability correlates better between independent datasets than the p-value does for significant gene sets, and similarly when all gene sets (significant and non-significant) are considered.
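The Simulation 2 comparison for one pair of datasets could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names (`average_ranks`, `spearman`, `paired_correlations`) are hypothetical, and the per-gene-set statistics (p-values or replication probabilities) are assumed to be already computed, since the bagging step that produces the replication probabilities is not shown here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    n = len(x)
    if n < 2:
        return float("nan")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def paired_correlations(stat_a, stat_b, p_a, p_b, alpha=0.05):
    """Correlate a per-gene-set statistic between two paired datasets.

    stat_a, stat_b: the statistic being compared (p-values or
        replication probabilities), one entry per gene set.
    p_a, p_b: gene set p-values, used only to select the gene sets
        that are significant in either dataset (assumed threshold alpha).
    Returns (correlation over all gene sets,
             correlation over gene sets significant in either dataset).
    """
    keep = [l for l in range(len(p_a)) if p_a[l] < alpha or p_b[l] < alpha]
    all_sets = spearman(stat_a, stat_b)
    significant = spearman([stat_a[l] for l in keep],
                           [stat_b[l] for l in keep])
    return all_sets, significant
```

Calling `paired_correlations` once per simulated pair, first with the p-values and then with the replication probabilities as `stat_a` and `stat_b`, would give two sets of 100 correlations of the kind summarized in panels (C) and (D).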
