Figure 2 | BMC Bioinformatics

Figure 2

From: Comprehensive analysis of correlation coefficients estimated from pooling heterogeneous microarray data


Mean differences across a pool of N groups cause Simpson’s paradox. (a) r_xy was obtained by combining N groups of simulated data; the simulation parameter ρ_xy is the true correlation of the pair xy within each group i = 1, ..., N. All other simulation parameters are as follows: μ_xy,i = (i, N−(i−1)), −0.9 ≤ ρ_xy ≤ 0.9, n_i = 10, λ_i = λ with 0.01 ≤ λ ≤ 0.1 for i = 1, ..., N, and 10 ≤ N ≤ 100. (b) Scatterplot of a pair xy obtained with the simulation parameters μ_xy,i = (i, 11−i), ρ_xy = 0.9 and n_i = 50 for groups i = 1, ..., 10. Although the trend within each of the 10 groups is positive, the trend across the pool of 10 groups is negative (Simpson’s paradox).
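
The setup of panel (b) can be reproduced approximately with a short simulation. The following is a minimal sketch, not the authors' code: it assumes bivariate normal data within each group, and the common within-group standard deviation `sigma` is a hypothetical parameter chosen only for illustration. The caption's λ is not defined on this page, so it is not modeled here. The function names `simulate_pool` and `pearson` are likewise illustrative.

```python
import numpy as np


def simulate_pool(n_groups=10, n_per_group=50, rho=0.9, sigma=0.3, seed=0):
    """Draw n_groups bivariate samples with group means (i, n_groups - (i - 1)).

    sigma is an assumed within-group standard deviation (not from the caption).
    """
    rng = np.random.default_rng(seed)
    # Same within-group covariance for every group: correlation rho, sd sigma.
    cov = sigma**2 * np.array([[1.0, rho], [rho, 1.0]])
    groups = []
    for i in range(1, n_groups + 1):
        # x-mean rises with i while y-mean falls, as in the caption.
        mean = np.array([i, n_groups - (i - 1)], dtype=float)
        groups.append(rng.multivariate_normal(mean, cov, size=n_per_group))
    return groups


def pearson(xy):
    """Pearson correlation between the two columns of an (n, 2) array."""
    return np.corrcoef(xy[:, 0], xy[:, 1])[0, 1]


groups = simulate_pool()
within_r = [pearson(g) for g in groups]   # one correlation per group
pooled_r = pearson(np.vstack(groups))     # correlation after pooling all groups

print("within-group r:", np.round(within_r, 2))
print("pooled r:", round(pooled_r, 2))
```

Under these assumptions every within-group correlation is positive while the pooled correlation is negative, because the between-group covariance of the means (negative by construction) dominates the within-group covariance.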
