
Table 3 Detection rates for the six gene set analysis methods on realistic simulated data sets.

From: Microarray-based gene set analysis: a comparison of current methods

Simulations based on diabetes data [3]

| Gene set activity | PCOT2 | SAFE | GSEA-Category | GSEA-limma | Globaltest | sigPathway |
| --- | --- | --- | --- | --- | --- | --- |
| On-Off [D] | 0.997 (0.003) | 0.637 (0.031) | 1 (0) | 0.587 (0.019) | 1 (0) | 0.773 (0.024) |
| On-On [D] | 0.07 (0.013) | 0.007 (0.005) | 0.063 (0.015) | 0 (0) | 0.087 (0.017) | 0.047 (0.012) |
| Off-Off [ND] | 0.993 (0.003) | 1 (0) | 0.997 (0.002) | 1 (0) | 0.997 (0.002) | 0.997 (0.002) |
| On-On [ND] | 0.997 (0.003) | 1 (0) | 1 (0) | 1 (0) | 0.997 (0.003) | 1 (0) |

Simulations based on leukemia data [22]

| Gene set activity | PCOT2 | SAFE | GSEA-Category | GSEA-limma | Globaltest | sigPathway |
| --- | --- | --- | --- | --- | --- | --- |
| On-Off [D] | 1 (0) | 0.384 (0.031) | 0.998 (0.002) | 0.552 (0.021) | 1 (0) | 0.966 (0.007) |
| On-On [D] | 0.5 (0.023) | 0 (0) | 0.444 (0.022) | 0 (0) | 0.584 (0.023) | 0.18 (0.017) |
| Off-Off [ND] | 0.997 (0.001) | 1 (0) | 0.998 (0.001) | 1 (0) | 0.99 (0.003) | 0.995 (0.002) |
| On-On [ND] | 0.994 (0.003) | 1 (0) | 0.99 (0.004) | 1 (0) | 0.99 (0.004) | 0.996 (0.003) |

  1. 100 data sets were simulated, with gene numbers and gene set membership determined by data-specific values derived from the diabetes [3] and leukemia [22] data sets. Each simulated data set was analyzed by each gene set analysis method, using 2,000 permutations to generate p-values, which were then adjusted to control the FDR. Gene sets were called significant at an adjusted p-value threshold of 0.05. Each cell gives the proportion of gene sets of the given activity type that the method correctly identified. Standard errors are shown in parentheses.
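
The scoring step described in the note above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure used to produce the table: the note does not name the FDR adjustment, so Benjamini-Hochberg is assumed here; the standard error is assumed to be the standard deviation of per-data-set proportions divided by the square root of the number of data sets; and a call is assumed to be "correct" when its significance matches what is expected for that gene set type. The function names `bh_adjust`, `correct_calls`, and `detection_rate` are hypothetical.

```python
import math
from statistics import mean, stdev


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values (assumed FDR procedure)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the
    # adjusted values stay monotone in the original ordering.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def correct_calls(adjusted, expected_significant, alpha=0.05):
    """True where the call at threshold alpha matches the expectation.

    Assumption: a gene set is called significant when its adjusted
    p-value is at or below alpha.
    """
    return [(p <= alpha) == expected_significant for p in adjusted]


def detection_rate(calls_per_dataset):
    """Mean and standard error of per-data-set proportions of correct calls.

    Assumption: the standard error is taken across simulated data sets.
    """
    props = [sum(calls) / len(calls) for calls in calls_per_dataset]
    se = stdev(props) / math.sqrt(len(props)) if len(props) > 1 else 0.0
    return mean(props), se


# Toy usage with made-up p-values for four gene sets in one data set.
adjusted = bh_adjust([0.001, 0.02, 0.03, 0.8])
calls = correct_calls(adjusted, expected_significant=True)
rate, se = detection_rate([calls, [True, True, True, True]])
```

In this toy run, three of the four gene sets in the first data set pass the 0.05 threshold after adjustment, so the first data set contributes a proportion of 0.75 and the second a proportion of 1.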