Table 2 Ability to discover TFB motif from ChIP-seq data sets

From: A Monte Carlo-based framework enhances the discovery and interpretation of regulatory sequence motifs

| Motif-finding strategy | TP | TN | FP | FN | Sensitivity | Specificity | F-measure |
|---|---|---|---|---|---|---|---|
| Theoretical best | 28 | 8 | 0 | 0 | 1.00 | 1.00 | 1.00 |
| MEME ZOOPS | 5.5 | 8 | 0 | 22.5 | 0.20 | 1.00 | 0.33 |
| Iterative MEME/MAST MC | 16 | 8 | 0 | 12 | 0.57 | 1.00 | 0.73 |
| Gibbs recursive | 15 | 8 | 0 | 13 | 0.54 | 1.00 | 0.70 |
| Iterative Gibbs/MAST MC | 18 | 8 | 0 | 10 | 0.64 | 1.00 | 0.78 |
| Iterative Gibbs/MAST MC +Ideal | 23 | 8 | 0 | 5 | 0.82 | 1.00 | 0.90 |

  1. Five groups of 7 sequence data sets were constructed from the putative binding sites derived from TfbB, TfbD, and TfbG ChIP-seq experiments. Three lengths of sequence surrounding each site (60 bp, 100 bp, and 200 bp) were examined, as were 60-bp stretches of sequence displaced 60 bp from the ChIP-seq sites (displaced), and data sets built by randomly shuffling binding sites from the 7 different TFB binding site groups into new groups of equal size (shuffled). A data set of 126 60-bp segments from the Hb sp. NRC-1 genome was generated as an additional control (random). Evaluation of these 36 data sets with 4 alternative motif-finding strategies revealed distinct differences in the ability of each strategy to discover the putative TFB motif. For all data sets except the random and displaced data sets, discovery of a strong match to the TFB motif was scored as a true positive (TP). Failure to discover a TFB motif in the random and displaced data sets was scored as a true negative (TN). If a weak match to the TFB motif was discovered in any data set other than a random or displaced data set, it was scored as a half TP (0.5). One hundred runs were carried out for the Iterative MEME/MAST MC and Iterative Gibbs/MAST MC strategies. A number of ‘ideal’ seeds were artificially created and were found to converge to the TFB motif. Given a very large number of runs, an ideal or near-ideal seed is expected to occur by chance, so the TFB motif would be recovered in these cases. For both MEME and the Gibbs recursive sampler, the application of the MotifCatcher extension significantly improved each finder’s performance.
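The footnote does not state how the summary columns were computed. The sketch below assumes the standard definitions: sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), and F-measure as the harmonic mean of precision and sensitivity. Under that assumption the tabulated values are reproduced to two decimal places. The function name `summary_metrics` and the `rows` dictionary are illustrative and do not come from the source.

```python
def summary_metrics(tp, tn, fp, fn):
    """Return (sensitivity, specificity, F-measure) from confusion counts.

    Assumes the standard definitions; the source table does not state
    its formulas. Half-credit TPs (0.5) are handled as plain floats.
    """
    sensitivity = tp / (tp + fn)   # fraction of the 28 positive data sets recovered
    specificity = tn / (tn + fp)   # fraction of the 8 control data sets correctly rejected
    precision = tp / (tp + fp)     # undefined if no motif is ever reported (tp + fp == 0)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f_measure


# Confusion counts copied from the table: (TP, TN, FP, FN).
rows = {
    "Theoretical best": (28, 8, 0, 0),
    "MEME ZOOPS": (5.5, 8, 0, 22.5),
    "Iterative MEME/MAST MC": (16, 8, 0, 12),
    "Gibbs recursive": (15, 8, 0, 13),
    "Iterative Gibbs/MAST MC": (18, 8, 0, 10),
    "Iterative Gibbs/MAST MC +Ideal": (23, 8, 0, 5),
}

for name, counts in rows.items():
    sens, spec, f1 = summary_metrics(*counts)
    print(f"{name}: sensitivity={sens:.2f} specificity={spec:.2f} F={f1:.2f}")
```

In every row TP + FN = 28 (the 60 bp, 100 bp, 200 bp, and shuffled groups of 7) and TN + FP = 8 (the 7 displaced data sets plus the random control), which matches the 36 data sets described in the footnote. The MEME ZOOPS row (5.5 TP, 22.5 FN) suggests that the unscored half of a weak match was counted toward FN, although the footnote does not say so explicitly.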