
Table 1 Contingency table for each motif used for MI calculation

From: DiNAMO: highly sensitive DNA motif discovery in high-throughput sequencing data

 

|  | positive dataset | negative dataset |
| --- | --- | --- |
| presence of the motif m | a | b |
| absence of the motif m | c | d |

  1. a (resp. b) represents the number of sequences of \(\mathcal{P}\) (resp. \(\mathcal{N}\)) that contain at least one occurrence of the motif m. c (resp. d) represents the number of remaining sequences in \(\mathcal{P}\) (resp. \(\mathcal{N}\)). These four values are used to estimate the joint probabilities \(P(O_i, C_j)\) as well as the marginal probabilities \(P(O_i)\) and \(P(C_j)\).
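
As an illustration of how the four counts can feed an MI calculation, here is a minimal Python sketch. It assumes the textbook definition of mutual information, with each probability estimated as a count divided by the total number of sequences, O read as motif occurrence (presence or absence) and C as the dataset class (positive or negative). The function name `mutual_information` and the choice of log base 2 are assumptions made for this example and are not taken from the DiNAMO implementation.

```python
import math


def mutual_information(a, b, c, d):
    """Estimate MI (in bits) from a 2x2 contingency table.

    a, b: sequences containing the motif in the positive / negative dataset.
    c, d: sequences without the motif in the positive / negative dataset.
    Assumption: probabilities are plain relative frequencies.
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("the contingency table is empty")

    # Joint probabilities P(O_i, C_j): rows = presence/absence, columns = dataset.
    joint = [[a / n, b / n], [c / n, d / n]]

    # Marginals P(O_i) (row sums) and P(C_j) (column sums).
    p_o = [joint[i][0] + joint[i][1] for i in range(2)]
    p_c = [joint[0][j] + joint[1][j] for j in range(2)]

    mi = 0.0
    for i in range(2):
        for j in range(2):
            p = joint[i][j]
            # Cells with zero probability contribute nothing (0 * log 0 := 0).
            if p > 0:
                mi += p * math.log2(p / (p_o[i] * p_c[j]))
    return mi
```

With this sketch, a motif present in every positive sequence and in no negative one, for example `mutual_information(50, 0, 0, 50)`, gives 1 bit, while a motif spread evenly across both datasets, for example `mutual_information(25, 25, 25, 25)`, gives 0.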