DiNAMO: highly sensitive DNA motif discovery in high-throughput sequencing data

BMC Bioinformatics

Table 1 Contingency table for each motif used for MI calculation

	positive dataset	negative dataset
presence of the motif m	a	b
absence of the motif m	c	d

a (resp. b) represents the number of sequences of \(\mathcal {P}\) (resp. \(\mathcal {N}\)) that contain at least one occurrence of the motif m. c (resp. d) represents the number of remaining sequences in \(\mathcal {P}\) (resp. \(\mathcal {N}\)), i.e. those that contain no occurrence of m. These four values are used to estimate the joint probabilities \(P(O_i,C_j)\) as well as the marginal probabilities \(P(O_i)\) and \(P(C_j)\).
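To make the role of the four counts concrete, the following is a minimal sketch of how a mutual information score could be computed from such a contingency table. It uses the standard definition of mutual information, \(\sum_{i,j} P(O_i,C_j)\log \frac{P(O_i,C_j)}{P(O_i)P(C_j)}\), and it rests on several assumptions that are not stated in the table itself: \(O_i\) is read as presence or absence of the motif, \(C_j\) as membership in the positive or negative dataset, the logarithm is taken in base 2, and empty cells contribute zero. The function name `mutual_information` is hypothetical and is not part of DiNAMO.

```python
import math


def mutual_information(a, b, c, d):
    """Mutual information (in bits) from the 2x2 counts a, b, c, d.

    Illustrative sketch only: base-2 logarithm and the handling of
    empty cells are assumptions, not taken from the DiNAMO source.
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("contingency table is empty")

    table = [[a, b], [c, d]]        # rows: presence, absence of m
    row_totals = [a + b, c + d]     # counts behind P(O_i)
    col_totals = [a + c, b + d]     # counts behind P(C_j)

    mi = 0.0
    for i in range(2):
        for j in range(2):
            count = table[i][j]
            if count == 0:
                continue            # treat 0 * log(0) as 0
            p_joint = count / n
            p_o = row_totals[i] / n
            p_c = col_totals[j] / n
            mi += p_joint * math.log2(p_joint / (p_o * p_c))
    return mi
```

Under these assumptions, a motif present in every positive sequence and in no negative sequence (for example a = 50, b = 0, c = 0, d = 50) gives a score of 1 bit, whereas a motif that is equally frequent in both datasets (a = b = c = d) gives a score of 0.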

ISSN: 1471-2105