| | positive dataset | negative dataset |
|---|---|---|
presence of the motif m | a | b |
absence of the motif m | c | d |
- a (resp. b) represents the number of sequences of \(\mathcal {P}\) (resp. \(\mathcal {N}\)) that contain at least one occurrence of the motif m. c (resp. d) represents the number of other sequences in \(\mathcal {P}\) (resp. \(\mathcal {N}\)), i.e. those that do not contain m. Those four values are used to estimate the joint probabilities \(P(O_i, C_j)\) as well as the marginal probabilities \(P(O_i)\) and \(P(C_j)\).
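
As an illustration, the estimation could be sketched as below. The text does not say which estimator is used, so this sketch assumes plain relative frequencies (each count divided by the total number of sequences a + b + c + d). It also assumes that \(O_i\) ranges over presence/absence of the motif and \(C_j\) over the positive/negative class. The function name `estimate_probabilities` and the dictionary keys are hypothetical.

```python
from fractions import Fraction


def estimate_probabilities(a, b, c, d):
    """Relative-frequency estimates from a 2x2 contingency table.

    a, b: sequences containing the motif in the positive / negative dataset.
    c, d: sequences without the motif in the positive / negative dataset.
    Assumption: probabilities are estimated as counts divided by the total.
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("contingency table is empty")

    # Joint probabilities P(O_i, C_j): one cell of the table over the total.
    joint = {
        ("present", "positive"): Fraction(a, n),
        ("present", "negative"): Fraction(b, n),
        ("absent", "positive"): Fraction(c, n),
        ("absent", "negative"): Fraction(d, n),
    }
    # Marginal probabilities P(O_i): row sums over the total.
    p_occurrence = {
        "present": Fraction(a + b, n),
        "absent": Fraction(c + d, n),
    }
    # Marginal probabilities P(C_j): column sums over the total.
    p_class = {
        "positive": Fraction(a + c, n),
        "negative": Fraction(b + d, n),
    }
    return joint, p_occurrence, p_class
```

For example, with the made-up counts a = 30, b = 10, c = 20, d = 40, the call `estimate_probabilities(30, 10, 20, 40)` gives a joint estimate of 3/10 for (present, positive), a marginal of 2/5 for motif presence and a marginal of 1/2 for the positive class. Exact fractions are used here only to keep the arithmetic easy to check; floats would work equally well.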