An overview of the MTAP running procedure. A. All known binding positions in G are collected into upstream regions corresponding to each CDS in G. B. A transformation function t(a, n) creates a test for binding protein a, n bases upstream of each CDS (note that the transcription start site is often unknown or not correctly annotated). C. Background probability information for the entire genome is collected by comparing the upstream regions from the entire genome (or ∀ k) to the foreground regions selected by t(a, n). D. Pipeline p runs each step of the proposed method M. E. M creates a marking on the sequences in B that is evaluated against all marked transcription factor binding positions in B to score the performance of M in recovering binding sites for transcription factor a. This is then repeated for transcription factors b and c.
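
Steps B and E can be illustrated with a minimal Python sketch. It is not MTAP's implementation: the names CDS, upstream_window, make_test and score are hypothetical, coordinates are assumed to be 0-based half-open intervals on the forward strand, and nucleotide-level sensitivity and positive predictive value stand in for whatever scoring the platform actually applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CDS:
    # Hypothetical record; [start, end) in forward-strand coordinates.
    name: str
    start: int
    end: int
    strand: str  # "+" or "-"


def upstream_window(cds, n, genome_len):
    """Return the half-open interval covering n bases upstream of a CDS."""
    if cds.strand == "+":
        return max(0, cds.start - n), cds.start
    # On the reverse strand, upstream lies at higher forward coordinates.
    return cds.end, min(genome_len, cds.end + n)


def make_test(a, n, cdss, sites, genome_len):
    """Sketch of t(a, n): for each CDS, keep the known sites of factor a
    that fall entirely inside its n-base upstream window.

    sites is a list of (factor, lo, hi) tuples (assumed layout).
    """
    tests = {}
    for cds in cdss:
        lo, hi = upstream_window(cds, n, genome_len)
        inside = [(s_lo, s_hi) for f, s_lo, s_hi in sites
                  if f == a and lo <= s_lo and s_hi <= hi]
        tests[cds.name] = ((lo, hi), inside)
    return tests


def score(predicted, known):
    """Nucleotide-level sensitivity and positive predictive value
    (assumed metrics) for two sets of marked positions."""
    tp = len(predicted & known)
    sensitivity = tp / len(known) if known else 0.0
    ppv = tp / len(predicted) if predicted else 0.0
    return sensitivity, ppv
```

Under these assumptions, make_test("a", 50, cdss, sites, genome_len) would build the test for factor a from 50 upstream bases per CDS, and score would compare the positions marked by M with the known sites in that test. Repeating the call for b and c mirrors the last sentence of the caption.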