From: Predicting tissue specific cis-regulatory modules in the human genome using pairs of co-occurring motifs

Method overview. The diagram illustrates the workflow of the system. During training, the system contrasts sequences in the mixed set with control sequences to identify motif pairs that are enriched in the mixed set. It then identifies and scores candidate sequences, that is, sequences that contain at least one of the enriched pairs. A Bayesian classifier is trained on these scores to distinguish candidates in the mixed set from candidates in the control set. During validation, the list of enriched pairs and the trained classifier are used to classify sequences in the validation set. Training and validation are repeated to find the parameters that give the best performance on the validation set. Finally, CrmMiner is tested on sequences in the testing set.
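
The training steps described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: motifs are plain substrings rather than position weight matrices, enrichment is a smoothed frequency ratio with a hypothetical `min_ratio` threshold, a sequence's score is simply the number of enriched pairs it contains, and `ScoreBayes` is a toy one-feature Bayesian classifier with add-one smoothing.

```python
from collections import Counter
from itertools import combinations
import math


def pairs_in(seq, motifs):
    """Return the set of motif pairs that co-occur in seq (substring match)."""
    present = sorted(m for m in motifs if m in seq)
    return set(combinations(present, 2))


def enriched_pairs(mixed, control, motifs, min_ratio=2.0):
    """Keep pairs whose smoothed frequency in the mixed set is at least
    min_ratio times their smoothed frequency in the control set.
    (Illustrative criterion; the caption does not specify the test used.)"""
    mixed_counts = Counter(p for s in mixed for p in pairs_in(s, motifs))
    control_counts = Counter(p for s in control for p in pairs_in(s, motifs))
    kept = set()
    for pair, n in mixed_counts.items():
        f_mixed = (n + 1) / (len(mixed) + 2)
        f_control = (control_counts[pair] + 1) / (len(control) + 2)
        if f_mixed / f_control >= min_ratio:
            kept.add(pair)
    return kept


def score(seq, pairs, motifs):
    """Score a sequence by the number of enriched pairs it contains."""
    return len(pairs_in(seq, motifs) & pairs)


class ScoreBayes:
    """Toy Bayesian classifier over a single discrete score (assumed design)."""

    LABELS = ("mixed", "control")

    def fit(self, mixed_scores, control_scores):
        self.counts = {"mixed": Counter(mixed_scores),
                       "control": Counter(control_scores)}
        self.totals = {k: sum(c.values()) for k, c in self.counts.items()}
        self.values = set(mixed_scores) | set(control_scores)
        return self

    def log_posterior(self, label, s):
        # Add-one smoothing on both the prior and the likelihood, so an
        # empty class or an unseen score never produces log(0).
        n = sum(self.totals.values())
        prior = (self.totals[label] + 1) / (n + len(self.LABELS))
        likelihood = (self.counts[label][s] + 1) / (
            self.totals[label] + len(self.values) + 1)
        return math.log(prior) + math.log(likelihood)

    def predict(self, s):
        return max(self.LABELS, key=lambda lab: self.log_posterior(lab, s))


# Made-up toy data, for illustration only.
motifs = ["ACGT", "TTGA", "GGCC"]
mixed = ["ACGTAATTGA", "TTGACCACGT", "ACGTTTGAGGCC", "AAAAAAA"]
control = ["ACGTAAAA", "GGCCTTGA", "ACGTGGCC", "TTTTTT"]

pairs = enriched_pairs(mixed, control, motifs, min_ratio=2.0)
# Candidates are sequences containing at least one enriched pair.
mixed_scores = [sc for sc in (score(s, pairs, motifs) for s in mixed) if sc > 0]
control_scores = [sc for sc in (score(s, pairs, motifs) for s in control) if sc > 0]
clf = ScoreBayes().fit(mixed_scores, control_scores)
```

In the workflow the caption describes, the validation step would reuse `pairs` and `clf` on held-out sequences, and the whole procedure would be repeated over parameter settings (here, only the hypothetical `min_ratio`) to pick the best-performing one before the final test.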

