From: A novel ensemble learning method for de novo computational identification of DNA binding sites
Program | # Nuisance Parameters | Motif Model | Search Strategy | Citation |
---|---|---|---|---|
Oligo analysis (RSAT) | 3 | cons | Exhaustive enumeration of short and bipartite oligos. Clusters overlapping motifs. Uses a binomial approximation to the hypergeometric score, similar to the overrepresentation objective function. | [14, 33, 34] |
Yeast Motif Finder (YMF) | 2 | cons | Exhaustive enumeration of short and bipartite oligos. Alphabet is {ACGTYR}. Uses a normal approximation to the hypergeometric score, similar to the overrepresentation objective function. | [35] |
AlignAce (AA) | 2 | PWM | Gibbs sampling to optimize a Maximum a Posteriori (MAP) score. | [36] |
MotifSampler (MS) | 3–5 | PWM | Gibbs sampling with higher order Markov model. | [37] |
BioProspector (Biopros) | 7 | PWM | Gibbs sampling with higher order Markov model. Designed for long and bipartite motifs common in prokaryotes. | [16, 38] |
MEME | 4 | PWM | Expectation Maximization over a modified information content. | [39] |
Improbizer (Imp) | 8 | PWM | Expectation Maximization. Uses 2nd order Markov model and optionally accounts for positional restrictions using a Gaussian model. | [40] |
MITRA | 1 | mis | Tree-based search for long bipartite motifs with many mismatches. Uses a hypergeometric score similar to the overrepresentation objective function. | [41] |
wConsensus (wCons) | 1–13 | PWM | Greedy enumeration to maximize information content. Infers motif length. | [42] |
Weeder | 4 | mis | Bounded enumeration using a suffix tree. Tries all motif lengths from 6 to 12. | [43] |
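
Several rows above score motifs with a hypergeometric overrepresentation statistic or an approximation to it. As a rough illustration only, the sketch below computes an exact hypergeometric upper-tail probability and the corresponding binomial approximation. It is not the scoring code of RSAT, YMF, or MITRA; the function names (`hypergeom_sf`, `binom_sf`) and the counts in the usage example are hypothetical and chosen purely for demonstration.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Exact P(X >= k) for a hypergeometric variable.

    N: total number of sequences in the background (assumed)
    K: background sequences containing the motif
    n: sequences in the input set
    k: input sequences containing the motif
    """
    upper = min(K, n)
    total = comb(N, n)
    # comb() returns 0 for impossible draws, so no extra bounds checks are needed.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1))
    return hits / total


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), used here as an approximation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(max(k, 0), n + 1))


if __name__ == "__main__":
    # Hypothetical counts: 6000 background sequences, 300 with the motif,
    # 20 input sequences, 8 of which contain the motif.
    N, K, n, k = 6000, 300, 20, 8
    exact = hypergeom_sf(k, N, K, n)
    approx = binom_sf(k, n, K / N)
    print(f"hypergeometric tail: {exact:.3e}")
    print(f"binomial approximation: {approx:.3e}")
```

When the input set is small relative to the background, sampling without replacement behaves much like sampling with replacement, which is why a binomial (or, for larger counts, a normal) approximation is a common substitute for the exact tail.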