Table 1 Structural motif discovery algorithms.

From: Classification and assessment tools for structural motif discovery algorithms

| Tool | Class | Website | Description |
| --- | --- | --- | --- |
| FOLDALIGN [17] | EN | [40] | Based on Sankoff's algorithm. It maximizes alignment similarity and the number of base pairs formed in two aligned sequences. |
| SLASH [20] | EN | NA | Uses FOLDALIGN to find local alignments in RNA sequences, then uses COVE [41] to build an SCFG model from the local alignments. |
| Mauri & Pavesi [22] | EN | NA | Uses affix trees to discover hairpins, bulges and internal loops in RNA. Substrings of a given length that appear in at least q sequences are found and expanded. |
| Seed [23] | EN | [42] | Uses suffix arrays to induce motifs from the seed sequence. Data structures are used to store the seed sequence, its reverse, and the input sequences. |
| comRNA [24] | EN | [43] | Uses an n-partite undirected weighted connectivity graph to represent stems and their similarity. The problem of finding motifs is mapped to finding a set of maximum cliques. A graph technique similar to topological sort is applied to find the best assemblies of stems. |
| RNAmine [25] | EN | [44] | Uses a graph mining algorithm to find conserved stems. |
| RNAGA [28] | HU | NA | A genetic algorithm is applied at different levels. It is first applied to each sequence to obtain a set of stable structures, and then applied again to that set of stable structures. |
| GPRM [29] | HU | NA | Uses genetic programming. It requires two sets of inputs: a positive set and a negative set. Individuals are evaluated by F-score using the two input sets. |
| GeRNAMo [30] | HU | NA | Applies genetic programming to the output of RNAsubopt. |
| CMfinder [32] | HU | [45] | Based on expectation maximization (EM) to simultaneously align and fold sequences using a covariance model of RNA motifs. |
| RNAProfile [34] | HU | [46] | Uses a heuristic to extract a set of candidate regions from each sequence. The second step groups regions to find similar motifs. |
| RNAPromo [33] | HU | [47] | The motif prediction algorithm initially looks for structural elements that are common to the input RNAs, and then employs an expectation maximization algorithm to refine the resulting probabilistic model. |

Class abbreviations: EN, enumerative; HU, heuristic.
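
As an illustration of the enumerative class, the quorum step described for Mauri & Pavesi [22] (substrings of a certain length appearing in at least q sequences) can be sketched as follows. This is only a minimal sketch under stated assumptions: it uses plain dictionary counting rather than affix trees, it ignores RNA structure (hairpins, bulges, internal loops), and it omits the expansion step. The function name `substrings_in_quorum` and the toy sequences are hypothetical and are not taken from the tool.

```python
from collections import defaultdict


def substrings_in_quorum(sequences, length, q):
    """Return substrings of the given length that occur in at least q sequences.

    Hypothetical helper for illustration: a dictionary stands in for the
    affix tree used by the real method, and no expansion is performed.
    """
    support = defaultdict(set)  # substring -> indices of sequences containing it
    for index, seq in enumerate(sequences):
        for start in range(len(seq) - length + 1):
            support[seq[start:start + length]].add(index)
    # Keep only substrings that meet the quorum q.
    return {sub: ids for sub, ids in support.items() if len(ids) >= q}


# Toy input, invented for this example.
rnas = ["GGGAAACCC", "AGGGAUACC", "UUGGGAAAC"]
hits = substrings_in_quorum(rnas, length=4, q=2)
for sub in sorted(hits):
    print(sub, sorted(hits[sub]))
```

Counting each substring once per sequence (a set of sequence indices rather than a raw occurrence count) is what makes q a quorum over sequences instead of a total frequency threshold.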