Figure 5 | BMC Bioinformatics

Figure 5

From: Lost in folding space? Comparing four variants of the thermodynamic model for RNA secondary structure prediction

"MacroState" grammar. The color code is identical to Figure 2. The basic structure of the "MacroState" grammar is inherited from the previous three grammars, but it distinguishes more cases for dangling bases. "MacroState" has to consider all the dangling situations that "MicroState" does, but its search space is restricted to the k(n)-times smaller folding space of the input sequence. To reconcile these conflicting goals, dangling alternatives do not exist as search space candidates but are examined implicitly within the evaluation algebra. The grammar has to ensure that a substructure is of a defined dangling type whenever its energy or partition function value is used in an algebra evaluation function. We know that any helix derived from nodg has no unpaired bases to its left or right, while helices from edgl, edgr or edglr have exactly one unpaired base dangling from the left, exactly one from the right, or exactly one from each side (two in total), respectively. In all four cases, no unpaired base is left for a further dangling. Care must be taken where we cannot be sure whether, e.g., the leftmost unpaired base of a block_dl derivation is free to dangle onto some helix to its left. The unpaired base would be available for a dangling if we use ssadd, but is occupied in incl situations. This uncertainty is passed to every calling function, but with a clever grammar design we can at least ensure that its type does not change. For example, every mc1 or mcadd2 derivation contains one or more helices with one or more unpaired bases at its 5' end and definitely no unpaired base at its 3' end. Furthermore, mc2 and mcadd1 never have unpaired bases at either end, mc3 and mcadd4 have one or more unpaired bases only at the 3' end, and finally mc4 and mcadd3 are known to have one or more unpaired bases at both ends. The benefit of these distinctions can be demonstrated with the multiloop functions mldl and mladl.
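The four flanking classes described above can be made concrete with a small Python sketch. It is not part of the grammar: the function name `flank_type` and the idea of classifying a dot-bracket fragment after the fact are assumptions made purely for illustration, and whatever else separates an mc nonterminal from its mcadd counterpart is not modeled.

```python
def flank_type(component: str) -> str:
    """Classify a multiloop component (dot-bracket string) by its unpaired ends.

    Hypothetical helper: it only mirrors the 5'/3' distinction described in
    the caption and ignores every other property of the nonterminals.
    """
    if not component or set(component) - set("()."):
        raise ValueError("expected a non-empty dot-bracket string")
    if "(" not in component:
        raise ValueError("component must contain at least one helix")

    unpaired_5p = component[0] == "."   # one or more unpaired bases at the 5' end
    unpaired_3p = component[-1] == "."  # one or more unpaired bases at the 3' end

    if unpaired_5p and not unpaired_3p:
        return "mc1/mcadd2"
    if not unpaired_5p and not unpaired_3p:
        return "mc2/mcadd1"
    if not unpaired_5p and unpaired_3p:
        return "mc3/mcadd4"
    return "mc4/mcadd3"


if __name__ == "__main__":
    for fragment in ["..((...))", "((...))", "((...)).", ".((...))."]:
        print(fragment, "->", flank_type(fragment))
```

Because the class is fixed by the nonterminal, a calling function knows from the grammar alone whether a neighbouring base could still dangle onto the component's outermost helix.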
The important base is the one directly to the left of the mc1 or mc2 substructure. In principle, it can dangle either to the left, that is, onto the closing stem of the multiloop, or to the right, that is, onto the leftmost helix within the multiloop. In fact, for mldl our base of interest can only dangle to the left, because every mc1 derivation already has at least one further base in front of the first inner helix. For mladl we truly have an ambiguous situation, where the base of interest could dangle to either side. Please note that mldl and mladl correspond to two different dot-bracket structures. mldl handles macrostates of the type "((..." including microstates "((..." and "((d..", whereas mladl handles macrostates of type "((.((..." and includes the microstates "((.((...", "((d((...", and "((b((...". The mfe algebra function locally chooses the variant with the better free energy, even if a global analysis would reveal that the locally worse variant leads to the MFE structure in the end. This constitutes a rare case where the MFE structure may be missed. Our partition function algebra correctly keeps track of these situations.
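The contrast between the two algebras in the ambiguous mladl case can be illustrated with a minimal Python sketch. Everything in it is an assumption: the energy values are invented, the names `mfe_choice` and `pf_contribution` are hypothetical, and a plain sum of Boltzmann weights stands in for whatever bookkeeping the actual partition function algebra performs.

```python
import math

R = 0.0019872          # gas constant in kcal/(mol*K)
T = 310.15             # 37 degrees Celsius in Kelvin
RT = R * T

# Invented free energies (kcal/mol) for the two ways the ambiguous base
# in an mladl situation could dangle. These are NOT measured parameters.
variants = {
    "dangle_onto_closing_stem": -0.3,
    "dangle_onto_inner_helix": -0.5,
}


def mfe_choice(energies: dict) -> float:
    """Keep only the locally best energy; the other variant is discarded."""
    return min(energies.values())


def pf_contribution(energies: dict) -> float:
    """Sum Boltzmann weights so that every variant still contributes."""
    return sum(math.exp(-e / RT) for e in energies.values())


if __name__ == "__main__":
    print("mfe keeps:", mfe_choice(variants))
    print("partition function contribution:", pf_contribution(variants))
```

The minimum is taken immediately, so information about the discarded variant is no longer available to later evaluation steps, whereas a sum retains a contribution from both alternatives.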
