
Table 5. Log-likelihoods of alignments, and log-posteriors of alignment annotations, for the training and testing datasets under various EM convergence regimes in the PFOLD benchmark. The "mininc" parameter is the minimal fractional increase in the log-likelihood that our EM implementation counts as an improvement. The "forgive" parameter is the number of EM iterations without such an improvement that are tolerated before the algorithm terminates. The default settings are mininc = 1e-3, forgive = 0. Here D denotes the alignment data, A denotes the RFAM secondary structure annotations of the alignment data, and θ denotes the model with parameters optimized on the training set using the specified EM convergence criteria.

From: XRate: a fast prototyping, training and annotation tool for phylo-grammars

| Dataset | "mininc" | "forgive" | log2 P(D, A \| θ) | log2 P(D \| θ) | log2 P(A \| D, θ) |
|---|---|---|---|---|---|
| Training set | 1e-3 | 0 | -466330.6649 | -453589.9251 | -12740.7398 |
| Training set | 1e-4 | 0 | -465397.0642 | -453403.7081 | -11993.3561 |
| Training set | 1e-5 | 0 | -465397.0642 | -453403.7081 | -11993.3561 |
| Training set | 1e-3 | 2 | -465821.5239 | -453476.0389 | -12345.4850 |
| Training set | 1e-3 | 4 | -465565.9224 | -453437.5353 | -12128.3871 |
| Training set | 1e-3 | 6 | -465397.0642 | -453403.7081 | -11993.3561 |
| Training set | 1e-3 | 8 | -465291.1983 | -453356.6841 | -11934.5142 |
| Training set | 1e-4 | 4 | -465147.9174 | -453318.4543 | -11829.4631 |
| Training set | 1e-4 | 10 | -465010.8431 | -453209.0744 | -11801.7687 |
| Test set | 1e-3 | 0 | -360472.7960 | -343832.6014 | -16640.1946 |
| Test set | 1e-4 | 0 | -360190.7940 | -344117.5123 | -16073.2817 |
| Test set | 1e-5 | 0 | -360190.7940 | -344117.5123 | -16073.2817 |
| Test set | 1e-3 | 2 | -360148.9090 | -343841.2775 | -16307.6315 |
| Test set | 1e-3 | 4 | -360178.4500 | -344016.2558 | -16162.1942 |
| Test set | 1e-3 | 6 | -360190.7940 | -344117.5123 | -16073.2817 |
| Test set | 1e-3 | 8 | -360092.2930 | -344078.8868 | -16013.4062 |
| Test set | 1e-4 | 4 | -360057.4880 | -344116.5923 | -15940.8957 |
| Test set | 1e-4 | 10 | -360108.0100 | -344166.2108 | -15941.7992 |
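
The stopping rule described in the caption can be illustrated with a minimal Python sketch. This is not XRate's code: the function name `em_stop`, the choice to measure the fractional increase against the best log-likelihood seen so far, and the choice to reset the miss counter after each improvement are all assumptions made for illustration. The input is a hypothetical trace of log-likelihoods, starting with the value for the initial parameters and followed by one value per EM iteration.

```python
from typing import Iterable, Tuple


def em_stop(loglikes: Iterable[float],
            mininc: float = 1e-3,
            forgive: int = 0) -> Tuple[int, float]:
    """Apply a mininc/forgive-style stopping rule to a log-likelihood trace.

    Returns (number of EM iterations consumed, best log-likelihood seen).
    Assumes the log-likelihoods are nonzero, as is typical for real data.
    """
    it = iter(loglikes)
    best = next(it)      # log-likelihood of the starting parameters
    misses = 0           # iterations since the last improvement (assumed to reset)
    n_iter = 0
    for ll in it:
        n_iter += 1
        # Fractional increase relative to the best value so far (an assumption).
        frac_inc = (ll - best) / abs(best)
        if frac_inc >= mininc:
            best = ll
            misses = 0
        else:
            misses += 1
            if misses > forgive:
                break    # more non-improving iterations than "forgive" allows
    return n_iter, best


# Hypothetical trace: a plateau after the first iteration, then a late gain.
trace = [-1000.0, -900.0, -899.5, -899.4, -880.0, -879.9]
print(em_stop(trace, mininc=1e-3, forgive=0))  # (2, -900.0)
print(em_stop(trace, mininc=1e-3, forgive=2))  # (5, -880.0)
```

With forgive = 0 the run stops at the first plateau. With forgive = 2 it continues past the plateau and reaches the later gain. This shows how a larger "forgive" setting can lead to a different final log-likelihood.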