Table 5 Log-likelihoods of alignments, and log-posteriors of alignment annotations, for training and testing datasets under various EM convergence regimes in the PFOLD benchmark. The "mininc" parameter is the minimal fractional increase in the log-likelihood that is considered by our EM implementation to be an improvement, while the "forgive" parameter is the number of iterations of EM without such an improvement that will be tolerated before the algorithm terminates. The default settings are mininc = 1e-3, forgive = 0. Here D denotes the alignment data, A denotes the RFAM secondary structure annotations of the alignment data, and θ denotes the model with parameters optimized for the training set using the specified EM convergence criteria.
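The stopping rule described in the caption can be sketched as follows. This is a minimal illustration, not XRate's actual implementation: the function name `em_stopping_trace`, the choice to measure the fractional increase relative to the last accepted log-likelihood, and the choice to count only consecutive non-improving iterations against "forgive" are all assumptions made for the example.

```python
from typing import Iterable, List


def em_stopping_trace(loglikes: Iterable[float],
                      mininc: float = 1e-3,
                      forgive: int = 0) -> List[float]:
    """Return the log-likelihoods visited before the stopping rule fires.

    Hypothetical sketch: `loglikes` stands in for the log-likelihood
    reported after each EM iteration.
    """
    visited: List[float] = []
    best = None   # last log-likelihood accepted as an improvement
    misses = 0    # consecutive iterations without an improvement
    for ll in loglikes:
        visited.append(ll)
        if best is None:
            best = ll          # first iteration only sets the baseline
            continue
        # Fractional increase relative to the accepted baseline
        # (assumed definition; falls back to 1.0 if the baseline is 0).
        denom = abs(best) if best != 0 else 1.0
        if (ll - best) / denom > mininc:
            best = ll
            misses = 0
        else:
            misses += 1
            if misses > forgive:   # tolerance exhausted: terminate
                break
    return visited
```

Under these assumptions, a smaller "mininc" accepts smaller gains as improvements, and a larger "forgive" lets the procedure continue through a plateau, which is consistent with the longer-running regimes in the table reaching higher training-set log-likelihoods.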

From: XRate: a fast prototyping, training and annotation tool for phylo-grammars

Dataset       "mininc"  "forgive"  log2 P(D, A|θ)  log2 P(D|θ)   log2 P(A|D, θ)
Training set  1e-3      0          -466330.6649    -453589.9251  -12740.7398
Training set  1e-4      0          -465397.0642    -453403.7081  -11993.3561
Training set  1e-5      0          -465397.0642    -453403.7081  -11993.3561
Training set  1e-3      2          -465821.5239    -453476.0389  -12345.4850
Training set  1e-3      4          -465565.9224    -453437.5353  -12128.3871
Training set  1e-3      6          -465397.0642    -453403.7081  -11993.3561
Training set  1e-3      8          -465291.1983    -453356.6841  -11934.5142
Training set  1e-4      4          -465147.9174    -453318.4543  -11829.4631
Training set  1e-4      10         -465010.8431    -453209.0744  -11801.7687
Test set      1e-3      0          -360472.7960    -343832.6014  -16640.1946
Test set      1e-4      0          -360190.7940    -344117.5123  -16073.2817
Test set      1e-5      0          -360190.7940    -344117.5123  -16073.2817
Test set      1e-3      2          -360148.9090    -343841.2775  -16307.6315
Test set      1e-3      4          -360178.4500    -344016.2558  -16162.1942
Test set      1e-3      6          -360190.7940    -344117.5123  -16073.2817
Test set      1e-3      8          -360092.2930    -344078.8868  -16013.4062
Test set      1e-4      4          -360057.4880    -344116.5923  -15940.8957
Test set      1e-4      10         -360108.0100    -344166.2108  -15941.7992
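The three log-probability columns are linked by the chain rule, P(D, A|θ) = P(D|θ) P(A|D, θ), so in each row the first column should equal the sum of the other two. The short check below illustrates this on four rows copied from the table (the default regime and the mininc = 1e-4, forgive = 10 regime for each dataset). The variable and function names are hypothetical and chosen only for this example.

```python
# Rows copied from the table:
# (dataset, mininc, forgive, log2 P(D,A|θ), log2 P(D|θ), log2 P(A|D,θ))
rows = [
    ("Training set", "1e-3", 0,  -466330.6649, -453589.9251, -12740.7398),
    ("Training set", "1e-4", 10, -465010.8431, -453209.0744, -11801.7687),
    ("Test set",     "1e-3", 0,  -360472.7960, -343832.6014, -16640.1946),
    ("Test set",     "1e-4", 10, -360108.0100, -344166.2108, -15941.7992),
]


def chain_rule_residual(joint: float, marginal: float, posterior: float) -> float:
    """Difference between log2 P(D,A|θ) and log2 P(D|θ) + log2 P(A|D,θ)."""
    return joint - (marginal + posterior)


# Largest absolute deviation from the chain-rule identity over the rows.
max_residual = max(abs(chain_rule_residual(j, m, p)) for _, _, _, j, m, p in rows)


def posterior_gain_bits(default_row, tuned_row) -> float:
    """Increase in log2 P(A|D,θ), in bits, from the default regime to a tuned one."""
    return tuned_row[5] - default_row[5]


training_gain = posterior_gain_bits(rows[0], rows[1])
test_gain = posterior_gain_bits(rows[2], rows[3])

print(f"max chain-rule residual: {max_residual:.6f}")
print(f"training-set posterior gain (bits): {training_gain:.4f}")
print(f"test-set posterior gain (bits): {test_gain:.4f}")
```

The residual is zero up to floating-point rounding for these rows, and the annotation log-posterior log2 P(A|D, θ) is higher under the stricter convergence regime for both the training and the test set.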