
Table 5 The method of sampling the posterior distribution of the MCMC chain by averaging random accepted models from the steady state was compared to the method of selecting the model with the overall maximum log likelihood.

From: Unsupervised statistical clustering of environmental shotgun sequences

| Org 1 | Org 2 | Frag L | Sampling type | D3 (order 3 model) | Accuracy (order 3 model) | LL (order 3 model) | D4 (order 4 model) | Accuracy (order 4 model) | LL (order 4 model) |
|---|---|---|---|---|---|---|---|---|---|
| Arthrobacter aurescens TC1 vs. Sinorhizobium meliloti 1021 | | | | | | | | | |
| NC_003047 | NC_008711 | 400 | Steady state sampled | 1.08 | 0.95 | -1054490.36 | 1.09 | 0.94 | -1040007.41 |
| | | 400 | Maximum log likelihood | 1.02 | 0.94 | -1055584.16 | | | |
| | | 1000 | Steady state sampled | 1.95 | 0.97 | -2648159.80 | 2.52 | 0.99 | -2637429.69 |
| | | 1000 | Maximum log likelihood | 2.12 | 0.98 | -2645204.57 | | | |
| Lactococcus lactis subsp. cremoris MG1363 vs. Francisella tularensis subsp. holarctica FTA | | | | | | | | | |
| NC_009004 | NC_009749 | 400 | Steady state sampled | 1.08 | 0.90 | -1045063.72 | 1.33 | 0.95 | -1040811.10 |
| | | 400 | Maximum log likelihood | 1.15 | 0.92 | -1047966.99 | | | |
| | | 1000 | Steady state sampled | 2.02 | 0.96 | -2624742.76 | 2.22 | 0.97 | -2615376.71 |
| | | 1000 | Maximum log likelihood | 2.19 | 0.96 | -2626080.18 | | | |
| Helicobacter pylori HPAG1 vs. Streptococcus pneumoniae R6 | | | | | | | | | |
| NC_003098 | NC_008086 | 400 | Steady state sampled | 0.93 | 0.96 | -1059955.55 | 1.18 | 0.93 | -1045561.25 |
| | | 400 | Maximum log likelihood | 0.97 | 0.96 | -1061298.85 | | | |
| | | 1000 | Steady state sampled | 1.71 | 0.99 | -2656860.50 | 2.28 | 0.99 | -2634722.55 |
| | | 1000 | Maximum log likelihood | 1.69 | 0.98 | -2658488.27 | | | |
| Staphylococcus aureus RF122 vs. Prochlorococcus marinus str. NATL2A | | | | | | | | | |
| NC_007335 | NC_007622 | 400 | Steady state sampled | 0.99 | 0.90 | -1049716.33 | 1.00 | 0.95 | -1045188.54 |
| | | 400 | Maximum log likelihood | 0.99 | 0.93 | -1050316.80 | | | |
| | | 1000 | Steady state sampled | 1.92 | 0.97 | -2636903.64 | 2.21 | 0.97 | -2624299.41 |
| | | 1000 | Maximum log likelihood | 1.75 | 0.97 | -2636046.52 | | | |
| Staphylococcus aureus subsp. aureus COL vs. Methanocaldococcus jannaschii DSM 2661 | | | | | | | | | |
| NC_000909 | NC_002951 | 400 | Steady state sampled | 0.96 | 0.95 | -1037936.55 | 1.05 | 0.89 | -1033285.36 |
| | | 400 | Maximum log likelihood | 0.92 | 0.94 | -1037505.67 | | | |
| | | 1000 | Steady state sampled | 1.84 | 0.98 | | 2.36 | 0.99 | -2581181.80 |
| | | 1000 | Maximum log likelihood | 1.94 | 0.98 | -2601394.32 | | | |
  1. Frag L, Fragment length; LL, Output model log likelihood
  2. The resulting accuracy differences were negligible. Accuracy was also compared in 3-mer models vs. 4-mer models. While 4-mer models slightly outperformed 3-mer models on average, a significant run time increase was observed (not shown). NC_ identifiers refer to GenBank accession numbers for genomes listed in each trial.
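
The two ways of summarizing the MCMC chain compared in this table can be illustrated with a minimal sketch. It is not the authors' implementation: the chain layout (a list of accepted models, each with a `params` vector and a `log_lik` value), the `burn_in` cutoff used to mark the start of the steady state, and all function names are assumptions made for illustration only.

```python
import random

def steady_state_average(chain, burn_in, n_draws, seed=0):
    """Average n_draws accepted models drawn at random from the steady state.

    Assumption: everything after index burn_in is treated as steady state.
    """
    rng = random.Random(seed)
    steady = chain[burn_in:]
    if not steady:
        raise ValueError("burn_in leaves no steady-state samples")
    picks = [rng.choice(steady) for _ in range(n_draws)]
    dim = len(picks[0]["params"])
    # Element-wise mean of the sampled parameter vectors.
    return [sum(p["params"][i] for p in picks) / n_draws for i in range(dim)]

def max_log_likelihood(chain):
    """Return the parameters of the single model with the highest log likelihood."""
    return max(chain, key=lambda m: m["log_lik"])["params"]

# Toy chain of accepted models (hypothetical values, not taken from the table).
chain = [
    {"params": [0.9, 0.1], "log_lik": -120.0},  # early, pre-steady-state model
    {"params": [0.6, 0.4], "log_lik": -101.0},
    {"params": [0.5, 0.5], "log_lik": -100.0},
    {"params": [0.4, 0.6], "log_lik": -100.5},
]

avg = steady_state_average(chain, burn_in=1, n_draws=200)
best = max_log_likelihood(chain)
print("steady state sampled:", avg)
print("maximum log likelihood:", best)
```

In this toy setting the averaged model lies between the steady-state samples while the maximum log likelihood model is a single point from the chain, which is the distinction the "Sampling type" column draws.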