range | ML μ | PP μ | ML σ | PP σ | ML FC | PP FC | ML # | PP # |
---|---|---|---|---|---|---|---|---|
0.0-0.1 | - | - | - | - | - | - | 0 | 0 |
0.1-0.2 | 3.57 | 3.78 | 3.09 | 3.27 | 0.07 | 0.03 | 4149 | 2312 |
0.2-0.3 | 2.97 | 3.19 | 3.04 | 3.06 | 0.16 | 0.11 | 15123 | 9018 |
0.3-0.4 | 2.39 | 2.76 | 3.00 | 3.07 | 0.26 | 0.17 | 22696 | 18373 |
0.4-0.5 | 2.25 | 2.29 | 3.11 | 2.98 | 0.32 | 0.24 | 20120 | 23022 |
0.5-0.6 | 2.14 | 2.11 | 3.09 | 3.01 | 0.36 | 0.32 | 17228 | 20090 |
0.6-0.7 | 1.94 | 1.95 | 3.04 | 2.99 | 0.42 | 0.38 | 14113 | 16223 |
0.7-0.8 | 1.86 | 1.85 | 3.05 | 3.01 | 0.47 | 0.44 | 13527 | 14879 |
0.8-0.9 | 1.62 | 1.65 | 2.97 | 2.97 | 0.55 | 0.52 | 14850 | 15747 |
0.9-1.0 | 0.32 | 0.32 | 1.54 | 1.53 | 0.92 | 0.92 | 163815 | 165957 |
- Error analysis for the COG simulation with the error metric described in the text. As in Figure 6, simulated reads had normally distributed lengths with a mean of 85 amino acids and a standard deviation of 20. This table pools the results by confidence score range and shows the mean (μ) and standard deviation (σ) of the error, the fraction placed correctly (FC), and the number of reads placed (#) for pplacer run in maximum likelihood (ML) and posterior probability (PP) modes. For example, the "ML" columns in the row labeled 0.4-0.5 show error statistics for all of the reads in the simulation that had a likelihood weight ratio between 0.4 and 0.5: there were 20120 such reads, of which 32% were placed correctly, and the corresponding error mean and standard deviation were 2.25 and 3.11, respectively. This table demonstrates the effectiveness of the confidence scores: as the confidence scores increase, the mean error decreases and the fraction placed correctly rises. We note that the ML and PP methods have very comparable performance for this length of read, and thus the quickly calculated ML weight ratio can act as a proxy for the more statistically rigorous posterior probability calculation.
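The pooling described in the caption can be illustrated with a minimal Python sketch. It is not pplacer's own code: the function name `pool_by_confidence`, the input format of `(confidence, error)` pairs, the use of the population standard deviation, and the rule that a read counts as "placed correctly" when its error is exactly zero are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def pool_by_confidence(reads, n_bins=10):
    """Pool per-read errors into equal-width confidence bins.

    reads: iterable of (confidence, error) pairs, confidence in [0, 1].
    Returns one row per bin: (bin_index, mean, std, fraction_correct, count).
    Hypothetical helper; "correct" is assumed to mean error == 0.
    """
    bins = defaultdict(list)
    for conf, err in reads:
        # Clamp so that a confidence of exactly 1.0 lands in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append(err)

    rows = []
    for idx in range(n_bins):
        errs = bins.get(idx, [])
        n = len(errs)
        if n == 0:
            # Empty bin: statistics are undefined, as in the 0.0-0.1 row.
            rows.append((idx, None, None, None, 0))
            continue
        mean = sum(errs) / n
        # Population variance (divide by n); an assumption, not from the source.
        var = sum((e - mean) ** 2 for e in errs) / n
        frac_correct = sum(1 for e in errs if e == 0) / n
        rows.append((idx, mean, math.sqrt(var), frac_correct, n))
    return rows


if __name__ == "__main__":
    # Toy data, not taken from the simulation.
    toy = [(0.45, 0), (0.42, 2), (0.95, 0), (1.0, 0)]
    for row in pool_by_confidence(toy):
        print(row)
```

Running the same function once on ML weight ratios and once on posterior probabilities would give the two sets of columns that the table compares side by side.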