
Table 1 Comparison of the classification accuracies using the simulated dataset

From: A Bayesian taxonomic classification method for 16S rRNA gene sequences with improved species-level accuracy

CST = 0.8

| Taxonomic level | Classifier | V2 | V4 | V1V3 | V3V5 | V6V9 |
|---|---|---|---|---|---|---|
| Species | BLCA | 0.7594 ± 0.0164* | 0.5331 ± 0.0208 | 0.9323 ± 0.0054* | 0.8335 ± 0.0072* | 0.8690 ± 0.0012* |
| Species | Kraken | 0.7275 ± 0.0054 | 0.5326 ± 0.0181 | 0.8672 ± 0.0072 | 0.7542 ± 0.0087 | 0.7572 ± 0.0056 |
| Species | MEGAN | 0.7290 ± 0.0114 | 0.5238 ± 0.0161 | 0.7071 ± 0.0053 | 0.5206 ± 0.0108 | 0.5227 ± 0.0140 |
| Species | RDP | 0.6102 ± 0.0042 | 0.3928 ± 0.0292 | 0.8549 ± 0.0199 | 0.7307 ± 0.0203 | 0.7823 ± 0.0124 |
| Species | SPINGO | 0.5700 ± 0.0187 | 0.3910 ± 0.0106 | 0.7907 ± 0.0061 | 0.6900 ± 0.0071 | 0.7318 ± 0.0116 |
| Genus | BLCA | 0.9498 ± 0.0019* | 0.8982 ± 0.0107* | 0.9965 ± 0.0012* | 0.9863 ± 0.0011* | 0.9925 ± 0.0012* |
| Genus | Kraken | 0.9072 ± 0.0066 | 0.8612 ± 0.0189 | 0.9691 ± 0.0051 | 0.9463 ± 0.0006 | 0.9437 ± 0.0034 |
| Genus | MEGAN | 0.9334 ± 0.0079 | 0.8830 ± 0.0115 | 0.9528 ± 0.0040 | 0.9002 ± 0.0027 | 0.8939 ± 0.0041 |
| Genus | RDP | 0.8768 ± 0.0065 | 0.8067 ± 0.0139 | 0.9629 ± 0.0072 | 0.9562 ± 0.0065 | 0.9657 ± 0.0042 |
| Genus | SPINGO | 0.8481 ± 0.0002 | 0.7726 ± 0.0077 | 0.9333 ± 0.0057 | 0.9192 ± 0.0034 | 0.9238 ± 0.0067 |
| Family | BLCA | 0.9791 ± 0.0009* | 0.9787 ± 0.0018* | 0.9984 ± 0.0019* | 0.9975 ± 0.0019* | 0.9970 ± 0.0014* |
| Family | Kraken | 0.9594 ± 0.0038 | 0.9480 ± 0.0028 | 0.9882 ± 0.0021 | 0.9850 ± 0.0033 | 0.9799 ± 0.0032 |
| Family | MEGAN | 0.9495 ± 0.0089 | 0.9413 ± 0.0015 | 0.9517 ± 0.0032 | 0.9397 ± 0.0044 | 0.9447 ± 0.0034 |
| Family | RDP | 0.9461 ± 0.0093 | 0.9295 ± 0.0062 | 0.9818 ± 0.0007 | 0.9806 ± 0.0054 | 0.9855 ± 0.0013 |
| Family | SPINGO | NA | NA | NA | NA | NA |

CST = 0.5

| Taxonomic level | Classifier | V2 | V4 | V1V3 | V3V5 | V6V9 |
|---|---|---|---|---|---|---|
| Species | BLCA | 0.8485 ± 0.0128* | 0.6813 ± 0.0115* | 0.9629 ± 0.0077* | 0.9050 ± 0.0034* | 0.9315 ± 0.0045* |
| Species | Kraken | 0.7275 ± 0.0054 | 0.5326 ± 0.0181 | 0.8672 ± 0.0072 | 0.7542 ± 0.0087 | 0.7572 ± 0.0056 |
| Species | MEGAN | 0.7290 ± 0.0114 | 0.5238 ± 0.0161 | 0.7071 ± 0.0053 | 0.5206 ± 0.0108 | 0.5227 ± 0.0140 |
| Species | RDP | 0.7526 ± 0.0107 | 0.5692 ± 0.0194 | 0.8997 ± 0.0144 | 0.8221 ± 0.0105 | 0.8621 ± 0.0094 |
| Species | SPINGO | 0.6570 ± 0.0124 | 0.5008 ± 0.0114 | 0.8256 ± 0.0038 | 0.7497 ± 0.0041 | 0.7805 ± 0.0021 |
| Genus | BLCA | 0.9722 ± 0.0028* | 0.9467 ± 0.0031* | 0.9985 ± 0.0019* | 0.9947 ± 0.0013* | 0.9972 ± 0.0002* |
| Genus | Kraken | 0.9072 ± 0.0066 | 0.8612 ± 0.0189 | 0.9691 ± 0.0051 | 0.9463 ± 0.0006 | 0.9437 ± 0.0034 |
| Genus | MEGAN | 0.9334 ± 0.0079 | 0.8830 ± 0.0115 | 0.9528 ± 0.0040 | 0.9002 ± 0.0027 | 0.8939 ± 0.0041 |
| Genus | RDP | 0.9319 ± 0.0044 | 0.8960 ± 0.0086 | 0.9710 ± 0.0049 | 0.9693 ± 0.0046 | 0.9729 ± 0.0003 |
| Genus | SPINGO | 0.8807 ± 0.0034 | 0.8354 ± 0.0041 | 0.9400 ± 0.0030 | 0.9287 ± 0.0024 | 0.9317 ± 0.0083 |
| Family | BLCA | 0.9870 ± 0.0013* | 0.9856 ± 0.0035* | 0.9987 ± 0.0021* | 0.9991 ± 0.0012* | 0.9984 ± 0.0019* |
| Family | Kraken | 0.9594 ± 0.0038 | 0.9480 ± 0.0028 | 0.9882 ± 0.0021 | 0.9850 ± 0.0033 | 0.9799 ± 0.0032 |
| Family | MEGAN | 0.9495 ± 0.0089 | 0.9413 ± 0.0015 | 0.9517 ± 0.0032 | 0.9397 ± 0.0044 | 0.9447 ± 0.0034 |
| Family | RDP | 0.9696 ± 0.0040 | 0.9674 ± 0.0015 | 0.9836 ± 0.0017 | 0.9830 ± 0.0033 | 0.9868 ± 0.0004 |
| Family | SPINGO | NA | NA | NA | NA | NA |

Each entry in the table shows the mean and standard deviation of the F-scores for a particular classifier (rows) at a specific 16S region (columns), based on three random sets of 1000 test sequences. Two confidence score thresholds (CST), 0.8 and 0.5, were applied for BLCA, RDP Classifier, and SPINGO, as described in the main text. An asterisk (*) indicates that the F-scores of BLCA are significantly higher than those of the other software, based on a one-tailed paired t-test with a p-value less than 0.05. Similar statistical significance was also obtained using the one-tailed Wilcoxon signed-rank test. Note that SPINGO does not produce family-level classifications (shown as NA). In addition, Kraken and MEGAN do not provide any probability-based parameters for evaluating the assigned taxa, so their default taxonomic assignments were used for comparison; their values are therefore identical under both CST settings.
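
As a rough illustration of how an entry and its asterisk could be derived, the sketch below computes a mean ± standard deviation over three replicate F-scores and a one-tailed paired t-test between two classifiers. It rests on several assumptions not confirmed by the table: that the standard deviation is the sample (n − 1) form, that the test pairs the three replicate sets, and that the per-replicate values shown (which are hypothetical, not taken from the paper) resemble the real ones. The function names are illustrative only. With three pairs the t distribution has 2 degrees of freedom, for which the CDF has the closed form F(t) = 1/2 + t / (2·sqrt(2 + t²)), so no external statistics library is needed.

```python
import math
import statistics


def summarize(scores):
    """Return (mean, sample standard deviation) of replicate F-scores."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_t_one_tailed_3(a, b):
    """One-tailed paired t-test (H1: mean(a) > mean(b)) for exactly 3 pairs.

    Returns (t statistic, p-value). The p-value uses the closed-form CDF of
    Student's t with 2 degrees of freedom, so it is valid only for 3 pairs.
    """
    if len(a) != 3 or len(b) != 3:
        raise ValueError("closed-form p-value is only valid for 3 pairs (df = 2)")
    diffs = [x - y for x, y in zip(a, b)]
    sd = statistics.stdev(diffs)
    if sd == 0:
        raise ValueError("differences are identical; t statistic is undefined")
    t = statistics.mean(diffs) / (sd / math.sqrt(3))
    # P(T > t) for df = 2: 1 - (1/2 + t / (2 * sqrt(2 + t^2)))
    p = 0.5 - t / (2 * math.sqrt(2 + t * t))
    return t, p


# Hypothetical per-replicate F-scores (NOT values from the paper).
tool_a = [0.93, 0.94, 0.92]
tool_b = [0.86, 0.88, 0.87]

mean_a, sd_a = summarize(tool_a)
t_stat, p_value = paired_t_one_tailed_3(tool_a, tool_b)

print(f"tool_a: {mean_a:.4f} ± {sd_a:.4f}")
print(f"t = {t_stat:.3f}, one-tailed p = {p_value:.4f}")
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

For more than three replicates, a general t distribution CDF would be needed in place of the df = 2 shortcut, and a Wilcoxon signed-rank test with only three pairs has very limited resolution, so this sketch is best read as a guide to the table's notation.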