BMC Bioinformatics

Table 2 BLCA accuracy is insensitive to the inclusion of dissimilar BLAST hits

From: A Bayesian taxonomic classification method for 16S rRNA gene sequences with improved species-level accuracy

Taxonomic levels		Genus		Species
16S region	topPercent Filter	BLCA	MEGAN	BLCA	MEGAN
V2	5%	0.9539 ± 0.0038	0.9531 ± 0.0044	0.7747 ± 0.0150	0.8091 ± 0.0153
	10%	0.9498 ± 0.0019	0.9334 ± 0.0079	0.7594 ± 0.0164	0.7290 ± 0.0114
	20%	0.9487 ± 0.0018	0.8966 ± 0.0080	0.7580 ± 0.0176	0.5983 ± 0.0075
V4	5%	0.9078 ± 0.0078	0.9230 ± 0.0082	0.5597 ± 0.0175	0.6497 ± 0.0058
	10%	0.8982 ± 0.0107	0.8830 ± 0.0115	0.5331 ± 0.0208	0.5238 ± 0.0161
	20%	0.8965 ± 0.0092	0.8016 ± 0.0041	0.5317 ± 0.0189	0.3915 ± 0.0119
V1V3	5%	0.9960 ± 0.0009	0.9778 ± 0.0006	0.9314 ± 0.0058	0.8394 ± 0.0069
	10%	0.9965 ± 0.0012	0.9528 ± 0.004	0.9323 ± 0.0054	0.7071 ± 0.0053
	20%	0.9959 ± 0.0009	0.8609 ± 0.0087	0.9321 ± 0.0053	0.4673 ± 0.0150
V3V5	5%	0.9865 ± 0.0020	0.9550 ± 0.0041	0.8380 ± 0.0064	0.7025 ± 0.0112
	10%	0.9863 ± 0.0011	0.9002 ± 0.0027	0.8335 ± 0.0072	0.5206 ± 0.0108
	20%	0.9863 ± 0.0011	0.7369 ± 0.0094	0.8361 ± 0.0039	0.2880 ± 0.0061
V6V9	5%	0.9933 ± 0.0011	0.9532 ± 0.0050	0.8722 ± 0.0066	0.7258 ± 0.0129
	10%	0.9925 ± 0.0012	0.8939 ± 0.0041	0.8690 ± 0.0012	0.5227 ± 0.0140
	20%	0.9931 ± 0.0017	0.7138 ± 0.0083	0.8701 ± 0.0050	0.2691 ± 0.0255

The topPercent parameter retains only those BLAST hits whose bit scores fall within a given percentage of the bit score of the best BLAST hit. The larger the value, the more dissimilar database hits are included in the taxonomic classification of the query sequence. The default value of this parameter in MEGAN is 10%. To compare the performance of BLCA and MEGAN under different stringencies of retaining BLAST hits, we set topPercent to 5, 10 and 20% for both programs, the range recommended by the original MEGAN publication. Each table entry shows the average and standard deviation of the F-scores, based on a confidence score threshold of 0.8, for each tested program at the corresponding 16S region. The F-scores of BLCA are much less sensitive to the value of topPercent than those of MEGAN.
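The effect of topPercent on the set of retained hits can be illustrated with a minimal Python sketch. This is not the BLCA or MEGAN implementation: the function name `filter_top_percent`, the hit names, the bit scores, and the inclusive cutoff computed as a fraction of the best bit score are all assumptions made for illustration.

```python
from typing import List, Tuple

Hit = Tuple[str, float]  # (hypothetical subject name, bit score)


def filter_top_percent(hits: List[Hit], top_percent: float) -> List[Hit]:
    """Keep hits whose bit score is within top_percent % of the best bit score.

    Assumption: the cutoff is best_score * (1 - top_percent / 100) and is
    inclusive; the actual tools may define the boundary differently.
    """
    if not hits:
        return []
    best_score = max(score for _, score in hits)
    cutoff = best_score * (1.0 - top_percent / 100.0)
    return [(name, score) for name, score in hits if score >= cutoff]


# Made-up BLAST hits for a single query sequence.
example_hits = [
    ("taxonA", 500.0),
    ("taxonB", 480.0),
    ("taxonC", 455.0),
    ("taxonD", 410.0),
]

for pct in (5, 10, 20):
    kept = filter_top_percent(example_hits, pct)
    print(pct, [name for name, _ in kept])
```

With these invented scores, raising topPercent from 5 to 20 admits progressively lower-scoring hits, which is the behaviour the table probes.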


ISSN: 1471-2105
