Removing Noise From Pyrosequenced Amplicons

BMC Bioinformatics

Table 8 Chimera classification accuracies for ChimeraSlayer applied to the three denoised V2 'Uneven' data sets.

Dataset	Uneven1		Uneven2		Uneven3
Classification	Good	Chimeric	Good	Chimeric	Good	Chimeric
Good	86 (91.5%)	8 (8.5%)	72 (93.5%)	5 (6.5%)	72 (96.0%)	3 (4.0%)
Bimera	125 (15.3%)	688 (84.3%)	98 (14.6%)	571 (85.4%)	108 (12.8%)	735 (87.2%)
Trimera	20 (24.7%)	61 (75.3%)	13 (18.3%)	58 (81.7%)	15 (18.3%)	67 (81.7%)
Quadramera	0 (0.0%)	1 (100.0%)	0 (0.0%)	1 (50.0%)	--	--
Unclassified	55 (41.7%)	76 (57.6%)	15 (37.5%)	26 (62.5%)	27 (50.9%)	24 (45.3%)

Each row gives a separate category of denoised sequence according to its true classification as 'Good', 'Bimera', 'Trimera', 'Quadramera' or 'Unclassified'. The columns are split across data sets and give the number of sequences flagged as good or chimeric by ChimeraSlayer at 50% bootstrap. Occasionally ChimeraSlayer left a sequence unclassified, probably because there was no good NAST alignment. Consequently, rows do not always sum to 100%. The Broad Institute 'Gold' 16S rRNA sequences were used as references.
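Because rows need not sum to 100%, the number of sequences in a row that received no call can be estimated from the published counts and percentages. The following is a minimal sketch of that arithmetic, not part of the original analysis; the function name `estimate_unassigned` is hypothetical, and the row total it returns is only an estimate, since the percentages in the table are rounded to one decimal place.

```python
def estimate_unassigned(n_good, n_chimeric, pct_good, pct_chimeric):
    """Estimate a row's total and the number of sequences with no call.

    Hypothetical helper: the row total is inferred from one published
    count and its rounded percentage, so the result is approximate.
    """
    # Use the larger count, whose rounded percentage gives the
    # most precise estimate of the row total.
    if n_chimeric >= n_good:
        count, pct = n_chimeric, pct_chimeric
    else:
        count, pct = n_good, pct_good
    if count == 0 or pct == 0:
        raise ValueError("cannot infer a row total from a zero entry")
    total = round(count * 100.0 / pct)
    return total, total - n_good - n_chimeric


# Uneven1 rows as published: (good, chimeric, % good, % chimeric).
uneven1 = {
    "Good": (86, 8, 91.5, 8.5),
    "Bimera": (125, 688, 15.3, 84.3),
    "Trimera": (20, 61, 24.7, 75.3),
    "Unclassified": (55, 76, 41.7, 57.6),
}

for name, row in uneven1.items():
    total, unassigned = estimate_unassigned(*row)
    print(f"{name}: estimated total {total}, no call for {unassigned}")
```

Rows in which the two percentages already sum to 100% yield an estimate of zero sequences without a call.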

ISSN: 1471-2105