Figure 1 | BMC Bioinformatics

Figure 1

From: ViVaMBC: estimating viral sequence variation in complex populations from illumina deep-sequencing data using model-based clustering

Influence of coverage depth on the estimation of τ_j. Datasets with lower coverage were generated by randomly sampling a fraction (f = 0.1, 0.2, …, 0.8, 0.9) of the reads from the original dataset. Ten datasets were generated for each fraction f, resulting in 90 datasets with average coverages ranging between 6,463 and 58,185. The reported variants for all re-sampled datasets were plotted and colored according to the discovered codon. The green dots indicate the true variant; all other dots are false-positive findings. The average frequency of the true variant (averaged over the ten random samples) is indicated with triangles. The dotted line is the true frequency as estimated from the original dataset. Lowering the coverage increases the bias and the variance of the estimate, as well as the number of false-positive findings.
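
The re-sampling scheme described in the caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the fixed seed, the placeholder read identifiers, and the choice to sample without replacement are all assumptions made for illustration, since the caption only states that a fraction f of the reads was drawn at random, ten times per fraction.

```python
import random


def subsample_reads(reads, fraction, rng):
    """Draw round(fraction * len(reads)) reads at random.

    Sampling without replacement is an assumption; the caption does not
    say which scheme was used.
    """
    k = round(fraction * len(reads))
    return rng.sample(reads, k)


def make_resampled_datasets(reads, fractions, replicates=10, seed=0):
    """Build `replicates` random subsets for every fraction.

    Returns a dict keyed by (fraction, replicate_index).
    The seed is hypothetical and only makes the sketch reproducible.
    """
    rng = random.Random(seed)
    datasets = {}
    for f in fractions:
        for r in range(replicates):
            datasets[(f, r)] = subsample_reads(reads, f, rng)
    return datasets


# Fractions f = 0.1, 0.2, ..., 0.9 as in the caption.
fractions = [round(0.1 * i, 1) for i in range(1, 10)]

# Placeholder read identifiers standing in for the original dataset.
reads = [f"read_{i}" for i in range(1000)]

datasets = make_resampled_datasets(reads, fractions)
print(len(datasets))  # 9 fractions x 10 replicates
```

Each subset would then be passed to the variant caller separately, so that the spread of the reported frequencies across the ten replicates at a given f reflects the variance of the estimate at that coverage.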
