Table 3 Sensitivity and specificity of competing methods in plasmid experiment

From: ViVaMBC: estimating viral sequence variation in complex populations from illumina deep-sequencing data using model-based clustering

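| SNP (WT) | LoFreq 1:200 | LoFreq 1:100 | LoFreq 1:50 | V-Phaser2 1:200 | V-Phaser2 1:100 | V-Phaser2 1:50 | ShoRAH 1:200 | ShoRAH 1:100 | ShoRAH 1:50 |
|---|---|---|---|---|---|---|---|---|---|
| A (G) | / | 1.03 | 2.41 | 0.59 | 1.06 | 2.37 | / | / | 2.22 |
| G (C) | 0.54 | 1.01 | 2.38 | / | 0.94 | 2.33 | / | / | 2.22 |
| A (C) | 0.66 | 1.03 | 2.16 | / | / | / | 0.44* | 0.80* | 1.78* |
| A (G) | 0.48 | 0.91 | 2.10 | 0.52 | 1.04 | 2.11 | 0.44* | 0.80* | 1.78* |
| A (G) | / | 0.89 | 2.05 | 0.48 | / | 2.07 | / | / | 1.28 |
| No. of false SNPs | 3 | 5 | 2 | 19 | 32 | 24 | 4 | 1 | 1 |
| Max. freq. of false SNPs | 1.04 | 1.01 | 1.02 | 0.97 | 1.40 | 0.72 | 0.92* | 0.5* | 0.89 |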
  1. Frequency estimates (%) of the true SNPs obtained by applying the algorithms LoFreq, V-Phaser 2 and ShoRAH to the plasmid mixtures at ratios of 1:200, 1:100 and 1:50. Two SNPs should be present in codon 36 and three in codon 155. For ShoRAH, the frequency is estimated from three overlapping windows, but the variant is often detected in only two of the three windows (denoted with *). None of the methods seems able to retrieve all 5 SNPs at 0.5%. The bottom rows of the table report the total number of false SNPs over the whole NS3 region (543 bp long) together with their maximum frequency. The total number of false-positive findings is very low for all methods, but their frequencies rise close to 1%, which hampers the distinction of true SNPs from these false-positive findings.
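The sensitivity summarized in the table can be recomputed from the transcribed values. The following is a minimal sketch, not the authors' analysis code. It assumes that a "/" entry means the SNP was not reported, that a mixing ratio 1:N corresponds to an expected minor frequency of 1/N (consistent with the note equating 1:200 with 0.5%), and that starred ShoRAH values count as detections. The names `ESTIMATES`, `expected_frequency` and `sensitivity` are hypothetical helpers introduced for illustration.

```python
# Frequency estimates (%) transcribed from Table 3, one list per mixing ratio,
# in the row order of the table (5 true SNPs). None stands for a "/" entry,
# read here as "SNP not reported" (an assumption).
ESTIMATES = {
    "LoFreq": {
        "1:200": [None, 0.54, 0.66, 0.48, None],
        "1:100": [1.03, 1.01, 1.03, 0.91, 0.89],
        "1:50": [2.41, 2.38, 2.16, 2.10, 2.05],
    },
    "V-Phaser 2": {
        "1:200": [0.59, None, None, 0.52, 0.48],
        "1:100": [1.06, 0.94, None, 1.04, None],
        "1:50": [2.37, 2.33, None, 2.11, 2.07],
    },
    "ShoRAH": {
        # Starred values (found in two of three windows) are counted as detected.
        "1:200": [None, None, 0.44, 0.44, None],
        "1:100": [None, None, 0.80, 0.80, None],
        "1:50": [2.22, 2.22, 1.78, 1.78, 1.28],
    },
}


def expected_frequency(ratio: str) -> float:
    """Expected minor-variant frequency in %, reading '1:N' as 1/N (assumption)."""
    minor, major = (int(part) for part in ratio.split(":"))
    return 100.0 * minor / major


def sensitivity(estimates: list) -> float:
    """Fraction of the true SNPs that received a frequency estimate."""
    detected = sum(value is not None for value in estimates)
    return detected / len(estimates)


if __name__ == "__main__":
    for method, by_ratio in ESTIMATES.items():
        for ratio, values in by_ratio.items():
            print(
                f"{method:10s} {ratio:6s} "
                f"expected {expected_frequency(ratio):.1f}%  "
                f"sensitivity {sensitivity(values):.0%}"
            )
```

Under these assumptions, no method reaches a sensitivity of 100% at the 1:200 ratio, which matches the observation in the table note.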