Table 12 SNP calling on the H. sapiens dataset with and without compression

From: QualComp: a new lossy compressor for quality scores based on rate distortion theory

One cluster
R     MSE    T.P.   F.P.   F.N.  Selectivity (%)  Sensitivity (%)  Size (MB)
0     75.64  54945  11560  5482  82.62            90.93            0.027
0.20  13.95  58806  5952   1621  90.81            97.32            27.37
0.25  12.55  58881  5707   1546  91.16            97.44            34.20
0.50  9.18   59078  5022   1349  92.17            97.77            68.38
1.00  6.53   59349  4541   1078  92.89            98.22            136.74
2.00  3.50   59628  3814   799   93.99            98.68            273.45
Two clusters
R     MSE    T.P.   F.P.   F.N.  Selectivity (%)  Sensitivity (%)  Size (MB)
0     25.21  51007  5010   9420  91.05            84.41            0.054
0.20  9.09   58955  4949   1472  92.25            97.56            27.39
0.25  8.53   59002  4951   1425  92.25            97.64            34.23
0.50  7.17   59188  4784   1239  92.52            97.94            68.41
1.00  5.42   59400  4559   1027  92.87            98.30            136.76
2.00  3.02   59601  3718   826   94.12            98.63            273.48
Three clusters
R     MSE    T.P.   F.P.   F.N.  Selectivity (%)  Sensitivity (%)  Size (MB)
0     17.32  52922  4686   7505  91.87            87.58            0.082
0.20  7.80   58913  4823   1514  92.43            97.49            27.42
0.25  7.26   58977  4766   1450  92.52            97.60            34.26
0.50  5.90   59111  4411   1316  93.06            97.82            68.44
1.00  4.16   59247  4041   1180  93.61            98.05            136.79
2.00  1.99   59589  3262   838   94.81            98.61            273.51
  1. We compare the SNPs that Samtools detects from the original FASTQ file with those it detects from the compressed files, which were produced by QualComp with one, two and three clusters at different rates. For more details see Table 11.
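
The table does not state how the two percentage columns are derived. The sketch below assumes that selectivity is T.P. / (T.P. + F.P.) and sensitivity is T.P. / (T.P. + F.N.); these definitions are an assumption, although they do reproduce the tabulated values. The function names are illustrative and not taken from the paper.

```python
def selectivity(tp: int, fp: int) -> float:
    """Assumed definition: percentage of called SNPs that are true positives."""
    return 100.0 * tp / (tp + fp)


def sensitivity(tp: int, fn: int) -> float:
    """Assumed definition: percentage of reference SNPs that were recovered."""
    return 100.0 * tp / (tp + fn)


# Counts copied from the "One cluster" block of the table.
rows = {
    "R = 0": {"tp": 54945, "fp": 11560, "fn": 5482},
    "R = 2.00": {"tp": 59628, "fp": 3814, "fn": 799},
}

for label, c in rows.items():
    sel = selectivity(c["tp"], c["fp"])
    sen = sensitivity(c["tp"], c["fn"])
    print(f"{label}: selectivity {sel:.2f}%, sensitivity {sen:.2f}%")
```

Under these definitions, T.P. + F.N. is the same in every row (60427), which is consistent with all runs being scored against a single reference set of SNPs.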