
Table 12 SNP calling on the H. sapiens dataset with and without compression

From: QualComp: a new lossy compressor for quality scores based on rate distortion theory

    

One cluster

| R | MSE | T.P. | F.P. | F.N. | Selectivity (%) | Sensitivity (%) | Size (MB) |
|---|---|---|---|---|---|---|---|
| 0 | 75.64 | 54945 | 11560 | 5482 | 82.62 | 90.93 | 0.027 |
| 0.20 | 13.95 | 58806 | 5952 | 1621 | 90.81 | 97.32 | 27.37 |
| 0.25 | 12.55 | 58881 | 5707 | 1546 | 91.16 | 97.44 | 34.20 |
| 0.50 | 9.18 | 59078 | 5022 | 1349 | 92.17 | 97.77 | 68.38 |
| 1.00 | 6.53 | 59349 | 4541 | 1078 | 92.89 | 98.22 | 136.74 |
| 2.00 | 3.50 | 59628 | 3814 | 799 | 93.99 | 98.68 | 273.45 |

    

Two clusters

| R | MSE | T.P. | F.P. | F.N. | Selectivity (%) | Sensitivity (%) | Size (MB) |
|---|---|---|---|---|---|---|---|
| 0 | 25.21 | 51007 | 5010 | 9420 | 91.05 | 84.41 | 0.054 |
| 0.20 | 9.09 | 58955 | 4949 | 1472 | 92.25 | 97.56 | 27.39 |
| 0.25 | 8.53 | 59002 | 4951 | 1425 | 92.25 | 97.64 | 34.23 |
| 0.50 | 7.17 | 59188 | 4784 | 1239 | 92.52 | 97.94 | 68.41 |
| 1.00 | 5.42 | 59400 | 4559 | 1027 | 92.87 | 98.30 | 136.76 |
| 2.00 | 3.02 | 59601 | 3718 | 826 | 94.12 | 98.63 | 273.48 |

    

Three clusters

| R | MSE | T.P. | F.P. | F.N. | Selectivity (%) | Sensitivity (%) | Size (MB) |
|---|---|---|---|---|---|---|---|
| 0 | 17.32 | 52922 | 4686 | 7505 | 91.87 | 87.58 | 0.082 |
| 0.20 | 7.80 | 58913 | 4823 | 1514 | 92.43 | 97.49 | 27.42 |
| 0.25 | 7.26 | 58977 | 4766 | 1450 | 92.52 | 97.60 | 34.26 |
| 0.50 | 5.90 | 59111 | 4411 | 1316 | 93.06 | 97.82 | 68.44 |
| 1.00 | 4.16 | 59247 | 4041 | 1180 | 93.61 | 98.05 | 136.79 |
| 2.00 | 1.99 | 59589 | 3262 | 838 | 94.81 | 98.61 | 273.51 |

  1. We compare the SNPs that Samtools detects from the original FASTQ file with those obtained from the compressed files, which were produced by QualComp with one, two and three clusters at different rates. For more details see Table 11.
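
The table does not state how the two percentage columns are derived from the counts. A minimal sketch, assuming the common definitions selectivity = T.P. / (T.P. + F.P.) and sensitivity = T.P. / (T.P. + F.N.), reproduces the one-cluster row at R = 0. The function names are illustrative and are not taken from QualComp or Samtools.

```python
def selectivity(tp: int, fp: int) -> float:
    """Percentage of called SNPs that are true positives (assumed definition)."""
    return 100.0 * tp / (tp + fp)


def sensitivity(tp: int, fn: int) -> float:
    """Percentage of reference SNPs that were recovered (assumed definition)."""
    return 100.0 * tp / (tp + fn)


# Counts from the one-cluster row with R = 0.
tp, fp, fn = 54945, 11560, 5482

print(f"Selectivity: {selectivity(tp, fp):.2f} %")  # Selectivity: 82.62 %
print(f"Sensitivity: {sensitivity(tp, fn):.2f} %")  # Sensitivity: 90.93 %
```

Under these definitions T.P. + F.N. is the number of SNPs called from the uncompressed file, so it stays constant across rows (54945 + 5482 = 60427 here), which gives a quick consistency check on any row.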