
Table 2 Simulated bisulfite read experiment

From: Probabilistic alignment leads to improved accuracy and read coverage for bisulfite sequencing data

| Evaluation metric | GNUMAP-bs | Novoalign | BSMAP | Bismark | Bismark-bt2 | LAST |
|---|---|---|---|---|---|---|
| Overall mapping results: | | | | | | |
| Total reads aligned (%) | 156.6 M (97.8) | 155.8 M (97.4) | 153.4 M (95.9) | 149.5 M (93.4) | 145.4 M (90.8) | 158.7 M (99.2) |
| Correctly aligned (%) | 155.2 M (97.0) | 154.2 M (96.3) | 150.2 M (93.9) | 149.2 M (93.2) | 145.2 M (90.7) | 155.1 M (96.9) |
| Incorrectly aligned (%) | 1.4 M (0.9) | 1.7 M (1.1) | 1.5 M (1.0) | 0.3 M (0.2) | 0.2 M (0.1) | 3.6 M (2.3) |
| With ≥1 sequence variant: | | | | | | |
| Total reads aligned (%) | 69.0 M (97.8) | 66.0 M (93.6) | 65.3 M (92.6) | 63.6 M (90.2) | 59.6 M (84.4) | 70.3 M (99.7) |
| Correctly aligned (%) | 67.7 M (96.0) | 65.3 M (92.6) | 63.9 M (90.1) | 63.3 M (89.8) | 59.4 M (84.1) | 66.7 M (94.6) |
| Incorrectly aligned (%) | 1.3 M (1.8) | 0.7 M (1.0) | 1.4 M (2.0) | 0.3 M (0.4) | 0.2 M (0.3) | 3.5 M (5.1) |
| Predicted methylation: | | | | | | |
| Average absolute estimation error | 0.11 | 0.69 | 0.22 | 0.11 | 0.10 | - |
| Standard error | 0.056 | 0.066 | 0.067 | 0.064 | 0.062 | - |
| Computational resource: | | | | | | |
| Total compute time (16 CPUs) | 39 h 50 m | 29 h 25 m | 4 h 28 m | 46 h 16 m | 97 h 26 m | 58 h 20 m |
| Peak memory usage (GB) | 44.8 | 14.5 | 9.4 | 5.9 | 7.9 | 15.9 |
| Reads per second per CPU | 68 | 92 | 607 | 448 | 26 | 753 |

  1. Simulation study of 160 million (M) BSRs generated from the human genome reference sequence. The GNUMAP-bs algorithm was the most sensitive aligner, especially for reads with ≥1 sequence variant (sequencing errors or mutations). The Bismark algorithm had the smallest error rate, with 1.2 M fewer erroneously assigned reads than GNUMAP-bs; however, GNUMAP-bs correctly aligned 6 to 10 M more reads. The BSMAP algorithm had the fastest total run time, but it was less sensitive than GNUMAP-bs. LAST mapped nearly all reads, with a sensitivity comparable to that of GNUMAP-bs, but its mapping error rate was much higher.
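
The comparisons in the note can be rechecked from the overall mapping rows of the table. The following is a minimal Python sketch, not part of the original analysis: the dictionaries hold the rounded values (in millions of reads) copied from the table, and the helper name `gap_vs_gnumap` is hypothetical. Because the table entries are rounded to 0.1 M, the recomputed gaps may differ slightly from figures derived from the unrounded counts.

```python
# Overall mapping results from Table 2, in millions of reads (rounded values).
correct = {
    "GNUMAP-bs": 155.2, "Novoalign": 154.2, "BSMAP": 150.2,
    "Bismark": 149.2, "Bismark-bt2": 145.2, "LAST": 155.1,
}
incorrect = {
    "GNUMAP-bs": 1.4, "Novoalign": 1.7, "BSMAP": 1.5,
    "Bismark": 0.3, "Bismark-bt2": 0.2, "LAST": 3.6,
}


def gap_vs_gnumap(table, aligner):
    """Return GNUMAP-bs minus the given aligner, in millions of reads (0.1 M precision)."""
    return round(table["GNUMAP-bs"] - table[aligner], 1)


# Compare GNUMAP-bs with both Bismark configurations.
for name in ("Bismark", "Bismark-bt2"):
    print(name,
          "correctly aligned gap:", gap_vs_gnumap(correct, name),
          "incorrectly aligned gap:", gap_vs_gnumap(incorrect, name))
```

With the rounded table values, GNUMAP-bs correctly aligns 6.0 M and 10.0 M more reads than Bismark and Bismark-bt2, respectively, while incorrectly aligning 1.1 M and 1.2 M more, which is in line with the ranges quoted in the note.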