Table 5 Performance comparison on the SRR005718 library among GPU-DupRemoval, FastUniq, CD-HIT-DUP, and Fulcrum. The library consists of 32,160,546 paired-end reads of 36 bp generated with an Illumina platform

From: Removing duplicate reads using graphics processing units

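| Tool | Prefix length | Mismatches | Removed | Time | Memory |
| --- | --- | --- | --- | --- | --- |
| GPU-DupRemoval | 36 | 0 | 2.9 % | 5 m | 6.6 GB |
| | 10 | 1 | 3.6 % | 5 m | 8.2 GB |
| | | 3 | 4.0 % | 4 m | 8.2 GB |
| | 15 | 1 | 3.5 % | 5 m | 6.9 GB |
| | | 3 | 3.9 % | 4 m | 6.9 GB |
| CD-HIT-DUP | N/A | 0 | 2.9 % | 6 m | 26.9 GB |
| | | 1 | 3.3 % | 8 m | 35.2 GB |
| | | 3 | 3.0 % | 11 m | 37.7 GB |
| Fulcrum | 36 | 0 | 2.9 % | 35 m | 720 MB |
| | 10 | 1 | 3.6 % | 1 h 4 m | 720 MB |
| | | 3 | 4.2 % | 1 h 10 m | 720 MB |
| | 15 | 1 | 3.6 % | 34 m | 1.4 GB |
| | | 3 | 4.1 % | 36 m | 1.0 GB |
| FastUniq | N/A | 0 | 2.9 % | 6 m | 10.1 GB |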

The first column reports the name of the tool. The second column reports the prefix length used for clustering the reads for GPU-DupRemoval and Fulcrum. The third column reports the constraint on the allowed number of mismatches. The fourth column reports the percentage of reads that have been removed. The fifth and sixth columns report the computing time and the peak memory required to perform the experiment. Tool settings: i) GPU-DupRemoval -g 0 -D 0 (for identical duplicates) and -g 0 -p <prefix_length> -D <nb_of_mismatches> (for nearly-identical duplicates); ii) CD-HIT-DUP -u 0 -c <nb_of_mismatches>; iii) Fulcrum -b <prefix_length> -s -t p (for clustering) and -q 0 -n 12 -s -t p -c <nb_mismatches>. <prefix_length> was set to 36 for identical duplicates and to 10/15 for nearly-identical duplicates.
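The Time and Memory columns mix units (minutes and hours, MB and GB), which makes rows harder to compare directly. The following is a minimal sketch of how the values could be brought to common units. The helper names `time_to_minutes` and `memory_to_gb` are hypothetical and not part of any of the tools, and the conversion assumes decimal units (1 GB = 1000 MB), which the table does not specify. Only the rows for identical duplicates (0 mismatches) are included.

```python
import re


def time_to_minutes(text: str) -> int:
    """Convert a table entry such as '5 m' or '1h 10 m' to whole minutes."""
    match = re.fullmatch(r"\s*(?:(\d+)\s*h)?\s*(?:(\d+)\s*m)?\s*", text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unrecognized time: {text!r}")
    hours, minutes = (int(g) if g else 0 for g in match.groups())
    return 60 * hours + minutes


def memory_to_gb(text: str) -> float:
    """Convert a table entry such as '6.6 GB' or '720 MB' to gigabytes.

    Assumption: decimal units, i.e. 1 GB = 1000 MB.
    """
    match = re.fullmatch(r"\s*([\d.]+)\s*(MB|GB)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized memory size: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value / 1000 if unit == "MB" else value


# Rows for identical duplicates (0 mismatches), copied from the table.
rows = {
    "GPU-DupRemoval": ("5 m", "6.6 GB"),
    "CD-HIT-DUP": ("6 m", "26.9 GB"),
    "Fulcrum": ("35 m", "720 MB"),
    "FastUniq": ("6 m", "10.1 GB"),
}

normalized = {
    tool: (time_to_minutes(t), memory_to_gb(m)) for tool, (t, m) in rows.items()
}

for tool, (minutes, gigabytes) in normalized.items():
    print(f"{tool:15s} {minutes:4d} min {gigabytes:6.2f} GB")
```

With the values in common units, the trade-off reported in the table is easier to see: for identical duplicates, Fulcrum uses the least memory but takes the longest, while GPU-DupRemoval is the fastest of the four tools.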