Tool | Prefix length | Mismatches | Removed reads | Time | Peak memory |
---|---|---|---|---|---|
GPU-DupRemoval | 36 | 0 | 2.9 % | 5 m | 6.6 GB |
 | 10 | 1 | 3.6 % | 5 m | 8.2 GB |
 |  | 3 | 4.0 % | 4 m | 8.2 GB |
 | 15 | 1 | 3.5 % | 5 m | 6.9 GB |
 |  | 3 | 3.9 % | 4 m | 6.9 GB |
CD-HIT-DUP | N/A | 0 | 2.9 % | 6 m | 26.9 GB |
 |  | 1 | 3.3 % | 8 m | 35.2 GB |
 |  | 3 | 3.0 % | 11 m | 37.7 GB |
Fulcrum | 36 | 0 | 2.9 % | 35 m | 720 MB |
 | 10 | 1 | 3.6 % | 1 h 4 m | 720 MB |
 |  | 3 | 4.2 % | 1 h 10 m | 720 MB |
 | 15 | 1 | 3.6 % | 34 m | 1.4 GB |
 |  | 3 | 4.1 % | 36 m | 1.0 GB |
FastUniq | N/A | 0 | 2.9 % | 6 m | 10.1 GB |
- The first column gives the name of the tool. The second column gives the prefix length used to cluster the reads (GPU-DupRemoval and Fulcrum only). The third column gives the maximum number of mismatches allowed. The fourth column gives the percentage of reads that were removed. The fifth and sixth columns give the computing time and the peak memory required to run the experiment. Tool settings: i) GPU-DupRemoval -g 0 -D 0 (for identical duplicates) and -g 0 -p <prefix_length> -D <nb_of_mismatches> (for nearly-identical duplicates); ii) CD-HIT-DUP -u 0 -c <nb_of_mismatches>; iii) Fulcrum -b <prefix_length> -s -t p (for clustering) and -q 0 -n 12 -s -t p -c <nb_mismatches>. <prefix_length> was set to 36 for identical duplicates and to 10 or 15 for nearly-identical duplicates.
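
To make the roles of the prefix length and the mismatch limit concrete, the following is a minimal toy sketch in Python. It is not the implementation of GPU-DupRemoval, Fulcrum, or any other tool in the table; the function names `hamming` and `remove_duplicates` and the greedy choice of a representative are assumptions made purely for illustration. Reads are first grouped by their leading `prefix_length` bases, and within each group a read is dropped if it differs from an already kept read in at most `max_mismatches` positions.

```python
from collections import defaultdict


def hamming(a, b):
    """Count mismatching positions between two equal-length reads."""
    return sum(x != y for x, y in zip(a, b))


def remove_duplicates(reads, prefix_length, max_mismatches):
    """Toy duplicate removal (illustrative only, not any tool's algorithm).

    1. Cluster reads that share the same prefix of `prefix_length` bases.
    2. Inside each cluster, keep a read only if it is not within
       `max_mismatches` mismatches of a read that was already kept.
    """
    clusters = defaultdict(list)
    for read in reads:
        clusters[read[:prefix_length]].append(read)

    kept = []
    for members in clusters.values():
        representatives = []
        for read in members:
            is_duplicate = any(
                len(read) == len(rep) and hamming(read, rep) <= max_mismatches
                for rep in representatives
            )
            if not is_duplicate:
                representatives.append(read)
        kept.extend(representatives)
    return kept


reads = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT", "TTGTACGTAC"]

# Identical duplicates only: the prefix spans the whole read, no mismatches.
identical = remove_duplicates(reads, prefix_length=10, max_mismatches=0)

# Nearly identical duplicates: shorter prefix, one mismatch allowed.
nearly = remove_duplicates(reads, prefix_length=4, max_mismatches=1)
```

In this sketch a shorter prefix puts more reads into the same cluster, so more candidate pairs are compared and more nearly identical reads can be detected, which is the trade-off the prefix lengths 10 and 15 in the table explore. If the prefix covers the whole read, reads that differ in their last base never meet in the same cluster, so a nonzero mismatch limit has no effect.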