Fig. 2
From: Removing duplicate reads using graphics processing units

Comparison. The first read of each cluster is taken as a seed, and its suffix is compared with the suffixes of the other sequences in the cluster. Sequences that differ from the seed by fewer mismatches than a user-defined threshold are considered duplicates of the seed. Each set of duplicates is removed from the cluster and represented by a consensus sequence. The process is iterated until the cluster is empty. Image from [16] used under the terms of the Creative Commons Attribution License (CC BY).
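
The comparison step described in the caption can be sketched on the CPU as follows. This is only an illustrative sketch, not the article's GPU implementation. The names `hamming`, `consensus`, `dedupe_cluster`, `prefix_len`, and `max_mismatches` are hypothetical, and the sketch assumes that all reads in a cluster have the same length, that the suffix is everything after the first `prefix_len` bases, that mismatches are counted position by position, and that the consensus is a per-column majority vote over the seed and its duplicates. The caption does not specify any of these details.

```python
from collections import Counter


def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def consensus(reads):
    """Per-column majority vote (assumed rule; ties go to the base seen first)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))


def dedupe_cluster(cluster, prefix_len, max_mismatches):
    """Collapse one cluster of reads into consensus sequences.

    prefix_len and max_mismatches are hypothetical parameter names;
    max_mismatches plays the role of the user-defined threshold.
    """
    remaining = list(cluster)
    result = []
    while remaining:  # iterate until the cluster is empty
        seed = remaining[0]  # first read of the cluster is the seed
        group = [seed]
        rest = []
        for read in remaining[1:]:
            # Only suffixes are compared; strictly fewer mismatches
            # than the threshold marks the read as a duplicate.
            if hamming(seed[prefix_len:], read[prefix_len:]) < max_mismatches:
                group.append(read)
            else:
                rest.append(read)
        result.append(consensus(group))  # one consensus per duplicate set
        remaining = rest  # the duplicate set is removed from the cluster
    return result


if __name__ == "__main__":
    reads = ["ACGTAAAA", "ACGTAAAT", "ACGTCCCC", "ACGTAAAA"]
    print(dedupe_cluster(reads, prefix_len=4, max_mismatches=2))
```

With these example reads, the second and fourth reads fall within the threshold of the first and are merged into one consensus, while the third read survives as its own seed in the next iteration.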