Fig. 2 | BMC Bioinformatics

Fig. 2

From: Removing duplicate reads using graphics processing units

Comparison. The first read of each cluster is taken as a seed, and its suffix is compared with the suffixes of the other sequences in the cluster. Sequences that differ from the seed by fewer mismatches than a user-defined threshold are considered duplicates of the seed. Each set of duplicates is removed from the cluster and represented by a consensus sequence. The process is iterated until the cluster is empty. Image from [16], used under the terms of the Creative Commons Attribution License (CC BY).
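
The comparison step described in the caption can be sketched in a few lines of Python. This is a minimal CPU illustration, not the authors' GPU implementation. The function names, the `prefix_len` parameter (the position where the compared suffix starts), the use of Hamming distance over equal-length reads, and the per-column majority vote for the consensus are all assumptions made for this example.

```python
from collections import Counter


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def consensus(reads):
    """Per-column majority vote (an assumed consensus rule for this sketch)."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))


def dedup_cluster(cluster, prefix_len, max_mismatches):
    """Collapse one cluster of equal-length reads into consensus sequences.

    prefix_len and max_mismatches are hypothetical parameter names:
    the suffix is taken as read[prefix_len:], and a read is a duplicate
    of the seed when its suffix has fewer than max_mismatches mismatches.
    """
    remaining = list(cluster)
    representatives = []
    while remaining:                      # iterate until the cluster is empty
        seed = remaining[0]               # first read is the seed
        seed_suffix = seed[prefix_len:]
        duplicates, rest = [seed], []
        for read in remaining[1:]:
            if hamming(seed_suffix, read[prefix_len:]) < max_mismatches:
                duplicates.append(read)   # close enough to the seed
            else:
                rest.append(read)         # stays for a later iteration
        representatives.append(consensus(duplicates))
        remaining = rest
    return representatives
```

For example, `dedup_cluster(["ACGTAAAA", "ACGTAAAT", "ACGTCCCC", "ACGTAAAA", "ACGTCCCG", "ACGTCCCC"], 4, 2)` yields two representatives, one for each group of near-identical suffixes. Because the threshold test is strict, a read with exactly `max_mismatches` mismatches is not treated as a duplicate.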
