Fig. 3 | BMC Bioinformatics

Fig. 3

From: Frugal alignment-free identification of FLT3-internal tandem duplications with FiLT3r


Computing the duplication length. Three examples are shown, all using the same reference as in Fig. 1. The first example (left) also uses the same read as in Fig. 1. The break \((j_1,j_2)=(3,4)\) is due to a duplication. Its length is computed as \(t_R[j_1-1]-t_R[j_2+1]+j_2-j_1+2=5-4+4-3+2=4\). In the second example (right) a single nucleotide is modified, leading to a longer break (1, 4). Thus \(j'_1\) is 2 lower than \(j_1\) in the first example. However, \(t_{R'}[j'_1-1]=3\) is also 2 lower than in the first example (where \(t_R[j_1-1]=5\)), so the two changes cancel out and the duplication length is identical: \(3-4+4-1+2=4\). This shows that our algorithm can detect the duplication even when it contains a substitution. The process is similar with an indel. In the third example (bottom) the duplication starts with the same letter (C) as the letter that follows the duplication, in position 10. As a consequence the break is shorter, because the k-mer at position 3, which overlaps the duplication breakpoint, exists in the reference. The duplication length is nevertheless correctly computed as \(t_R[j''_1-1]-t_R[j''_2+1]+j''_2-j''_1+2=6-3+2=5\).
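
The arithmetic of the first two examples can be checked with a minimal Python sketch. The function name `duplication_length` and the use of a dictionary to hold only the values of \(t_R\) quoted in the caption are assumptions made for illustration; this is not the FiLT3r implementation, and the third example is left out because the caption does not state its break positions.

```python
def duplication_length(t, j1, j2):
    """Duplication length implied by a break (j1, j2).

    `t` maps a read position to the corresponding t_R value from the caption.
    Only the two positions the formula needs (j1 - 1 and j2 + 1) must be present.
    """
    return t[j1 - 1] - t[j2 + 1] + j2 - j1 + 2


# First example: break (3, 4), with t_R[2] = 5 and t_R[5] = 4.
first = duplication_length({2: 5, 5: 4}, 3, 4)

# Second example: break (1, 4), with t_R'[0] = 3 and t_R'[5] = 4.
second = duplication_length({0: 3, 5: 4}, 1, 4)

print(first, second)  # prints "4 4"
```

In this sketch the start of the break and the value of \(t\) just before it both drop by 2 between the two examples, which is why the computed length stays the same.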
