Figure 1 | BMC Bioinformatics

Figure 1

From: Fast lossless compression via cascading Bloom filters

The encoding process. Step 1 separates the unique reads set R' from the repeated reads set FN. In step 2, the unique reads (R') are hashed into a Bloom filter (BF) B1 and the rest are assigned to a set FN. In steps 3-4, all read-length sequences of the reference genome G are queried against B1, and those accepted by B1 that are not in R' are added to FP1. Steps 5-10 show subsequent encoding via a BF cascade: the false positives relative to each BF are loaded into the next BF, and each BF is then queried with the set that was loaded into the previous BF in the cascade. In step 11, additional compression is performed on the resulting BFs and sets. Orange arrows indicate assignments. Purple arrows marked with Q(.) indicate BF queries, with the queried set given in parentheses. Blue arrows indicate BF loading.
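
The cascade in steps 2-10 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `BloomFilter` class, the SHA-256-based hashing, the filter sizes, and the names `encode`, `is_read`, and `decode` are all assumptions made for illustration. The sketch sets aside reads that never occur as a window of G in a plain set called `outside`; whether that corresponds to the figure's FN set is also an assumption. The final compression of step 11 is omitted.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; bit count and hash scheme are illustrative assumptions."""

    def __init__(self, n_bits, n_hashes):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _positions(self, item):
        # Derive n_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.n_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def windows(genome, read_len):
    """All read-length substrings of the reference genome G."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}


def encode(unique_reads, genome, read_len, n_filters=4, n_bits=64, n_hashes=2):
    candidates = windows(genome, read_len)
    outside = set(unique_reads) - candidates   # assumption: stored explicitly
    loaded = set(unique_reads) & candidates    # goes into B1
    queried = candidates - loaded              # windows that are not reads
    filters = []
    for _ in range(n_filters):
        bf = BloomFilter(n_bits, n_hashes)
        for s in loaded:
            bf.add(s)
        filters.append(bf)
        # False positives of this BF feed the next BF; the set just loaded
        # becomes the query set for the next BF.
        false_pos = {s for s in queried if s in bf}
        loaded, queried = false_pos, loaded
    return filters, loaded, outside            # loaded == last FP set, kept as a plain set


def is_read(s, filters, final_fp):
    """Decide whether genome window s was a read, using only the encoded data."""
    for i, bf in enumerate(filters):
        if s not in bf:
            # Rejected, so s is in this level's query set, which holds
            # reads at odd levels (0-indexed) and non-reads at even levels.
            return i % 2 == 1
    loaded_side_is_read = (len(filters) - 1) % 2 == 0
    return loaded_side_is_read != (s in final_fp)


def decode(filters, final_fp, genome, read_len):
    return {s for s in windows(genome, read_len) if is_read(s, filters, final_fp)}
```

Under these assumptions, taking the union of `decode(filters, final_fp, genome, read_len)` and `outside` returns the original unique reads regardless of how many false positives each filter produces, because every window accepted in error at one level is caught either by a later filter or by the explicit final set.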
