Figure 1
From: Fast lossless compression via cascading Bloom filters

The encoding process. Step 1 separates the unique reads set R' from the repeated reads set FN. In step 2, the unique reads (R') are hashed into a Bloom filter (BF) B1 and the rest are assigned to a set FN. In steps 3-4, all read-length sequences of the reference genome G are queried, and those accepted by B1 that are not in R' are added to FP1. Steps 5-10 show subsequent encoding via a BF cascade: the false positives relative to each BF are input to the next BF, and each BF is then queried using the set loaded into the previous BF in the cascade. In step 11, additional compression is performed on the resulting BFs and sets. Orange arrows indicate assignments. Purple arrows marked with Q(.) indicate BF queries, with the queried sets denoted in parentheses. Blue arrows indicate BF loading.
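The cascade in steps 2-10 can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `BloomFilter` class, the function names (`windows`, `encode`, `decode`), and the parameters `n_filters`, `m_bits`, and `k_hashes` are hypothetical and chosen only for illustration. The sketch assumes every read in R' is an exact read-length substring of G, and it omits both the FN set from steps 1-2 and the additional compression of step 11. The `decode` function is one reading of how such a cascade could be inverted, included only to check that the encoding is lossless.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: k hash positions over an m-bit array (illustrative only)."""

    def __init__(self, m_bits, k_hashes):
        self.m = m_bits
        self.k = k_hashes
        self.bits = bytearray((m_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def windows(genome, read_len):
    """All distinct read-length substrings of the reference genome G."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}


def encode(unique_reads, genome, read_len, n_filters=4, m_bits=64, k_hashes=2):
    """Build the BF cascade; return (filters, final false-positive set)."""
    loaded = set(unique_reads)                    # R' is loaded into B1
    query = windows(genome, read_len) - loaded    # genome windows that are not reads
    filters = []
    for _ in range(n_filters):
        bf = BloomFilter(m_bits, k_hashes)
        for item in loaded:
            bf.add(item)
        filters.append(bf)
        # Items of the query set accepted by this BF are its false positives.
        false_pos = {x for x in query if x in bf}
        # The false positives feed the next BF, which is then queried with
        # the set that was loaded into the current one.
        loaded, query = false_pos, loaded
    return filters, loaded                        # last FP set is kept explicitly


def decode(filters, final_fp, genome, read_len):
    """Recover R' by querying every genome window against the cascade."""
    n = len(filters)
    reads = set()
    for w in windows(genome, read_len):
        depth = 0
        while depth < n and w in filters[depth]:
            depth += 1
        if depth < n:
            # First rejection after an odd number of acceptances marks a read.
            is_read = depth % 2 == 1
        else:
            # Accepted by every BF: resolve using the explicit final FP set.
            is_read = (w in final_fp) == (n % 2 == 0)
        if is_read:
            reads.add(w)
    return reads


if __name__ == "__main__":
    G = "ACGTTGCAAGCTTAGCCGATAGGCTTACGATCGATTGCA"
    R = {G[i:i + 8] for i in (0, 5, 11, 20)}
    bfs, fp = encode(R, G, 8)
    print(decode(bfs, fp, G, 8) == R)
```

The deliberately small `m_bits` default makes false positives likely, so the later filters in the cascade actually receive input. In practice each successive filter would be sized for a much smaller set than the one before it, which is where the space saving of a cascade comes from.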