Fig. 3 | BMC Bioinformatics

Fig. 3

From: Fast batch searching for protein homology based on compression and clustering

Illustration of the compression process. Seq a and seq b are sequences taken from the original sequence set that contain the same key ‘SERGK’ and whose subsequences share more than 80% similarity. Seq b is compressed by removing the segment that is similar to its counterpart in seq a. So that seq b remains complete and can be restored, a script records the differences between seq a and the compressed seq b: ‘a, 15, 43’ records the site of the removed segment, and ‘r6L, r8A, r3V, i5D’ records the small differences relative to the representative sequence.
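The role of the script can be sketched in a few lines of Python. This is only an illustrative reading of the caption, not the authors' implementation: the function names `parse_script` and `rebuild_segment` are hypothetical, and the sketch assumes that ‘a, 15, 43’ means 1-based inclusive positions 15 to 43 in seq a, that ‘r’ means "replace" and ‘i’ means "insert", and that each number is a 1-based position within the segment, applied in the order listed. The caption does not say whether positions are absolute or relative, so those choices are assumptions.

```python
import re

# Assumed token shape: operation letter, position, residue (e.g. "r6L").
EDIT_RE = re.compile(r"([ri])(\d+)([A-Z])")


def parse_script(link, edits):
    """Parse a link such as 'a, 15, 43' and edits such as 'r6L, r8A'."""
    rep_id, start, end = [field.strip() for field in link.split(",")]
    ops = []
    for token in edits.split(","):
        token = token.strip()
        if not token:
            continue
        match = EDIT_RE.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised edit token: {token!r}")
        ops.append((match.group(1), int(match.group(2)), match.group(3)))
    return rep_id, int(start), int(end), ops


def rebuild_segment(rep_seq, start, end, ops):
    """Rebuild the removed segment from the representative sequence.

    Assumption: start/end are 1-based and inclusive; each edit position is
    1-based within the segment as it stands when that edit is applied.
    """
    segment = list(rep_seq[start - 1:end])
    for kind, pos, residue in ops:
        if kind == "r":
            segment[pos - 1] = residue        # substitute one residue
        else:
            segment.insert(pos - 1, residue)  # insert so it lands at pos
    return "".join(segment)


# Toy data, not taken from the figure.
rep_id, start, end, ops = parse_script("a, 3, 10", "r2L, i1D")
print(rebuild_segment("MKTAYIAKQRSERGKLLVAG", start, end, ops))
```

Under these assumptions the compressed seq b only needs to store its unmatched residues plus the short script, since the removed segment can be regenerated from seq a on demand.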
