Figure 2 | BMC Bioinformatics

Figure 2

From: Subfamily specific conservation profiles for proteins based on n-gram patterns

Initial steps in the NPLA algorithm: (a) For each shared pair of zero-offset NP{4,2} patterns between a query sequence and a target sequence, the non-wildcard positions in a collection sequence equal in length to the query sequence are set to 1. (b) The target sequences are divided into 20 sets (bins) based on the similarity of their NP{4,2} pattern content. (c) Raw conservation profiles are generated for each similarity bin by summing over the collection sequences associated with the bin and dividing by the number of members in the bin.
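
The three steps in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that the caption does not confirm: an NP{4,2} pattern is taken to be a 6-residue window with 4 fixed residues and 2 wildcard positions; "zero-offset" is taken to mean that the pattern starts at the same position in the query and the target; and the similarity used for binning is taken to be the fraction of query patterns shared with the target. All function names are hypothetical.

```python
from itertools import combinations

WINDOW = 6  # assumed pattern span: 4 fixed residues + 2 wildcards
FIXED = 4   # assumed number of non-wildcard positions


def patterns(seq):
    """Return every (start, fixed_offsets, residues) pattern found in seq."""
    found = set()
    for start in range(len(seq) - WINDOW + 1):
        for keep in combinations(range(WINDOW), FIXED):
            residues = tuple(seq[start + k] for k in keep)
            found.add((start, keep, residues))
    return found


def collection_sequence(query, target):
    """Step (a): mark the non-wildcard positions of each shared pattern with 1."""
    marks = [0] * len(query)
    for start, keep, _ in patterns(query) & patterns(target):
        for k in keep:
            marks[start + k] = 1
    return marks


def similarity(query, target):
    """Assumed similarity measure: fraction of query patterns shared with target."""
    query_patterns = patterns(query)
    if not query_patterns:
        return 0.0
    return len(query_patterns & patterns(target)) / len(query_patterns)


def bin_index(sim, n_bins=20):
    """Step (b): map a similarity in [0, 1] to one of n_bins equal-width bins."""
    return min(int(sim * n_bins), n_bins - 1)


def raw_profiles(query, targets, n_bins=20):
    """Step (c): per bin, sum the collection sequences and divide by bin size."""
    sums = [[0] * len(query) for _ in range(n_bins)]
    counts = [0] * n_bins
    for target in targets:
        b = bin_index(similarity(query, target), n_bins)
        counts[b] += 1
        for i, value in enumerate(collection_sequence(query, target)):
            sums[b][i] += value
    # Empty bins are omitted rather than reported as zero profiles.
    return {b: [s / counts[b] for s in sums[b]]
            for b in range(n_bins) if counts[b]}
```

Calling `raw_profiles(query, targets)` returns a dictionary keyed by bin index. Each value is a list with one entry per query position, giving the fraction of targets in that bin that share a pattern covering the position.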
