Phylogenetic profiles of two bacteria-specific proteins, A and B. The presence or absence of a homolog in a genome is represented by white and black squares, respectively. Arrows mark the positions at which the two profiles differ (bit differences). (a) The actual profiles of A and B are similar, with only one bit difference. (b) A shuffling strategy that permutes all entries of a profile yields at most 7 bit differences. (c) A restricted shuffling strategy, which takes the lineage specificity of the proteins into account and permutes only the entries corresponding to bacterial genomes, yields at most 5 bit differences. For lineage-specific proteins, an unrestricted shuffling process can artificially reduce the similarity scores of the shuffled profiles, thereby underestimating the probability that a random protein pair attains a given similarity score.
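
The contrast between the two shuffling strategies can be illustrated with a minimal Python sketch. It is not the implementation used to produce the figure: the 12-genome profiles, the `in_bacteria` mask, and all function names below are hypothetical, and similarity is measured simply as the number of bit differences. The sketch estimates how often a pair of shuffled profiles is at least as similar as the observed pair under each strategy.

```python
import random

def bit_differences(a, b):
    """Number of positions at which two equal-length profiles differ."""
    return sum(x != y for x, y in zip(a, b))

def shuffle_unrestricted(profile, rng):
    """Permute every entry of the profile."""
    shuffled = list(profile)
    rng.shuffle(shuffled)
    return shuffled

def shuffle_restricted(profile, in_lineage, rng):
    """Permute only the entries whose genome belongs to the lineage."""
    positions = [i for i, inside in enumerate(in_lineage) if inside]
    values = [profile[i] for i in positions]
    rng.shuffle(values)
    shuffled = list(profile)
    for i, value in zip(positions, values):
        shuffled[i] = value
    return shuffled

def empirical_p_value(a, b, shuffle, n_trials=2000, seed=0):
    """Fraction of shuffled pairs with no more bit differences than observed."""
    rng = random.Random(seed)
    observed = bit_differences(a, b)
    hits = sum(
        bit_differences(shuffle(a, rng), shuffle(b, rng)) <= observed
        for _ in range(n_trials)
    )
    return hits / n_trials

# Hypothetical data: 6 bacterial genomes followed by 6 non-bacterial genomes.
in_bacteria = [True] * 6 + [False] * 6
profile_a = [1, 1, 1, 0, 1, 0] + [0] * 6
profile_b = [1, 1, 1, 1, 1, 0] + [0] * 6  # differs from A at one position

p_unrestricted = empirical_p_value(profile_a, profile_b, shuffle_unrestricted)
p_restricted = empirical_p_value(
    profile_a,
    profile_b,
    lambda p, rng: shuffle_restricted(p, in_bacteria, rng),
)
print("unrestricted:", p_unrestricted)
print("restricted:  ", p_restricted)
```

With these assumed profiles, the unrestricted estimate comes out well below the restricted one, because scattering the bacterial entries across all genomes makes highly similar shuffled pairs rarer than they are within the lineage.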