Figure 1 | BMC Bioinformatics

Figure 1

From: Computational identification of strain-, species- and genus-specific proteins

Flow chart of the method used to detect strain-, species-, or genus-specific proteins. Proteins were designated as not unique if (a) the protein is part of a merged entry (containing identical proteins from different organisms) and one of the source organisms is considered non-self; (b) the query protein hits a non-self subject with E < 0.001; (c) the non-self best hit, when itself used as a query in a reverse BLAST, retrieves the initial query as its organism-specific best hit (that is, the two are reciprocal best hits, and thus potential orthologs); and (d) the non-self best hit retrieves the initial query, though not as its potential ortholog, and manual inspection reveals homology. Sequences that could not retrieve themselves with an expect value of 1e-14 or better in the forward BLAST using the BLOSUM62 matrix were retested using the PAM30 matrix.
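
The criteria in the caption can be read as a small decision procedure. The following Python sketch is one possible reading, not the authors' implementation: it assumes that criteria (a) to (d) are alternatives checked in the order listed, and that the BLAST results have already been summarized into a record. The `Evidence` class, its field names, and both function names are hypothetical; only the two thresholds (E < 0.001 and 1e-14) come from the caption.

```python
from dataclasses import dataclass
from typing import Optional

NON_SELF_E_CUTOFF = 1e-3   # caption: non-self hit with E < 0.001
SELF_HIT_E_CUTOFF = 1e-14  # caption: self-retrieval at 1e-14 or better


@dataclass
class Evidence:
    """Hypothetical summary of the BLAST evidence for one query protein."""
    merged_with_non_self: bool = False            # criterion (a)
    best_non_self_evalue: Optional[float] = None  # criterion (b); None = no non-self hit
    reverse_best_hit_is_query: bool = False       # criterion (c)
    reverse_retrieves_query: bool = False         # criterion (d), part 1
    manual_homology: bool = False                 # criterion (d), part 2


def not_unique_reason(ev: Evidence) -> Optional[str]:
    """Return the letter of the first criterion met, or None if none applies."""
    if ev.merged_with_non_self:
        return "a"
    if ev.best_non_self_evalue is not None and ev.best_non_self_evalue < NON_SELF_E_CUTOFF:
        return "b"
    if ev.reverse_best_hit_is_query:
        return "c"
    # Assumed reading of (d): the query is retrieved, but not as the
    # reciprocal best hit, and a curator confirms homology by inspection.
    if ev.reverse_retrieves_query and ev.manual_homology:
        return "d"
    return None


def matrix_for_query(self_hit_evalue: Optional[float]) -> str:
    """Choose the scoring matrix following the caption's retest rule."""
    # "1e-14 or better" is taken here to mean E <= 1e-14 (an assumption).
    if self_hit_evalue is not None and self_hit_evalue <= SELF_HIT_E_CUTOFF:
        return "BLOSUM62"
    return "PAM30"
```

Under this reading, a protein for which `not_unique_reason` returns `None` remains a candidate strain-, species-, or genus-specific protein, while `matrix_for_query` only indicates when the PAM30 retest described in the caption would be triggered.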
