Figure 1 | BMC Bioinformatics

Figure 1

From: Clustering of cognate proteins among distinct proteomes derived from multiple links to a single seed sequence

BBH algorithm adopted for Seed Linkage. The algorithm starts by aligning the Seed sequence from the Seed Organism (black diamond) to sequences from all other organisms in the database (circles in Candidate Species), searching for a bidirectional best hit (BBH) in each species. The score BBH1, between the Seed and its highest-scoring BBH (black circle in Sp1), defines the inparalog retrieval score limit in the Seed Organism. The inparalogs of the Seed Organism are those sequences (grey diamonds) whose alignment score against the Seed exceeds BBH1 (dashed boxes within Seed Organism). The BBH scores (BBH1–4) are used to filter potential inparalogs (grey circles) from the respective Candidate Species (Sp1–4) when the BBHsj from each species (black circles) are used as secondary queries against the proteins of that Candidate Species genome. These thresholds aim to keep spurious sequences (white diamonds and circles) out of the clusters. Inclusion requires a BBH relationship between candidates (grey symbols to be incorporated) and already grouped sequences (black and grey symbols) within the respective Candidate Species.
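
The thresholding described in the caption can be illustrated with a small Python sketch. This is not the Seed Linkage implementation: the function names, the score table, and the protein identifiers are all hypothetical, alignment scores are assumed to be precomputed, and the final BBH check between candidates and already grouped sequences is left out for brevity.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical precomputed alignment scores: (query, subject) -> score.
Scores = Dict[Tuple[str, str], float]


def score(scores: Scores, query: str, subject: str) -> float:
    """Return the alignment score, or 0.0 when no hit was recorded."""
    return scores.get((query, subject), 0.0)


def best_hit(scores: Scores, query: str, proteome: List[str]) -> Optional[str]:
    """Highest-scoring protein in `proteome` for `query`, or None if no hit."""
    hits = [p for p in proteome if p != query and score(scores, query, p) > 0]
    return max(hits, key=lambda p: score(scores, query, p), default=None)


def seed_linkage_sketch(seed: str, seed_species: str,
                        proteomes: Dict[str, List[str]], scores: Scores):
    """Toy version of the score thresholds described in the caption."""
    cluster = {seed_species: [seed]}
    bbh_scores: Dict[str, float] = {}
    for species, proteome in proteomes.items():
        if species == seed_species:
            continue
        hit = best_hit(scores, seed, proteome)
        # Bidirectional check: the hit's best match in the seed organism
        # must be the seed itself.
        if hit is None or best_hit(scores, hit, proteomes[seed_species]) != seed:
            continue
        limit = score(scores, seed, hit)
        bbh_scores[species] = limit
        # Secondary query: keep proteins scoring above this species' BBH score.
        cluster[species] = [hit] + [
            p for p in proteome
            if p != hit and score(scores, hit, p) > limit
        ]
    if bbh_scores:
        # The highest BBH score (BBH1 in the figure) bounds seed inparalogs.
        bbh1 = max(bbh_scores.values())
        cluster[seed_species] += [
            p for p in proteomes[seed_species]
            if p != seed and score(scores, seed, p) > bbh1
        ]
    return cluster, bbh_scores


# Made-up data: s0 is the seed; s2 and a3 play the spurious sequences.
proteomes = {"Org0": ["s0", "s1", "s2"], "Sp1": ["a1", "a2", "a3"], "Sp2": ["b1"]}
scores = {
    ("s0", "a1"): 100, ("a1", "s0"): 100, ("s0", "a2"): 80, ("s0", "a3"): 20,
    ("s0", "b1"): 70, ("b1", "s0"): 70, ("s0", "s1"): 150, ("s0", "s2"): 40,
    ("a1", "a2"): 120, ("a1", "a3"): 30, ("a1", "s1"): 90,
}
cluster, bbh_scores = seed_linkage_sketch("s0", "Org0", proteomes, scores)
```

In this made-up example, s1 is kept as a seed inparalog because its score against the seed (150) exceeds the highest BBH score (100), and a2 is kept in Sp1 because its score against the secondary query a1 (120) exceeds that species' BBH score (100). The low-scoring s2 and a3 are left out, mirroring the white symbols in the figure.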
