Figure 3 | BMC Bioinformatics

Figure 3

From: SynBlast: Assisting the analysis of conserved synteny information

Evaluation of syntenic blocks. Panel (A) summarizes the mappings from query to target; panel (B) elaborates on particular cases. The query region (panel (A), middle) contains a sequence of syntenic query loci (green), each representing one or more possibly overlapping query proteins (i). Each candidate target region in the genomes of interest (panel (A), above and below the query locus) is identified by a set of BLAST hits (HSPs, yellow). For each region, the following steps are performed. First, the set of query-specific HSPs is chained (ii), resulting in one or more HSP chains that represent approximate protein models (small boxes). Filtering rules exclude an individual HSP from a chain for one of the following four reasons: (1) the resulting chain exceeds the prescribed size limit for a locus (B1) [default: twice the length of the query locus]; (2) the HSP is inconsistent with a co-linear ordering of the other HSPs in the chain (B2); (3) the HSP overlaps with another query interval by more than a specified threshold (B3) [default: 30 aa]; or (4) the HSP lies on the opposite strand (B4). After filtering, HSP chains that score below a bit-score threshold [default: 50] are excluded (B5). The retained HSP chains are then grouped (iii) into target loci (big open boxes), each containing all HSP chains with overlapping target intervals, irrespective of their orientation. Within each target locus, only the highest-scoring chain for each query protein is kept (B6). The result is a sequence of non-overlapping target loci (recall that one locus may represent one or more proteins) that can be aligned (iv) with the sequence of query loci in a gene order alignment (gray shading; optimal assignments are shown by darker shading). The score of this alignment is then used to rank the region relative to other syntenic target regions.
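
The chain-level steps B5, (iii) and B6 can be illustrated with a minimal Python sketch. It is not taken from the SynBlast source code: the `Chain` record, its field names, and the function `group_into_loci` are hypothetical, and merging chains into loci by transitive overlap of their target intervals is an assumption about how "overlapping target intervals" is resolved. Only the default bit-score threshold of 50 comes from the caption.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Chain:
    """Hypothetical record for one HSP chain after HSP-level filtering."""
    query_protein: str  # identifier of the query protein the chain models
    target_start: int   # start of the target interval (inclusive)
    target_end: int     # end of the target interval (inclusive)
    bitscore: float     # bit-score of the chain


def group_into_loci(chains: List[Chain], min_score: float = 50.0) -> List[List[Chain]]:
    """Apply B5, group chains into target loci (iii), then apply B6."""
    # B5: discard chains scoring below the bit-score threshold.
    kept = sorted(
        (c for c in chains if c.bitscore >= min_score),
        key=lambda c: (c.target_start, c.target_end),
    )

    # (iii): merge chains whose target intervals overlap, ignoring orientation.
    # Assumption: overlap is applied transitively (single linkage).
    loci: List[List[Chain]] = []
    locus_end = 0
    for c in kept:
        if loci and c.target_start <= locus_end:
            loci[-1].append(c)
            locus_end = max(locus_end, c.target_end)
        else:
            loci.append([c])
            locus_end = c.target_end

    # B6: within each locus keep only the best chain per query protein.
    result: List[List[Chain]] = []
    for locus in loci:
        best: Dict[str, Chain] = {}
        for c in locus:
            current = best.get(c.query_protein)
            if current is None or c.bitscore > current.bitscore:
                best[c.query_protein] = c
        result.append(sorted(best.values(), key=lambda c: c.target_start))
    return result
```

Each inner list returned by `group_into_loci` stands for one target locus, ordered along the target region, and could serve as the input to a gene order alignment against the query loci (step iv), which this sketch does not attempt.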