Fig. 6 | BMC Bioinformatics

Fig. 6

From: New tools to analyze overlapping coding regions

a Sequence logo from RNAsampleCDS for all 8,819,712 sequences that code for peptides p [resp. q], each of whose amino acids has BLOSUM62 similarity ≥ +1 with the corresponding amino acid of the Pol [resp. Gag] 17-mer peptide FFREDLAFPQGKAREFS [resp. FLGKIWPSHKGRPGNFL] in AF033819.3/1631-1682. The PSSM is computed exactly by RNAsampleCDS with flag -pssm, and the logo plot was produced using WebLogo [26]. The average pairwise Hamming distance is 10.92±4.32 (length-normalized value of 0.21±0.083), when computed from random samples of 1000, 5000, and 10,000 sequences.

b Sequence logo for all 1196 sequences determined by RNAiFold 2.0 to fold into the frameshift stimulating signal (FSS) given by the MFE structure from AF033819.3/1629-1682 and to code for peptides P, Q, each of whose BLOSUM62 similarity with the Gag, Pol peptides in the overlap region is greater than or equal to +1. The average pairwise Hamming distance is 5.80±1.84 (length-normalized value of 0.11±0.035).

c The position-dependent entropy is defined by \(H_{i} = -p_{A} \ln p_{A} - p_{C} \ln p_{C} - p_{G} \ln p_{G} - p_{U} \ln p_{U}\) for each nucleotide position \(i = 1, \ldots, 52\), where \(p_{x}\) is the frequency of nucleotide x at position i. Subfigure (c) shows the position-dependent difference \({H^{a}_{i}} - {H^{b}_{i}}\), that is, the entropy of (a) minus the entropy of (b).

d Position-dependent total variation distance \(\delta (\pi _{1,i},\pi _{2,i}) = 1/2 \cdot \sum _{x \in \{A,C,G,U\}} |\pi _{1,i}(x)-\pi _{2,i}(x)|\) in the 52 nt region of the Gag-Pol overlap in the HIV-1 genome (GenBank AF033819.3/1631-1682) that contains the frameshift stimulating signal (FSS). Here \(\pi _{1,i}\) [resp. \(\pi _{2,i}\)] is the mononucleotide frequency at position i of the PSSM in the left [resp. right] panel. A total variation distance of zero at a position suggests that the coding constraint may already entail the FSS secondary structure constraint there.
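The quantities named in this caption (position-dependent entropy, total variation distance, and average pairwise Hamming distance) can be illustrated with a minimal Python sketch. This is not the RNAsampleCDS or RNAiFold implementation: the function names and the toy sequence sets below are hypothetical, and the per-position frequencies are estimated by simple counting over an explicit list of equal-length RNA sequences.

```python
import math
from itertools import combinations

NUCLEOTIDES = "ACGU"


def column_frequencies(seqs):
    """Per-position nucleotide frequencies for equal-length RNA sequences."""
    n = len(seqs)
    length = len(seqs[0])
    pssm = []
    for i in range(length):
        column = [s[i] for s in seqs]
        pssm.append({x: column.count(x) / n for x in NUCLEOTIDES})
    return pssm


def entropy(col):
    """H_i = -sum_x p_x ln p_x in nats, with 0 * ln 0 taken as 0."""
    return -sum(p * math.log(p) for p in col.values() if p > 0)


def total_variation(col1, col2):
    """delta(pi_1, pi_2) = 1/2 * sum_x |pi_1(x) - pi_2(x)|."""
    return 0.5 * sum(abs(col1[x] - col2[x]) for x in NUCLEOTIDES)


def mean_pairwise_hamming(seqs):
    """Average Hamming distance over all unordered pairs of sequences."""
    dists = [sum(a != b for a, b in zip(s, t))
             for s, t in combinations(seqs, 2)]
    return sum(dists) / len(dists)


# Hypothetical toy data standing in for the two sequence sets of panels a and b.
set_a = ["ACGU", "ACGA", "AUGC", "AGGG"]
set_b = ["ACGU", "ACGU"]

pssm_a = column_frequencies(set_a)
pssm_b = column_frequencies(set_b)

# Panel c analogue: entropy of (a) minus entropy of (b) at each position.
entropy_diff = [entropy(a) - entropy(b) for a, b in zip(pssm_a, pssm_b)]

# Panel d analogue: total variation distance at each position.
tv_dist = [total_variation(a, b) for a, b in zip(pssm_a, pssm_b)]

# Length-normalized average pairwise Hamming distance for set_a.
hamming_a = mean_pairwise_hamming(set_a)
hamming_a_normalized = hamming_a / len(set_a[0])
```

For sequence sets as large as the one in panel a, enumerating all pairs is impractical, which is presumably why the caption reports the Hamming distance from random samples; in this sketch one would pass a random subset (for example from `random.sample`) to `mean_pairwise_hamming` instead of the full list.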
