Figure 1 | BMC Bioinformatics

Figure 1

From: A configuration space of homologous proteins conserving mutual information and allowing a phylogeny inference based on pair-wise Z-score probabilities

Geometric and probabilistic representation of the configuration space of homologous proteins (CSHP). For any sequence a taken as a reference (a_ref), one can build a configuration space (CSHP, a_ref) in which all sequences homologous to a can be placed. When two sequences a and b are aligned with a score s(a,b), b is positioned in (CSHP, a_ref) and a in (CSHP, b_ref). The sequence alignment length determines the number of configuration dimensions; pair-wise amino acid scores determine the unique solution for the positioning. The q-dissimilarity (q = e^(-s)) defines a proximity between sequences, allowing a geometric representation (CSHP, q). Remarkable properties are (i) the conservation of mutual information, [I(a;b) = I(b;a) ⇒ q(a,b) = q(b,a)], between (CSHP, a_ref) and (CSHP, b_ref); (ii) a probabilistic representation of homologies based on q-dissimilarities by Venn diagrams (A and B); and (iii) the assignment of a topology relying on protein evolution assumptions. Evolutionary paths for the a and b lineages, which share an unknown ancestor u, have a probabilistic expression, bounded above (see text), that supports a phylogenetic topology (TULIP trees).
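
The q-dissimilarity in the caption can be illustrated with a minimal Python sketch, taking q = exp(-s) for a pairwise alignment score s. The function names and the score values below are hypothetical and chosen only for illustration; they are not taken from the article, and the sketch assumes the alignment score itself is symmetric, s(a,b) = s(b,a).

```python
import math


def q_dissimilarity(score: float) -> float:
    """Return q = exp(-s) for a pairwise alignment score s."""
    return math.exp(-score)


def q_matrix(scores: dict) -> dict:
    """Build a symmetric q-dissimilarity lookup.

    `scores` maps a pair of sequence names (a, b) to a score s(a, b).
    Assumption: s(a, b) = s(b, a), so q(a, b) = q(b, a).
    """
    q = {}
    for (a, b), s in scores.items():
        value = q_dissimilarity(s)
        q[(a, b)] = value
        q[(b, a)] = value
    return q


# Hypothetical scores, for illustration only.
scores = {("a", "b"): 2.0, ("a", "c"): 0.5}
q = q_matrix(scores)
```

Under this definition a higher alignment score gives a smaller q, so in the example b lies closer to a than c does, and the lookup returns the same value whichever sequence is taken as the reference.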
