Figure 1 | BMC Bioinformatics

Figure 1

From: Addressing the unmet need for visualizing conditional random fields in biological data

Typical biological “sequence” data containing both positional and dependency information. (A) shows sequences from Archaeal tRNA genes, followed by several canonical models and representations of this family of sequences. (B) shows a consensus, which simply represents the family in terms of the most popular symbol found in each column. (C) shows a Position Specific Scoring Matrix (PSSM), in this case truncated to single-digit precision, which encodes the marginal distribution of each symbol in each column. (D) shows a Sequence Logo, which convolves the marginal weights from a PSSM with an information-theoretic measure of the information available in each column, under an assumption of column-column independence. (E) shows a sensory representation of the PSSM, which provides some benefits for visually evaluating whether a candidate sequence fits the residue distribution of the training data. None of these representations provides any information regarding dependencies, either between columns or between specific residues in specific columns. However, (E) provides a graphical starting point for an improved representation that can convey this information.
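
The relationship between the consensus, PSSM, and Sequence Logo representations described in the caption can be illustrated with a minimal sketch. This is not the authors' code: the four-sequence alignment is a made-up toy example rather than the tRNA data in the figure, the function names are hypothetical, the PSSM is taken here as plain column frequencies (log-odds scores are a common alternative), and the logo heights use the usual "maximum entropy minus observed entropy" measure with no small-sample correction.

```python
import math
from collections import Counter

ALPHABET = "ACGU"  # assumed RNA alphabet for this toy example

# Hypothetical toy alignment; NOT the tRNA sequences from the figure.
TOY_ALIGNMENT = ["GCAU", "GCGU", "GAAU", "GCAC"]


def column_frequencies(seqs):
    """PSSM as per-column marginal frequencies of each symbol."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        freqs.append({sym: counts.get(sym, 0) / len(seqs) for sym in ALPHABET})
    return freqs


def consensus(freqs):
    """Most frequent symbol per column (ties go to the first in ALPHABET)."""
    return "".join(max(ALPHABET, key=lambda sym: col[sym]) for col in freqs)


def information_content(col):
    """Bits of information in one column: log2(|alphabet|) minus entropy."""
    entropy = -sum(p * math.log2(p) for p in col.values() if p > 0)
    return math.log2(len(ALPHABET)) - entropy


def logo_heights(freqs):
    """Letter heights for a logo: marginal frequency times column information."""
    return [
        {sym: p * information_content(col) for sym, p in col.items()}
        for col in freqs
    ]


pssm = column_frequencies(TOY_ALIGNMENT)
print(consensus(pssm))
for col, heights in zip(pssm, logo_heights(pssm)):
    print({s: round(p, 1) for s, p in col.items()},
          {s: round(h, 2) for s, h in heights.items()})
```

Every quantity in this sketch is computed one column at a time, so permuting the symbols within a single column across sequences leaves all three outputs unchanged. That is the column-column independence assumption the caption refers to, and it is why none of these representations can show dependencies between columns.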
