Figure 1 | BMC Bioinformatics

Figure 1

From: Clustering metagenomic sequences with interpolated Markov models

Markov models. In a standard w-th-order Markov chain model, the next base b in the DNA sequence is assigned a probability conditioned on the previous w bases (underlined above for w = 6). The order w should be chosen so that the data contain a sufficient number of instances of all 4^w substrings of length w. An interpolated Markov model (IMM) uses all of the Markov models from order 0 to w and computes the probability of the next base by interpolating among them. Our version of the IMM takes this a step further: rather than using the w immediately preceding positions, we use the most "informative" positions (shown above with arrows) of the previous w according to a recursive mutual information calculation.
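
The difference between a fixed-order chain and an interpolated model can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation described in the article: the function names, the add-one pseudocount, and the count-based weight `min(1, n / threshold)` are all hypothetical choices, and the mutual-information selection of "informative" positions is not implemented (the sketch always uses the immediately preceding bases).

```python
from collections import defaultdict

BASES = "ACGT"


def train_counts(seq, w):
    """Count next-base occurrences for every context of length 0..w.

    counts[k][context][base] is how often `base` followed `context`,
    where `context` is the k bases immediately before it.
    """
    counts = [defaultdict(lambda: dict.fromkeys(BASES, 0)) for _ in range(w + 1)]
    for i, base in enumerate(seq):
        for k in range(min(i, w) + 1):
            counts[k][seq[i - k:i]][base] += 1
    return counts


def markov_prob(counts, context, base, k, pseudo=1.0):
    """Fixed order-k estimate of P(base | last k bases of context).

    The pseudocount is an assumption made here so that unseen
    contexts still yield a valid distribution.
    """
    ctx = context[len(context) - k:]
    row = counts[k].get(ctx)
    if row is None:
        return 1.0 / len(BASES)
    total = sum(row.values())
    return (row[base] + pseudo) / (total + pseudo * len(BASES))


def interpolated_prob(counts, context, base, threshold=10):
    """Blend the order-0..w estimates, trusting an order more when its
    context was observed often. The weighting rule is a hypothetical
    stand-in, not the scheme used in the article.
    """
    w = len(counts) - 1
    p = markov_prob(counts, context, base, 0)
    for k in range(1, w + 1):
        ctx = context[len(context) - k:]
        row = counts[k].get(ctx)
        n = sum(row.values()) if row else 0
        lam = min(1.0, n / threshold)
        p = lam * markov_prob(counts, context, base, k) + (1.0 - lam) * p
    return p


if __name__ == "__main__":
    counts = train_counts("ACGT" * 50, w=3)
    for b in BASES:
        print(b, round(interpolated_prob(counts, "ACG", b), 3))
    # Number of distinct length-w contexts a fixed order-6 model must cover.
    print(4 ** 6)
```

Because each order's estimate is a proper distribution and the blend is a convex combination, the interpolated probabilities over the four bases still sum to 1. The final line shows why w cannot be chosen freely in a fixed-order model: for w = 6 there are already 4096 contexts that each need enough observations.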
