From: A model-based information sharing protocol for profile Hidden Markov Models used for HIV-1 recombination detection

Examples of the calculation of the emission probabilities. Simplified example of position- and subtype-wise nucleotide frequencies of HIV-1 and the emission probabilities derived from them using the presented information sharing protocol. For three sites, the subtype-wise nucleotide frequencies of the four subtypes A-D are given on the left side of the table. Below them, the emission probabilities estimated from the frequencies of the respective subtype alone are shown, using pseudocounts α = (0.1, 0.1). The colors indicate which subtypes should be modeled jointly to obtain the most likely source combination. The nucleotide frequencies of the sources (i.e. the aggregated frequencies of the subtypes belonging to each source) and the emission probabilities estimated from these frequencies are given on the right side of the table (using the same α). For the sake of simplicity, we assume that only the nucleotides G and T occur. Apart from this simplification and the restriction to four subtypes, this example is taken from actual HIV-1 sequences.
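
The two estimation steps described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the frequencies are absolute counts and that the emission probabilities follow the usual additive-pseudocount estimate (n_x + α_x) / (N + Σα). The function names, the counts, and the grouping of subtypes into sources are hypothetical and are not taken from the table.

```python
from typing import Dict, Sequence, Tuple

Counts = Tuple[float, float]  # (count of G, count of T) at one site


def emission_probs(counts: Counts, alpha: Counts = (0.1, 0.1)) -> Counts:
    """Additive-pseudocount estimate (assumed estimator, see lead-in)."""
    total = sum(counts) + sum(alpha)
    g, t = ((c + a) / total for c, a in zip(counts, alpha))
    return (g, t)


def aggregate(subtype_counts: Dict[str, Counts], members: Sequence[str]) -> Counts:
    """Sum the counts of all subtypes that are modeled jointly as one source."""
    g = sum(subtype_counts[s][0] for s in members)
    t = sum(subtype_counts[s][1] for s in members)
    return (g, t)


# Hypothetical counts for a single site (NOT the values from the table).
site_counts: Dict[str, Counts] = {
    "A": (10, 0),
    "B": (9, 1),
    "C": (0, 10),
    "D": (1, 9),
}

# Hypothetical source combination: A and B share one source, C and D another.
sources = [("A", "B"), ("C", "D")]

# Emission probabilities estimated from each subtype alone.
per_subtype = {s: emission_probs(c) for s, c in site_counts.items()}

# Emission probabilities estimated from the aggregated source counts.
per_source = {src: emission_probs(aggregate(site_counts, src)) for src in sources}

if __name__ == "__main__":
    for subtype, probs in per_subtype.items():
        print(subtype, [round(p, 3) for p in probs])
    for src, probs in per_source.items():
        print("+".join(src), [round(p, 3) for p in probs])
```

In this sketch, pooling subtypes A and B gives subtype A a noticeably larger probability for T than its own counts would, which is the kind of information sharing the caption describes.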

