| Pos. | Sub./Src. | A | | B | | C | | D | | 1 | | 2 | | 3 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Nucl. | G | T | G | T | G | T | G | T | G | T | G | T | G | T |
| 1 | freq | 89 | 0 | 360 | 0 | 393 | 0 | 3 | 0 | 846 | 0 | | | | |
| | p | 0.9989 | 0.0011 | 0.9997 | 0.0003 | 0.9997 | 0.0003 | 0.969 | 0.031 | 0.9999 | 0.0001 | | | | |
| 2 | freq | 65 | 24 | 355 | 5 | 382 | 11 | 3 | 0 | 65 | 24 | 740 | 19 | | |
| | p | 0.73 | 0.27 | 0.986 | 0.014 | 0.972 | 0.028 | 0.969 | 0.031 | 0.73 | 0.27 | 0.975 | 0.025 | | |
| 3 | freq | 30 | 59 | 325 | 35 | 364 | 29 | 0 | 3 | 30 | 59 | 689 | 64 | 0 | 3 |
| | p | 0.34 | 0.66 | 0.903 | 0.097 | 0.926 | 0.074 | 0.0031 | 0.969 | 0.34 | 0.66 | 0.915 | 0.085 | 0.0031 | 0.969 |

- Simplified example of position- and subtype-wise nucleotide frequencies of HIV. For three sites, the subtype-wise nucleotide frequencies for subtypes A, B, C, and D are given on the left side of the table. Below them, the emission probabilities estimated from the frequencies of the respective subtype alone (using […]) are shown. The different typefaces (regular, bold, italic) indicate which subtypes should be jointly modelled (i.e. belong to the same source). On the right-hand side of the table, the nucleotide frequencies of the sources (i.e. the aggregated frequencies of the subtypes belonging to each source) and the emission probabilities estimated from these frequencies (using the same […]) are given. For the sake of simplicity, only the nucleotides G and T are assumed to exist. Apart from this simplification and the restriction to 4 subtypes, the example is taken from actual HIV-1 sequences.
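
The estimator referred to in the caption is not reproduced here. As a minimal sketch, the following Python snippet assumes an additive pseudocount of 0.1 per nucleotide. This value is an assumption: it is inferred from the tabulated numbers (for example, 89 G and 0 T give 0.9989 and 0.0011) and is not stated in the caption. The function names `emission_probs` and `aggregate`, and the grouping of subtypes into sources at position 3 (A alone, B with C, D alone, as implied by the source frequencies in the table), are likewise illustrative.

```python
# Assumed estimator: additive smoothing with a pseudocount of 0.1 per nucleotide.
# The value 0.1 is inferred from the table, not taken from the text.
def emission_probs(counts, pseudocount=0.1):
    """Return (n_x + pseudocount) / (N + K * pseudocount) for each nucleotide x."""
    total = sum(counts.values()) + pseudocount * len(counts)
    return {nt: (n + pseudocount) / total for nt, n in counts.items()}


def aggregate(subtype_counts, members):
    """Sum the nucleotide counts of all subtypes that belong to one source."""
    summed = {}
    for subtype in members:
        for nt, n in subtype_counts[subtype].items():
            summed[nt] = summed.get(nt, 0) + n
    return summed


# Subtype-wise frequencies at position 3, copied from the table.
position3 = {
    "A": {"G": 30, "T": 59},
    "B": {"G": 325, "T": 35},
    "C": {"G": 364, "T": 29},
    "D": {"G": 0, "T": 3},
}

# Hypothetical source assignment at position 3, implied by the source columns.
sources3 = {1: ["A"], 2: ["B", "C"], 3: ["D"]}

source2_counts = aggregate(position3, sources3[2])  # G: 689, T: 64
source2_probs = emission_probs(source2_counts)
print(source2_counts, {nt: round(p, 3) for nt, p in source2_probs.items()})
```

Pooling B and C before estimating gives the source-level probabilities 0.915 and 0.085 shown in the table, rather than the separate subtype-level estimates.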