Concept/result | Description | Main location |
---|---|---|
Ancestry index | An ancestry index is assigned to each site. Sharing of an ancestry index among sites indicates the sites’ mutual homology. As a fringe benefit, the indices enable the mutation rates to vary across regions (or sites) beyond the mere dependence on the residue state of the sequence. | Section R2 (1st and 2nd paragraphs), Fig. 2 |
Operator representation of mutations | This enables the intuitively clear and yet mathematically precise description of mutations, especially insertions/deletions, on sequence states. This is a core tool in our ab initio theoretical formulation of the genuine stochastic evolutionary model. | Section R2 (3rd paragraph), Fig. 3 |
Rate operator | An operator version of the rate matrix, which specifies the rates of the instantaneous transitions between the states in our evolutionary model. In other words, the rate operator describes the instantaneous stochastic effects of single mutations on a given sequence state. | Section R3, Eqs. (R3.1-R3.9) (full mutational model), Eqs. (R3.2, R3.6, R3.11-R3.15) (indel model) |
Finite-time transition operator | An operator version of the finite-time transition matrix, each element of which gives the probability of transition from a state to another after a finite time-lapse. This results from the cumulative effects of the rate operator during a finite time-interval. | Section R3, Eq. (R3.17), Eq. (R3.18) |
Defining equations (differential) | 1st-order time differential equations (forward and backward) that define our indel evolutionary model. They are operator versions of the standard defining equations of a continuous-time Markov model. | Section R3, Eqs. (R3.19,R3.21) (forward), Eqs. (R3.20,R3.21) (backward) |
Defining equations (integral) | Two integral equations (forward and backward) that are equivalent to the aforementioned differential equations defining our indel evolutionary model. They play an essential role when deriving the perturbation expansion of the finite-time transition operator. | Section R4, Eq. (R4.4) (forward), Eq. (R4.5) (backward) |
Perturbation expansion (transition operator) | The perturbation expansion of the finite-time transition operator. It was derived in an intuitively clear yet mathematically precise manner, by using the aforementioned defining integral equations. | Section R4, Eqs. (R4.6, R4.7) |
Perturbation expansion (ab initio PWA probability) | The perturbation expansion of the ab initio probability of a given PWA, conditioned on the ancestral sequence state, under a given model setting. | Section R4, Eq. (R4.8) or Eq. (R4.9) |
Binary equivalence relation | An equivalence relation between products of two indel operators each. These relations play key roles when defining LHS equivalence classes. | Section R5, Eqs. (R5.2a-R5.2d) |
Local-history-set (LHS) equivalence class | An equivalence class consisting of global indel histories that share all local history components. The classes play an essential role when proving the factorability of a given PWA probability. | Section R5, below Eq. (R5.4) (e.g., Fig. 5) |
Factorability (ab initio PWA probability) | We proved that, under conditions (i) and (ii) (below Eq. (R6.4)), the ab initio probability of a given PWA is factorable into the product of an overall factor and contributions from local PWAs. | Section R6, Eqs. (R6.7, R6.8) (see also Eqs. (R6.2,R6.3,R6.4)) |
Perturbation expansion (ab initio MSA probability) | The “perturbation expansion” of the ab initio probability of a given MSA, under a given model setting including a given phylogenetic tree. | Section R7, Eqs. (R7.2,R7.3,R7.4) |
Factorability (ab initio MSA probability) | We proved that, under conditions (i) and (ii) (below Eq. (R6.4)) and (iii) (Eq. (R7.8)), the ab initio probability of a given MSA is factorable into the product of an overall factor and contributions from local MSAs. | Section R7, Eq. (R7.9) |
Totally space-homogeneous model | Such a model gives factorable PWA probabilities, because the exit rate is an affine function of the sequence length (regardless of whether indel rates are time-dependent or not). The indel model of Dawg [26] and the “long indel” model [21] belong to this class. | Subsection R8-1, Eqs. (R8-1.1,R8-1.2), Eqs. (R8-1.3, R8-1.4) |
Equivalence (with caveat) of the “chop-zone” method and our ab initio method | We showed that the “chop-zone” method in [21], adapted to calculate the probability of a given LHS equivalence class, is equivalent to our ab initio method, at least if the indel model is spatiotemporally homogeneous. | Subsection R8-1, Supplementary appendix SA-3 |
Model with simple insertion rate variation | If the deletion rates are space-homogeneous and the insertion rates depend only on the insertions’ flanking sites, the PWA probabilities are still factorable. | Subsection R8-1, Eq. (R8-1.5) |
Space-homogeneous model flanked by essential sites | This kind of model is one of the simplest examples of an indel model whose ab initio PWA probabilities are non-factorable. | Subsection R8-2, Eqs. (R8-2.1, R8-2.3) |
Degree of non-factorability | The “difference of exit-rate differences” (Eq. (R8-2.4)) could measure the “degree of non-factorability.” | Subsection R8-2, Eq. (R8-2.4) |
Space-heterogeneous model with factorable PWA probability | We found that a class of indel models with rate-heterogeneity across regions (Eqs. (R8-3.1,R8-3.2)) has partially factorable PWA probabilities. | Subsection R8-3, Eqs. (R8-3.1, R8-3.2), Eqs. (R8-3.3,R8-3.4,R8-3.5), Figure S3 |
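
The row on totally space-homogeneous models states that the exit rate is an affine function of the sequence length. The following minimal sketch illustrates that statement on a toy model. The counting rules are assumptions made for illustration only and are not taken from the equations cited in the table: an insertion may land in any of the L + 1 gaps of a length-L sequence, and a deletion of length k may overhang either sequence end, giving L + k - 1 placements that remove at least one site. The names `exit_rate`, `second_difference`, `INS_RATE`, and `DEL_RATES` are hypothetical.

```python
from fractions import Fraction


def exit_rate(seq_len, ins_rate, del_rate_by_len):
    """Total rate of leaving a sequence state of length seq_len (toy model)."""
    # Assumption: insertions occur at a uniform rate in each of the
    # seq_len + 1 gaps between and around the sites.
    insertion_part = ins_rate * (seq_len + 1)
    # Assumption: a deletion of length k may overhang either end, so it has
    # seq_len + k - 1 placements that remove at least one site.
    deletion_part = sum(
        rate * (seq_len + k - 1) for k, rate in del_rate_by_len.items()
    )
    return insertion_part + deletion_part


def second_difference(f, n):
    """Discrete second difference of f at n; it vanishes for affine f."""
    return f(n + 2) - 2 * f(n + 1) + f(n)


# Illustrative rates only (exact fractions avoid floating-point noise).
INS_RATE = Fraction(1, 10)
DEL_RATES = {1: Fraction(1, 20), 2: Fraction(1, 40)}

if __name__ == "__main__":
    rate_of_length = lambda n: exit_rate(n, INS_RATE, DEL_RATES)
    for length in (0, 1, 10, 100):
        print(length, rate_of_length(length), second_difference(rate_of_length, length))
```

Under these assumptions the exit rate has the form a·L + b, so its second difference in L is zero for every length. This is only a numerical illustration of the affine property, not a substitute for the factorability conditions given in the paper.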