The ecology and evolution of influenza viruses, especially in human and avian populations, have received considerable attention over the last decade [1–3]. It is increasingly recognised that because of the genetic diversity of these RNA viruses [4, 5] and their high mutation rate [1], transmission dynamics and evolution need to be considered simultaneously. The evolutionary dynamics of influenza viruses are complicated the frequent occurrence of reassortment events [6, 7], whereby two virus strains co-infecting the same individual exchange genetic material, resulting in a third strain. Despite its demonstrated frequent occurrence [1, 6], reassortment remains largely absent from theoretical studies of influenza evolution. In this respect, it is important to develop an appropriate theoretical framework to explore its evolutionary impact.

The most commonly used approach to analyse influenza dynamics has been mathematical models where state space is expanded to take into account multiple virus strains [8, 2, 9, 3, 10]. The overlap between epidemiological and evolutionary time scales [1], the observed immune drift in human populations [2, 3], and the high mutation rate [1] make it essential to integrate a large strain space into models. Consequently, the classic modelling approaches, based on the familiar *SIR* framework [11–13], are not readily amenable for this purpose because the resulting state space increases exponentially with the number of strains and, consequently, becomes too cumbersome for meaningful analysis. One possibility, recently suggested [14, 3], is not to track co-infections, yielding a model whose number of state variables increase linearly with the number of strains. However, these models by necessity overlook the contribution of reassortment events (but see [15]).

It is possible to fill this void by developing an Individual-Based Model (IBM, see [16, 17]). The rationale of this kind of modelling is to explore individual heterogeneity, allowing us to track infection history. Extensively used in ecology [18] and more recently in epidemiology [9], the main goal of these models have been to focus on very detailed and specific scenarios, with the goal of making quantitative predictions especially about alternative infection control decisions [19, 20, 9]. IBMs have also been shown to be useful in theoretical studies [21–23].

Here, we develop a stochastic IBM to simultaneously address the ecology and evolution of avian influenza viruses. We first describe the current paradigms of influenza mathematical modelling, which drives the structure of our computational framework. After detailing our IBM, we demonstrate that under analogous conditions, our model precisely recaptures the output of classic models. This validation step will allow the future use of our IBM to rely on the background of mathematical epidemiology [11–13]. Finally, we conclude that our model improves the current mathematical models and can be exploited to study evolutionary processes previously intractable, giving the opportunity to address all the components of influenza evolution.

### 1.1 Transmission process and mean field theory

The goal our of IBM is to extend the capabilities of the mathematical models of Influenza viruses. Here, we describe the current paradigms about the Influenza transmission and how they are modelled using the classic *SIR* framework. These models underlie the IBM design presented during the next section.

In humans populations, a direct transmission is assumed between individuals as usual when airborne propagation is documented [24]. The situation is more complex in birds. Generally, direct transmission is assumed to encapsulate both fecal/oral and airborne transmission because they are occur on the same time scale and both rely on the proximity of susceptible and infectious individuals [25]. There is increasing evidence for indirect influenza transmission in avian species through an environmental reservoir [26–29]. Experimental studies have shown the long-term persistence of these viruses in aquatic environments [4, 30] and theoretical studies have underlined its importance in epidemiological and evolutionary outcomes [29, 25, 31, 32].

Epidemiological models of influenza viruses generally consider only a limited number of strains, perhaps as few as two [33, 31]. The host population is then classified according to their infectious status with regards to each strain. For a two strain model, for example, individuals are born susceptible to both strains (*N*_{
SS
} ) and become infected (*N*_{
IS
} or *N*_{
SI
} ) at rate *λ*_{
i
} . While infected with strain *i*, they may be exposed to strain *j* resulting in co-infection (*N*_{
II
} ), at a rate (1 - *σ*) *λ*_{
j
} where *σ* captures the extent of cross-protection. Individuals recover (*N*_{
RS
} or *N*_{
SR
} ) at rate *γ*_{
i
} and are assumed to be invulnerable to that strain.

Clearly, the force of infection (*λ*_{
i
} ) is the main driver for epidemiological and evolutionary dynamics. Direct transmission (through airborne droplets and/or via the short-term fecal/oral route) is usually modeled as a density-dependent process [24] and defined by the rate *β*_{
i
}*I*_{
i
}*S* (*I*_{1} = *N*_{
IS
} + *N*_{
II
} + *N*_{
IR
} and *I*_{2} = *N*_{
SI
} + *N*_{
II
} + *N*_{
RI
} ). For avian influenza viruses, the force of infection includes another component, intended to capture transmission via the environmental reservoir: , where *ρ* is the uptake rate, *L* is the lake volume, *κ* is the viral load needed to yield a 50% probability of infection, and *ε* is the competition term for viral particles within environment [34, 33, 29]. This competition term *ε* determines which strain is the most able to infect the susceptible individual. The viral load determining the infection rate, *V*_{
i
} , is influenced by the shedding rate of infectious individuals *ω*_{
i
} and the pathogen clearance rate in environment *ξ*_{
i
} , also termed environmental durability. The mathematical formulation is:

When a single disease system is assumed, we can derive an analytic expression for the R_{0}, a predominant measure in epidemiology measuring the expected number of infections when one infectious individual is introduced in a population totally susceptible:

where and are the contribution of direct and environmental transmission respectively [25]. When evolution is considered, *i.e*. through mutation, this framework can be extended easily to numerous strains if full cross-immunity is assumed (*i.e*. *σ* = 0 and co-infection is not possible). However, partial cross-immunity is well established for influenza viruses [3, 35]. In this case, the state space of the system increases exponentially because the infection history of individuals needs to be tracked, quickly leading to model intractability. A solution recently suggested [14] is to assume "polar immunity", which means that cross-immunity makes some of the hosts totally immune instead of infecting them. An infection is hence possible only if the host is completely susceptible to this strain. Then, it is possible to track only susceptible and infectious individuals to each strain *i*:

where *σ*_{
ij
} represents the cross-immunity network between strains, *m* is the mutation rate and *i* and *j* are the strain "identities" of the pathogens. Here, we assume that cross-immunity decreases exponentially [14] with the distance between the infection history and the challenging strain (equation 19) as it has been described for influenza viruses in humans [3] and horses [35]. This exponential decrease is shaped by the parameter *d* .

These ordinary differential equations may be converted to exact Markovian analogues using, for example, Gillespie's direct method [36, 37, 13]. To do so, first the rates of all events have to be specified (*i.e* transitions between classes, see table S1 in additional file 1). The time until the next events is computed at each iteration as follows:

where *R*_{
j
} is the rate of event *j* and *RAND* 1 is a uniform random deviate in (0, 1). Then, a new uniform random number in (0,1) is generated and multiplied by ∑ ^{n} *R*_{
j
} , denoted by *P* . The next event is then determined by ∑^{j-1}< *P* < ∑ ^{j} .