EpiFire comprises two bodies of code written in object-oriented C++: the application programming interface (API) and the graphical user interface (GUI). The EpiFire GUI was developed using the API and Qt [30], and allows non-programmers to generate networks, perform epidemic simulations, and export figures and data. We describe the EpiFire GUI in more detail in the Results section below. The entire EpiFire code base is open source, licensed under GNU GPLv3.

The EpiFire API consists of 20 classes and 2,500 lines of non-whitespace code. The EpiFire GUI consists of 12 classes and 3,500 lines of non-whitespace code.

### EpiFire API

Functionally, the EpiFire API consists of tools for network generation, network manipulation, network characterization, and epidemic simulation. Programmatically, the API is divided into network, node, edge, and simulation classes. Each class defines a type of variable and its associated attributes and functions. For example, the network class allows users to define a network variable, which can contain one or more node variables that can be connected by one or more edge variables. In the small program below, an undirected network called my_network is created, and then populated with 100 nodes. The nodes are randomly connected with edges such that on average, each node will be connected to five others. Finally, the structure of the network is written out as an edgelist in the comma-separated-value format.

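    #include <Network.h>

    int main() {
        // Create an empty, undirected network
        Network my_network("example network", Network::Undirected);

        // Add 100 unconnected nodes
        my_network.populate(100);

        // Connect nodes at random so that the mean degree is 5
        my_network.rand_connect_poisson(5);

        // Write the network structure as a comma-separated edgelist
        my_network.write_edgelist("output.csv");

        return 0;
    }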

The network constructor takes two arguments: an arbitrary text string naming the network, and either Network::Undirected or Network::Directed, which specifies whether all edges are undirected, or some or all may be directional.

Each time the program is run, a different randomly connected graph will be produced. The following is an example of the beginning of the output file:

    0,97
    0,17
    0,21
    1
    2,49
    2,51
    2,36
    2,73
    2,66
    3,45

In this case, Node 0 was connected to Nodes 97, 17, and 21. Node 1 was not connected to any others.

More sophisticated examples, including networks being used in epidemic simulations, can be found in Additional file 1 in the examples directory provided with the source code.

The network modeling portions of the code (the Network, Node, and Edge classes) can be used with or without the epidemiological code, and may therefore be useful for non-epidemiology applications. The simulation classes provided include three types of finite, stochastic epidemic simulations: percolation and chain-binomial (both network-based), and mass-action. Users may use the provided simulation classes or may create derived classes based on them. For example, the base class for percolation simulations, called Percolation_Sim, assumes a disease with susceptible-infectious-recovered states. A simple derived simulation class can be created that inherits almost all the functionality of Percolation_Sim, but that uses an alternate progression of states. An example of a derived simulation using the susceptible-exposed-infectious-recovered state progression (SEIR_Percolation_Sim.h) can be found in the research directory provided with the source code.

Networks may be constructed explicitly by reading in an edgelist file, or by adding individual nodes and specifying their connections. Networks can also be constructed implicitly by using one of the network generators provided. Generators for ring and square lattice networks are provided, as well as three random network generators: the Erdős-Rényi model [31], which results in approximately Poisson degree distributions; the configuration model [32], which generates random networks with a user-specified degree distribution; and the Watts-Strogatz “small-world” network generation model [33].
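To illustrate the idea behind the Erdős-Rényi generator, the following is a minimal stand-alone sketch that does not use the EpiFire API. It assumes the G(n, p) form of the model, in which each pair of nodes is connected independently with probability p = (mean degree)/(n − 1); EpiFire's own generator may be implemented differently. The function name `erdos_renyi_edges` is hypothetical, and `std::mt19937` is the C++ standard library's Mersenne Twister rather than the implementation used by EpiFire.

```cpp
#include <cassert>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the EpiFire API.
// Returns the edge list of a G(n, p) random graph whose expected mean
// degree is `mean_degree`, i.e. p = mean_degree / (n - 1).
std::vector<std::pair<int, int>> erdos_renyi_edges(int n, double mean_degree,
                                                   std::mt19937& rng) {
    assert(n >= 2 && mean_degree >= 0.0 && mean_degree <= n - 1);
    const double p = mean_degree / (n - 1);
    std::bernoulli_distribution connect(p);
    std::vector<std::pair<int, int>> edges;
    for (int i = 0; i < n; ++i) {
        // Visit each unordered pair once, so no self-loops or parallel
        // edges can be produced.
        for (int j = i + 1; j < n; ++j) {
            if (connect(rng)) edges.emplace_back(i, j);
        }
    }
    return edges;
}
```

Calling `erdos_renyi_edges(100, 5.0, rng)` with a seeded `std::mt19937 rng` yields a network comparable in size and mean degree to the 100-node example above. For large n, node degrees in this model are approximately Poisson distributed.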

Networks that are generated via the configuration model can contain edges that are usually undesirable in epidemiological models. Pairs of nodes may be randomly connected by two or more edges, and nodes may be “connected” to themselves by edges going to and from the same node. These edges, called parallel edges and self-loops respectively, may be removed using the provided “lose-loops” function (Additional file 1: Appendix B). This function uses a novel algorithm to reconnect the affected edges in a randomized way that preserves the degree sequence of the network. This approach may introduce some non-randomness to the network structure, but the improvement in algorithmic complexity over competing methods is significant [34].

Random numbers are generated using the Mersenne Twister algorithm [35] as implemented by Wagner, available at http://www-personal.umich.edu/~wagnerr/MersenneTwister.html.

### Percolation and chain-binomial pseudocode

EpiFire provides epidemic simulators using the percolation and chain-binomial models, represented as pseudocode below. Both pseudocode functions take a network as an argument and return final epidemic size. The most recent implementations, including additional functions for the simulators, can be found online [36, 37]. In the percolation pseudocode below, *T* denotes the transmissibility of the pathogen, that is, the probability that transmission will occur between an infectious node and a susceptible neighbor.

    Percolation(network, T):
        infected_queue ← empty list
        foreach node in network:
            set state of node to "susceptible"
        first_infected ← random node from network
        set state of first_infected to "infectious"
        append first_infected to infected_queue
        while infected_queue is not empty:
            node ← remove first element from infected_queue
            foreach neighbor of node:
                rand ← uniform random number between 0 and 1
                if neighbor is "susceptible" and rand < T:
                    set state of neighbor to "infectious"
                    append neighbor to infected_queue
            set state of node to "recovered"
        epidemic_size ← count of nodes in "recovered" state
        return epidemic_size
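For readers who prefer compilable code, the percolation pseudocode can be sketched in stand-alone C++ as follows. This is an illustration rather than the EpiFire implementation: the adjacency-list representation, the `State` enumeration, and the function name `percolation` are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <random>
#include <vector>

enum class State { Susceptible, Infectious, Recovered };

// Hypothetical stand-alone version of the percolation pseudocode.
// adj[i] lists the neighbors of node i; T is the transmissibility.
// Returns the final epidemic size (number of recovered nodes).
int percolation(const std::vector<std::vector<int>>& adj, double T,
                std::mt19937& rng) {
    assert(!adj.empty() && T >= 0.0 && T <= 1.0);
    std::vector<State> state(adj.size(), State::Susceptible);
    std::uniform_int_distribution<std::size_t> pick(0, adj.size() - 1);
    std::uniform_real_distribution<double> unif(0.0, 1.0);

    std::deque<int> infected_queue;
    const int first_infected = static_cast<int>(pick(rng));
    state[first_infected] = State::Infectious;
    infected_queue.push_back(first_infected);

    int epidemic_size = 0;
    while (!infected_queue.empty()) {
        const int node = infected_queue.front();
        infected_queue.pop_front();
        for (int neighbor : adj[node]) {
            // Unlike the pseudocode, a random number is drawn only when the
            // neighbor is susceptible; the outcome distribution is the same.
            if (state[neighbor] == State::Susceptible && unif(rng) < T) {
                state[neighbor] = State::Infectious;
                infected_queue.push_back(neighbor);
            }
        }
        state[node] = State::Recovered;
        ++epidemic_size;  // every dequeued node ends up recovered
    }
    return epidemic_size;
}
```

Counting recoveries as nodes leave the queue avoids a final pass over the network, since every infected node is dequeued exactly once.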

Appendix B2 of Additional file 1 provides a version of the percolation algorithm that produces an epidemic curve. In practice, it may be convenient to use integers as node states rather than text strings. In the chain binomial algorithm below, susceptible nodes have a value of 0, recovered nodes have a value of -1, and infectious nodes have a value equal to the number of days they have been infectious.

Appendix B3 of Additional file 1 provides a simple chain binomial function that performs one comparison per time unit per infectious node. Here, we describe a more efficient implementation. Instead of checking whether transmission occurs to each neighbor at each time step, we can determine the time until transmission along each edge. Because each transmission attempt can be considered a Bernoulli trial, we can determine when transmission will occur by sampling from a truncated geometric distribution with probability of "success" *T_cb* (chain binomial transmissibility) and support on {1, 2, ..., *gamma* + 1}, where *gamma* is the infectious period. If the deviate happens to be *gamma* + 1, then transmission never occurs. In the pseudocode below, *transQ* is a priority queue of transmission events, sorted by time, least to greatest.
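One possible way to draw such a truncated geometric deviate is sketched below. It is not taken from the EpiFire source: the function name `truncated_geometric` is hypothetical, and the sketch relies on `std::geometric_distribution`, which returns the number of failures before the first success.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper: returns the time step (1, 2, ..., gamma) at which
// transmission occurs along an edge, or gamma + 1 if it never occurs
// during the infectious period.
int truncated_geometric(double T_cb, int gamma, std::mt19937& rng) {
    assert(T_cb >= 0.0 && T_cb <= 1.0 && gamma >= 1);
    if (T_cb <= 0.0) return gamma + 1;  // transmission can never occur
    if (T_cb >= 1.0) return 1;          // transmission is certain at step 1
    // std::geometric_distribution requires 0 < p < 1 and counts the
    // failed attempts that precede the first success.
    std::geometric_distribution<int> failures(T_cb);
    const int failed_attempts = failures(rng);
    return failed_attempts < gamma ? failed_attempts + 1 : gamma + 1;
}
```

A single deviate per edge replaces up to *gamma* separate Bernoulli comparisons, which is the source of the efficiency gain described above.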

    Chain_binomial(network, T_cb, gamma):
        transQ ← empty priority queue of [time, node] pairs
        infected_list ← empty list
        foreach node in network:
            set state of node to 0
        current_time ← 0
        first_infected ← random node from network
        Infect_node(current_time, first_infected, T_cb, gamma, transQ, infected_list)
        while infected_list is not empty:
            current_time ← current_time + 1
            foreach node in infected_list:
                increment state of node
            while infected_list is not empty:
                if state of infected_list[0] ≤ gamma:
                    break
                else:
                    set state of infected_list[0] to -1
                    remove first element from infected_list
            while transQ is not empty and time of transQ[0] ≤ current_time:
                event ← remove first element from transQ
                if state of node of event is 0:
                    Infect_node(time of event, node of event, T_cb, gamma, transQ, infected_list)
        epidemic_size ← count of nodes in -1 state
        return epidemic_size

    Infect_node(current_time, node, T_cb, gamma, transQ, infected_list):
        set state of node to 1
        append node to infected_list
        foreach neighbor of node:
            if state of neighbor is 0:
                rand ← geometric_random_number(T_cb, gamma), see main text
                if rand ≤ gamma:
                    append [current_time + rand, neighbor] to transQ
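The event-driven chain-binomial scheme can be sketched in stand-alone C++ as follows. This is an illustrative reading of the pseudocode, not the EpiFire implementation, and it makes several assumptions: time advances by one unit per pass of the main loop, each event is removed from the queue when it is processed, and a queued transmission is ignored if its target is no longer susceptible. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <queue>
#include <random>
#include <utility>
#include <vector>

using Event = std::pair<int, int>;  // (time, node)
using EventQueue =
    std::priority_queue<Event, std::vector<Event>, std::greater<Event>>;

// Time step of transmission in 1..gamma, or gamma + 1 if it never happens.
int transmission_day(double T_cb, int gamma, std::mt19937& rng) {
    if (T_cb <= 0.0) return gamma + 1;
    if (T_cb >= 1.0) return 1;
    std::geometric_distribution<int> failures(T_cb);
    const int f = failures(rng);
    return f < gamma ? f + 1 : gamma + 1;
}

// States: 0 susceptible, -1 recovered, k > 0 infectious for k time steps.
int chain_binomial(const std::vector<std::vector<int>>& adj, double T_cb,
                   int gamma, std::mt19937& rng) {
    assert(!adj.empty() && gamma >= 1 && T_cb >= 0.0 && T_cb <= 1.0);
    std::vector<int> state(adj.size(), 0);
    std::deque<int> infected;  // ordered by infection time, oldest first
    EventQueue transQ;         // earliest scheduled transmission on top

    auto infect = [&](int time, int node) {
        state[node] = 1;
        infected.push_back(node);
        for (int neighbor : adj[node]) {
            if (state[neighbor] != 0) continue;
            const int day = transmission_day(T_cb, gamma, rng);
            if (day <= gamma) transQ.emplace(time + day, neighbor);
        }
    };

    std::uniform_int_distribution<std::size_t> pick(0, adj.size() - 1);
    int current_time = 0;
    infect(current_time, static_cast<int>(pick(rng)));

    while (!infected.empty()) {
        ++current_time;
        for (int node : infected) ++state[node];
        // Nodes that have exceeded the infectious period recover.
        while (!infected.empty() && state[infected.front()] > gamma) {
            state[infected.front()] = -1;
            infected.pop_front();
        }
        // Carry out all transmissions scheduled up to the current time.
        while (!transQ.empty() && transQ.top().first <= current_time) {
            const Event event = transQ.top();
            transQ.pop();
            if (state[event.second] == 0) infect(event.first, event.second);
        }
    }
    int epidemic_size = 0;
    for (int s : state) epidemic_size += (s == -1);
    return epidemic_size;
}
```

Because every transmission is scheduled no later than *gamma* steps after the infecting node became infectious, no events remain pending once the list of infectious nodes is empty.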

### Analytic calculations of epidemic and network quantities

Given a degree distribution for a network and a transmissibility for a pathogen, the EpiFire API includes functions that calculate the expected epidemic threshold for the network (the critical transmissibility above which epidemics are possible) and the basic reproductive rate of the pathogen in that network ($R_0$). The EpiFire GUI further calculates expected epidemic size under network and mass-action assumptions. All of the network calculations assume the configuration network model, such that the network is a random draw from all randomly connected networks with the specified degree distribution. Calculations, unless otherwise noted, are adapted from Meyers (2007) [6], which provides additional mathematical details.

The epidemic threshold for a network ($T_c$) is a critical transmission probability (along edges) below which outbreaks are expected to fizzle out and above which large epidemics are possible, but not guaranteed. Technically, in an infinite network, outbreaks below the epidemic threshold will reach only a finite number of nodes, while outbreaks above the threshold can either be finite or infect a fraction of the network including an infinite number of nodes. This value is a function of the network structure and corresponds exactly to an $R_0$ value of 1; it is given by

$$T_c = \frac{\sum_k k\,p_k}{\sum_k k(k-1)\,p_k}$$

where $k$ is the degree of a node, and $p_k$ is the fraction of nodes having degree $k$.
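As a rough illustration, the threshold can be computed directly from a degree distribution. The sketch below assumes the standard configuration-model expression, in which the threshold is the mean degree divided by the mean of k(k − 1), and obtains the corresponding reproductive rate as the ratio of the actual to the critical transmissibility. The function names are hypothetical and are not EpiFire API calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helpers; p[k] is the fraction of nodes with degree k.
// Critical transmissibility: sum_k k p_k / sum_k k (k - 1) p_k.
double critical_transmissibility(const std::vector<double>& p) {
    double mean_k = 0.0;       // sum_k k p_k
    double mean_excess = 0.0;  // sum_k k (k - 1) p_k
    for (std::size_t k = 0; k < p.size(); ++k) {
        mean_k += k * p[k];
        mean_excess += k * (k - 1.0) * p[k];
    }
    assert(mean_excess > 0.0);  // requires some nodes of degree >= 2
    return mean_k / mean_excess;
}

// Expected reproductive rate: actual over critical transmissibility.
double network_R0(const std::vector<double>& p, double T) {
    return T / critical_transmissibility(p);
}
```

For a network in which every node has degree 3, these functions give a threshold of 0.5, so a pathogen with transmissibility 0.75 has an expected reproductive rate of 1.5.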

The expected basic reproductive rate is the expected number of neighbors that will be infected by each infectious node early in an epidemic, and is equal to the ratio of the actual transmissibility $T$ to the critical transmissibility $T_c$, given by

$$R_0 = \frac{T}{T_c} = T\,\frac{\sum_k k(k-1)\,p_k}{\sum_k k\,p_k}$$

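The expected epidemic size is then given by

$$S = 1 - \sum_k p_k \left(1 + (u-1)\,T\right)^k$$

where $u$ is the solution to the self-consistency equation

$$u = \frac{\sum_k k\,p_k \left(1 + (u-1)\,T\right)^{k-1}}{\sum_k k\,p_k}$$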

We also provide a function that calculates the expected final epidemic size in a mass-action model, given a value of $R_0$ [1]:

$$R_\infty = 1 - S_0\,e^{-R_0 R_\infty}$$

where $R_\infty$ is the final epidemic size as a fraction of the population, and $S_0$ is the fraction of individuals who are susceptible at the start of the epidemic. The expected epidemic sizes under both the mass action and network models are solved numerically using the bisection method [38].
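The bisection approach can be sketched as follows. This example is not the EpiFire code and rests on stated assumptions: for the network model it uses the standard configuration-model relations, in which u solves u = Σ k p_k (1 + (u − 1)T)^(k−1) / Σ k p_k and the size is 1 − Σ p_k (1 + (u − 1)T)^k; for the mass-action model it uses the standard final-size relation r = 1 − S_0 exp(−R_0 r). All function names are hypothetical, and p is assumed to sum to 1.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Bisection on [lo, hi], assuming f(lo) >= 0 > f(hi).
double bisect(const std::function<double(double)>& f, double lo, double hi) {
    for (int i = 0; i < 100; ++i) {
        const double mid = 0.5 * (lo + hi);
        if (f(mid) > 0.0) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}

// p[k] is the fraction of nodes with degree k; T is the transmissibility.
double network_epidemic_size(const std::vector<double>& p, double T) {
    assert(T >= 0.0 && T <= 1.0);
    double mean_k = 0.0;
    for (std::size_t k = 0; k < p.size(); ++k) mean_k += k * p[k];
    assert(mean_k > 0.0);
    // f(u) is the right-hand side of the self-consistency equation minus u.
    auto f = [&](double u) {
        const double x = 1.0 + (u - 1.0) * T;
        double sum = 0.0;
        for (std::size_t k = 1; k < p.size(); ++k)
            sum += k * p[k] * std::pow(x, static_cast<double>(k) - 1.0);
        return sum / mean_k - u;
    };
    // u = 1 is always a solution; search just below it for the other one.
    const double hi = 1.0 - 1e-9;
    if (f(hi) >= 0.0) return 0.0;  // at or below the epidemic threshold
    const double u = bisect(f, 0.0, hi);
    const double x = 1.0 + (u - 1.0) * T;
    double size = 1.0;
    for (std::size_t k = 0; k < p.size(); ++k)
        size -= p[k] * std::pow(x, static_cast<double>(k));
    return size;
}

// Final size r solving r = 1 - S0 * exp(-R0 * r).
double mass_action_epidemic_size(double R0, double S0) {
    assert(R0 >= 0.0 && S0 > 0.0 && S0 <= 1.0);
    auto g = [&](double r) { return 1.0 - S0 * std::exp(-R0 * r) - r; };
    const double lo = 1e-9;  // skip the trivial root r = 0 when S0 = 1
    if (g(lo) <= 0.0) return 0.0;
    return bisect(g, lo, 1.0);
}
```

In both cases the search interval excludes the trivial solution (no epidemic), so that bisection converges to the non-trivial root when one exists.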

By calculating and comparing the network and mass-action expectations for an epidemic size of a specific network-pathogen combination (done automatically in the EpiFire GUI), one can assess the epidemiological impact of the network structure. Large differences in the values of network and mass-action expectations suggest that network structure plays an important role in disease transmission, and that traditional compartmental models may not be adequate.

Because percolation and chain binomial transmissibilities are per-infectious-period and per-time-unit probabilities, respectively, the transmissibility parameter is recalculated accordingly when users switch between simulation types.
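The text does not state the conversion formula. One natural reading, assuming independent transmission attempts in each time step of an infectious period lasting *gamma* steps, is T = 1 − (1 − T_cb)^gamma, which is sketched below with hypothetical function names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers, assuming independent attempts in each time step.
// Probability of transmission at some point during the infectious period.
double per_period_from_per_step(double T_cb, int gamma) {
    assert(T_cb >= 0.0 && T_cb <= 1.0 && gamma >= 1);
    return 1.0 - std::pow(1.0 - T_cb, gamma);
}

// Inverse conversion: per-time-step probability giving overall probability T.
double per_step_from_per_period(double T, int gamma) {
    assert(T >= 0.0 && T <= 1.0 && gamma >= 1);
    return 1.0 - std::pow(1.0 - T, 1.0 / gamma);
}
```

For example, a per-step probability of 0.5 over a two-step infectious period corresponds to an overall transmissibility of 0.75.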

One important property of networks is clustering, a measure of whether nodes exist in well-interconnected groups. EpiFire implements the transitivity clustering coefficient [39], calculated as

$$C = \frac{3 \times \text{triangles}}{\text{triples}}$$

where triangles is the number of sets of nodes A, B, and C such that all three are interconnected, and triples is the number of sets of nodes A’, B’, and C’ such that B’ is connected to A’ and C’.
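A brute-force version of this calculation can be sketched as follows. It is an illustration only, not the EpiFire implementation. It assumes a simple undirected graph stored as a symmetric boolean adjacency matrix and uses the usual definition of transitivity, three times the number of triangles divided by the number of triples; the function name `transitivity` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-alone transitivity calculation for a simple undirected
// graph given as an n x n symmetric boolean adjacency matrix.
double transitivity(const std::vector<std::vector<bool>>& adj) {
    const std::size_t n = adj.size();
    long long triangles = 0, triples = 0;
    for (std::size_t b = 0; b < n; ++b) {
        assert(adj[b].size() == n);
        long long degree = 0;
        for (std::size_t a = 0; a < n; ++a)
            if (a != b && adj[b][a]) ++degree;
        // Each pair of neighbors of B forms one triple centered on B.
        triples += degree * (degree - 1) / 2;
    }
    for (std::size_t a = 0; a < n; ++a)
        for (std::size_t b = a + 1; b < n; ++b)
            for (std::size_t c = b + 1; c < n; ++c)
                if (adj[a][b] && adj[b][c] && adj[a][c]) ++triangles;
    // Each triangle contains three triples, hence the factor of 3.
    return triples == 0 ? 0.0 : 3.0 * triangles / triples;
}
```

A triangle with one additional node attached to a single corner contains one triangle and five triples, giving a coefficient of 0.6. The cubic-time triangle count is acceptable only for small networks.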